Import and annotate an example corpus
We will import an example corpus from the ANNIS demo corpus page, namely the so-called “pcc2” corpus, a sample from the Potsdam Commentary Corpus. It contains several annotation layers, such as constituent trees, dependency trees and annotations for information structure.
- Go to https://corpus-tools.org/annis/corpora.html.
- Download the corpus named “pcc2” in the PAULA format.
- Unzip the file to a folder of your choice.
- Choose the Import entry in the File menu.
- Click on the button with the ... caption and navigate to the unzipped `pcc2_v6_PAULA` folder. Then click on Next.
- The importer should correctly identify this corpus as “PAULA format”. Click on Finish to import the corpus.
- Unfold the corpus in the “Corpus Structure”, right-click on the “4282” document, and select “Open with Graph Editor”.
- This shows the whole document as a graph, but we are only interested in the constituent tree for now. Expand “Annotation Types” and “Node annotations” in the Filter View. Unselect “Pointing Relations”. Then type `cat` into the “Search” field and click on the `tiger::cat` filter badge. This still shows the whole document, but now we can select the segments we are interested in. Click on the first three segments while holding the Ctrl key.
- Add new annotations using the console. To add a root node connecting the trailing token “!” with the sentence constituent node, enter `n tiger:cat:ROOT` in the console. Then double-click on the `tok_7` node and again on the `const_2` node. This should complete the prompt in the console to `n tiger:cat:ROOT #tok_7 #const_2`. With the cursor active at the end of the prompt, press Enter. Note that the segmentation changes because we connected the previously separate segments.
- Save the project by clicking on the File menu and then Save Salt Project As... to persist the changes.
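
The console prompt built in the steps above packs several pieces of information into one line. As a rough illustration only, the following Python sketch splits such a prompt into its parts. It is not how the application itself parses input: the names `ConsoleCommand` and `parse_console_command` are hypothetical, and the sketch assumes that the annotation is written as namespace:name:value and that each referenced node is prefixed with `#`, as in the example prompt.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ConsoleCommand:
    """Hypothetical container for the parts of a console prompt."""
    action: str         # the leading command letter, e.g. "n"
    namespace: str      # annotation namespace, e.g. "tiger"
    name: str           # annotation name, e.g. "cat"
    value: str          # annotation value, e.g. "ROOT"
    targets: List[str]  # referenced node names without the leading "#"


def parse_console_command(prompt: str) -> ConsoleCommand:
    """Split a prompt such as 'n tiger:cat:ROOT #tok_7 #const_2'.

    Assumption: exactly one namespace:name:value annotation follows the
    action, and every remaining item is a '#'-prefixed node reference.
    """
    parts = prompt.split()
    if len(parts) < 2:
        raise ValueError("expected an action followed by an annotation")

    action, annotation, *refs = parts

    fields = annotation.split(":")
    if len(fields) != 3:
        raise ValueError("expected annotation as namespace:name:value")
    namespace, name, value = fields

    targets = []
    for ref in refs:
        if not ref.startswith("#") or len(ref) == 1:
            raise ValueError(f"expected a node reference like #tok_7, got {ref!r}")
        targets.append(ref[1:])  # drop the leading "#"

    return ConsoleCommand(action, namespace, name, value, targets)


command = parse_console_command("n tiger:cat:ROOT #tok_7 #const_2")
print(command.action, command.namespace, command.name, command.value)
print(command.targets)
```

Read this way, the prompt from the tutorial asks for a new node annotated with `cat` set to `ROOT` in the `tiger` namespace, attached to the two nodes that were double-clicked.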