How to import a corpus from EXMARaLDA

We will import an example corpus from the ANNIS demo corpus page, namely the so-called “dialog.demo” corpus, a sample from the BeMaTaC corpus.

  1. Go to https://corpus-tools.org/annis/corpora.html.
  2. Download the corpus named “dialog.demo” in the EXMARaLDA format.
  3. Unzip the file to a folder of your choice.
  4. The unzipped folder represents the root corpus and contains a single .exb file, which holds one document. The video file next to it is linked to the document, but Hexatomic cannot play video files yet.
dialog.demo/
├── dialog.demo.exb
└── dialog.demo.webm
  5. Choose the Import entry in the File menu.
  6. Click on the button with the ... caption and navigate to the unzipped dialog.demo folder. Then click on Next.
  7. The importer should correctly identify this corpus as “EXMARALDA format (*.exb)”. Click on Finish to import the corpus.
  8. Unfold the corpus in “Corpus Structure”, right-click on the “dialog.demo” document, and select “Open with Grid Editor”. Choose “phon0” as data source to show the token and span annotations for the first speaker.
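
If the importer does not recognize the folder, it can help to check the unzipped corpus outside of Hexatomic first. The following Python sketch is not part of Hexatomic; it only uses the standard library. It lists the .exb files in the corpus folder and the tiers each file declares. It assumes that an .exb file is an XML document whose tiers are stored as `tier` elements with `category` and `speaker` attributes, and the function names `find_exb_documents` and `list_tiers` are made up for this illustration.

```python
from pathlib import Path
import xml.etree.ElementTree as ET


def find_exb_documents(corpus_dir):
    """Return the .exb files located directly inside the corpus folder, sorted by name."""
    return sorted(Path(corpus_dir).glob("*.exb"))


def list_tiers(exb_path):
    """Return (category, speaker) pairs for every <tier> element in an .exb file.

    Assumption: tiers are <tier> elements carrying "category" and "speaker" attributes.
    """
    root = ET.parse(str(exb_path)).getroot()
    return [(tier.get("category"), tier.get("speaker")) for tier in root.iter("tier")]


if __name__ == "__main__":
    # Hypothetical location: adjust to the folder you unzipped the corpus to.
    for exb in find_exb_documents("dialog.demo"):
        print(exb.name, list_tiers(exb))
```

For the folder shown above, `find_exb_documents` should return exactly one path, `dialog.demo.exb`. An empty result suggests that the wrong folder was selected in the import wizard.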