How to import a corpus from EXMARaLDA

We will import the "dialog.demo" corpus, a sample from the BeMaTaC corpus that is available on the ANNIS demo corpus page.

  1. Go to https://corpus-tools.org/annis/corpora.html.
  2. Download the corpus named "dialog.demo" in the EXMARaLDA format.
  3. Unzip the file to a folder of your choice.
  4. The unzipped folder represents the root corpus and contains a single .exb file, which holds one document. The video file next to it is linked with the document, but Hexatomic cannot yet play video files.
dialog.demo/
├── dialog.demo.exb
└── dialog.demo.webm
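
Before starting the import, it can be useful to confirm that the unzipped folder really has the layout shown above. The following is a minimal sketch in Python, not part of Hexatomic; the helper names `unzip_corpus` and `list_exb_documents` and the archive file name in the usage note are hypothetical and only serve as illustration.

```python
import zipfile
from pathlib import Path


def unzip_corpus(archive, target):
    """Extract the downloaded corpus archive into the target folder."""
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)


def list_exb_documents(corpus_dir):
    """Return the names of all .exb files directly inside the corpus folder."""
    return sorted(p.name for p in Path(corpus_dir).glob("*.exb"))
```

Assuming the download was saved under a name such as `dialog.demo.zip` (the actual file name may differ), calling `unzip_corpus("dialog.demo.zip", "corpora")` followed by `list_exb_documents("corpora/dialog.demo")` should list the single document `dialog.demo.exb`.
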
  1. Choose the Import entry in the File menu.
  2. Click on the button with the "..." caption and navigate to the unzipped dialog.demo folder. Then click on Next.
     [Screenshot: Select a corpus folder in the import wizard]
  3. The importer should correctly identify this corpus as "EXMARALDA format (*.exb)". Click on Finish to import the corpus.
     [Screenshot: Format selection wizard step]
  4. Expand the corpus structure in the "Corpus Structure" view, right-click on the "dialog.demo" document, and select "Open with Grid Editor". Choose "phon0" as data source to show the token and span annotations for the first speaker.
     [Screenshot: Grid editor with the opened document]
  5. You can now continue to work with the imported data in the Grid Editor.
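
To get a rough idea of what the Grid Editor will show, the tiers of the .exb file can be inspected outside of Hexatomic. The sketch below is written in Python and assumes the usual EXMARaLDA basic transcription layout, in which `tier` elements carry `id`, `speaker`, `category` and `type` attributes and contain `event` elements. The function name `list_tiers` is hypothetical, and how these tiers map to data source names such as "phon0" in Hexatomic is not covered here.

```python
import xml.etree.ElementTree as ET


def list_tiers(exb_source):
    """Summarize the tiers of an EXMARaLDA basic transcription.

    exb_source may be a file path or an open file object.
    Assumes <tier> elements with id/speaker/category/type attributes
    that contain <event> children.
    """
    root = ET.parse(exb_source).getroot()
    tiers = []
    for tier in root.iter("tier"):
        tiers.append(
            {
                "id": tier.get("id"),
                "speaker": tier.get("speaker"),
                "category": tier.get("category"),
                "type": tier.get("type"),
                # Number of annotated intervals on this tier
                "events": len(tier.findall("event")),
            }
        )
    return tiers
```

For example, `for tier in list_tiers("dialog.demo/dialog.demo.exb"): print(tier)` prints one summary per tier, which can then be compared with the rows shown in the Grid Editor.
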