[CWB] Field Word Data (ELAN)
Hardie, Andrew
a.hardie at lancaster.ac.uk
Thu Dec 8 06:50:01 CET 2016
Hi Ruprecht,
>> How did you approach the representation of these levels in the CWB format
As p-attributes for layers whose tokenisation matches the word tier. As s-attributes for those that don't.
E.g., starting with a tiered example like:
Pirat-a barb-am hab-et
pirate-NOM beard-ACC have-3SG
"The pirate has a beard"
(in whatever underlying format...)
... I flip it from horizontal to vertical, giving the following CWB input file (columns separated by tabs, as usual):
<s trans="The pirate has a beard">
pirat-a	pirate-NOM
barb-am	beard-ACC
hab-et	have-3SG
</s>
If there are multiple layers of glossing, then I just add more p-attributes.
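To make the horizontal-to-vertical flip concrete, here is a minimal Python sketch. It is not the converter actually used; the function name tiers_to_vertical and the convention that every tier is a whitespace-separated string are assumptions made for illustration. It also assumes every tier has the same tokenisation as the word tier; a tier that does not would have to become an s-attribute instead, as described above. Unlike the hand-made example, it does not change the case of the first word.

```python
from xml.sax.saxutils import quoteattr


def tiers_to_vertical(tiers, translation):
    """Turn token-aligned tiers into CWB vertical text (hypothetical helper).

    tiers: list of strings, one per layer (words first, then gloss layers).
    translation: free translation, stored on the <s> region as 'trans'.
    """
    columns = [tier.split() for tier in tiers]
    if len({len(col) for col in columns}) != 1:
        # A layer with a different tokenisation cannot be a p-attribute.
        raise ValueError("tiers are not token-aligned with the word tier")
    lines = ["<s trans=%s>" % quoteattr(translation)]
    for row in zip(*columns):
        # One token per line, one tab-separated column per p-attribute.
        lines.append("\t".join(row))
    lines.append("</s>")
    return "\n".join(lines)


vertical = tiers_to_vertical(
    ["Pirat-a barb-am hab-et", "pirate-NOM beard-ACC have-3SG"],
    "The pirate has a beard",
)
print(vertical)
```

Adding a further gloss layer is then just a matter of passing a third string in the list, which yields a third tab-separated column.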
In CQPweb I set the morpheme gloss as the primary annotation, so that it can be searched like a tag in the Simple Query language (CEQL), e.g. _*-NOM to find all nominatives.
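As a rough illustration of what a query like _*-NOM selects, the following Python sketch filters the vertical file on its second column with a shell-style wildcard. This only approximates the matching behaviour and is not how CQPweb or CEQL is implemented; the function name find_by_gloss is made up for the example. In plain CQP syntax, and assuming the gloss p-attribute were named gloss, the comparable query would be along the lines of [gloss=".*-NOM"].

```python
from fnmatch import fnmatchcase


def find_by_gloss(vertical_text, pattern):
    """Return (word, gloss) pairs whose gloss matches a wildcard pattern.

    Hypothetical helper: assumes column 1 is the word and column 2 the gloss.
    """
    hits = []
    for line in vertical_text.splitlines():
        if line.startswith("<"):
            continue  # skip structural lines such as <s ...> and </s>
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        word, gloss = fields[0], fields[1]
        if fnmatchcase(gloss, pattern):
            hits.append((word, gloss))
    return hits


sample = (
    '<s trans="The pirate has a beard">\n'
    "pirat-a\tpirate-NOM\n"
    "barb-am\tbeard-ACC\n"
    "hab-et\thave-3SG\n"
    "</s>"
)
nominatives = find_by_gloss(sample, "*-NOM")
print(nominatives)
```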
>> how did you end up displaying the output?
I have added a special "field mode" to CQPweb for corpora like this. It switches the concordance display to a mode which re-builds the familiar 3-line-example format.
See attached screenshot (from a small Bodo corpus).
Field mode is, alas, not as well documented in the sysadmin manual as it ought to be... and it's not fully implemented for the extended-context view.
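For anyone building their own display, the idea of re-building the three-line format from the vertical data can be sketched as below. This is only an illustration under stated assumptions, not CQPweb's field-mode code: the function name rebuild_interlinear is invented, the translation is assumed to sit in a trans attribute on the <s> tag, and every token line is assumed to have the same number of columns.

```python
def rebuild_interlinear(vertical_text):
    """Rebuild an aligned interlinear example from one <s> region.

    Hypothetical helper: one output line per column, then the translation.
    """
    rows, translation = [], ""
    for line in vertical_text.splitlines():
        if line.startswith("<s ") and 'trans="' in line:
            # Naive attribute extraction, good enough for a sketch.
            translation = line.split('trans="', 1)[1].rsplit('"', 1)[0]
        elif line.startswith("<") or not line.strip():
            continue
        else:
            rows.append(line.split("\t"))
    if not rows:
        return '"%s"' % translation
    # Each token column is as wide as its longest cell across all tiers.
    widths = [max(len(cell) for cell in row) for row in rows]
    out = []
    for tier in range(len(rows[0])):
        cells = [row[tier].ljust(w) for row, w in zip(rows, widths)]
        out.append(" ".join(cells).rstrip())
    out.append('"%s"' % translation)
    return "\n".join(out)


sample = (
    '<s trans="The pirate has a beard">\n'
    "pirat-a\tpirate-NOM\n"
    "barb-am\tbeard-ACC\n"
    "hab-et\thave-3SG\n"
    "</s>"
)
display = rebuild_interlinear(sample)
print(display)
```

Padding each token to the width of its longest annotation keeps the word and its gloss vertically aligned, which is what makes the three-line layout readable.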
best
Andrew.
-----Original Message-----
From: cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it] On Behalf Of Ruprecht von Waldenfels
Sent: 07 December 2016 12:27
To: Open source development of the Corpus WorkBench
Subject: [CWB] Field Word Data (ELAN)
Hi,
I was wondering about projects that use CWB to display field work data,
i.e., text with (multiple levels of) morpheme-level glossing. Could you
share your experiences? How did you approach the representation of these
levels in the CWB format, how did you end up displaying the output?
I am planning to adapt our current spoken-data interface
(parasolcorpus.org/Pushkino) to handle glossed data and will write a
converter from the ELAN format to handle this. I would greatly
appreciate any comments on how this is best done, how to handle the
display, and whether there are any projects that already do this.
Best wishes!
Ruprecht
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bodo.png
Type: image/png
Size: 26141 bytes
Desc: bodo.png
URL: <http://liste.sslmit.unibo.it/pipermail/cwb/attachments/20161208/4852750e/attachment-0001.png>