<html><head></head><body><div style="font-family: Verdana;font-size: 12.0px;"><div style="font-family: Verdana;font-size: 12.0px;">
<div>
<div>Thanks, Andrew!<br/>
This is helpful.<br/>
<br/>
I wonder how to deal with multiple lines of glossing that are dependent on each other, e.g.,<br/>
<pre>Pirat-a barb-am hab-et
pirate-NOM beard-ACC have-3SG
NOUN-NOM NOUN-ACC VERB-3SG
"The pirate has a beard"</pre>
This is a silly example, of course, but it highlights the problem: in an ideal world, I would like to be able to query for word forms that contain a morpheme glossed as the NOUN 'pirate', i.e., a query that makes use of the alignment across the gloss tiers. This could be done by adding a further p-attribute that holds a set, e.g.,<br/>
<pre><s trans="The pirate has a beard">
pirat-a	pirate-NOM	NOUN-NOM	|pirat:pirate:NOUN|a:NOM:NOM|
barb-am	beard-ACC	NOUN-ACC	|barb:beard:NOUN|am:ACC:ACC|
hab-et	have-3SG	VERB-3SG	|hab:have:VERB|et:3SG:3SG|
</s></pre>
This would allow me to easily search for, say, a morpheme 'et' that is a third person singular marker without having to specify its position in the glossed word form. I realize the third level is not very functional here, but it stands for the real possibility of multiple glossing layers that relate to each other.<br/>
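<br/>
A minimal Python sketch of how such a set-valued column could be generated from the aligned tiers. It assumes that morphemes are separated by hyphens and that every tier has the same number of words and morphemes; the function names feature_set and to_vertical are hypothetical, chosen only for illustration.<br/>

```python
def feature_set(*forms):
    """Build a set value such as |hab:have:VERB|et:3SG:3SG| for one word.

    Each argument is the same word on one tier, e.g. 'hab-et',
    'have-3SG', 'VERB-3SG'. Morphemes are assumed to be hyphen-separated.
    """
    segmented = [form.split("-") for form in forms]
    if len({len(parts) for parts in segmented}) != 1:
        raise ValueError(f"tiers are not aligned: {forms!r}")
    # Pair up the n-th morpheme of every tier and join the pieces with ':'.
    members = [":".join(parts) for parts in zip(*segmented)]
    return "|" + "|".join(members) + "|"


def to_vertical(*tier_lines):
    """Flip aligned horizontal tiers into tab-separated token rows."""
    words_per_tier = [line.split() for line in tier_lines]
    if len({len(words) for words in words_per_tier}) != 1:
        raise ValueError("tiers differ in word count")
    rows = []
    for forms in zip(*words_per_tier):
        # One column per tier, plus the set-valued column at the end.
        rows.append("\t".join(list(forms) + [feature_set(*forms)]))
    return rows


rows = to_vertical(
    "pirat-a barb-am hab-et",
    "pirate-NOM beard-ACC have-3SG",
    "NOUN-NOM NOUN-ACC VERB-3SG",
)
print("\n".join(rows))
```

If the last column were encoded as a feature-set p-attribute called, say, morph (a hypothetical name), a CQP query along the lines of [morph contains "et:3SG:3SG"] should find the third person singular marker regardless of its position in the word; this is worth checking against the CQP documentation on feature sets.<br/>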
<br/>
None of these solutions is very elegant, it seems to me: they merely succeed in making searches possible, but I cannot think of a better way.<br/>
<br/>
Best wishes, thanks again!<br/>
Ruprecht
<div style="margin: 10.0px 5.0px 5.0px 10.0px;padding: 10.0px 0 10.0px 10.0px;border-left: 2.0px solid rgb(195,217,229);">
<div style="margin: 0 0 10.0px 0;"><b>Sent:</b> Thursday, 08 December 2016 at 06:50<br/>
<b>From:</b> "Hardie, Andrew" <a.hardie@lancaster.ac.uk><br/>
<b>To:</b> "Open source development of the Corpus WorkBench" <cwb@sslmit.unibo.it><br/>
<b>Subject:</b> Re: [CWB] Field Word Data (ELAN)</div>
<div>Hi Ruprecht,<br/>
<br/>
>> How did you approach the representation of these levels in the CWB format<br/>
<br/>
As p-attributes for layers whose tokenisation matches the word tier. As s-attributes for those that don't.<br/>
<br/>
EG, starting with a tiered example like:<br/>
<br/>
Pirat-a barb-am hab-et<br/>
pirate-NOM beard-ACC have-3SG<br/>
"The pirate has a beard"<br/>
<br/>
(in whatever underlying format...)<br/>
<br/>
... I flip it horizontal -> vertical to the following CWB input file (cols separated by tabs as usual)<br/>
<br/>
<s trans="The pirate has a beard"><br/>
pirat-a pirate-NOM<br/>
barb-am beard-ACC<br/>
hab-et have-3SG<br/>
</s><br/>
<br/>
If there are multiple layers of glossing, then I just add more p-attributes.<br/>
<br/>
In CQPweb I set the morpheme-gloss as the primary annotation, so that it can be searched like a tag in the Simple Query language (CEQL). EG _*-NOM to find all nominatives.<br/>
<br/>
>> how did you end up displaying the output?<br/>
<br/>
I have added a special "field mode" to CQPweb for corpora like this. It switches the concordance display to a mode which re-builds the familiar 3-line-example format.<br/>
<br/>
See attached screenshot (from a small Bodo corpus).<br/>
<br/>
Field mode is, alas, not as well documented in the sysadmin manual as it ought to be... and it's not fully implemented for the extended-context view.<br/>
<br/>
best<br/>
<br/>
Andrew.<br/>
<br/>
<br/>
-----Original Message-----<br/>
From: cwb-bounces@sslmit.unibo.it [mailto:cwb-bounces@sslmit.unibo.it] On Behalf Of Ruprecht von Waldenfels<br/>
Sent: 07 December 2016 12:27<br/>
To: Open source development of the Corpus WorkBench<br/>
Subject: [CWB] Field Word Data (ELAN)<br/>
<br/>
<br/>
Hi,<br/>
<br/>
I was wondering about projects that use CWB to display field work data,<br/>
i.e., text with (multiple levels of) morpheme-level glossing. Could you<br/>
share your experiences? How did you approach the representation of these<br/>
levels in the CWB format, how did you end up displaying the output?<br/>
<br/>
I am planning to adapt our current spoken-data interface<br/>
(parasolcorpus.org/Pushkino) to handle glossed data and will write a<br/>
converter from the ELAN format to handle this. I would greatly<br/>
appreciate any comments on how this is best done, how to handle the<br/>
display, and whether there are any projects that already do this.<br/>
<br/>
Best wishes!<br/>
Ruprecht<br/>
<br/>
_______________________________________________<br/>
CWB mailing list<br/>
CWB@sslmit.unibo.it<br/>
<a href="http://liste.sslmit.unibo.it/mailman/listinfo/cwb" target="_blank">http://liste.sslmit.unibo.it/mailman/listinfo/cwb</a><br/>
</div>
</div>
</div>
</div>
</div>
<div> </div>
<div class="signature">Ruprecht von Waldenfels<br/>
Sedanstr. 3<br/>
D-93055 Regensburg<br/>
<br/>
Tel. +49 941 78 03 115<br/>
Mob. +49 163 230 34 23<br/>
Skype: rvwaldenfels</div></div></body></html>