<html><head></head><body><div style="font-family: Verdana;font-size: 12.0px;"><div style="font-family: Verdana;font-size: 12.0px;">

<div>&nbsp;

<div>Thanks, Andrew!<br/>

This is helpful.<br/>

<br/>

I wonder how to deal with multiple lines of glossing that are dependent on each other, e.g.,<br/>

&nbsp;

<pre>Pirat-a    barb-am   hab-et
pirate-NOM beard-ACC have-3SG
NOUN-NOM   NOUN-ACC  VERB-3SG
&quot;The pirate has a beard&quot;</pre>

This is a silly example, of course, but it highlights the problem: in an ideal world, I would like to be able to query for word forms that involve a morpheme with the NOUN &#39;pirate&#39;, i.e., to write queries that utilize the alignment within the glosses. This could be done by adding a further p-attribute that holds a set, e.g.,<br/>

&nbsp;

<pre>&lt;s trans=&quot;The pirate has a beard&quot;&gt;
pirat-a  pirate-NOM  NOUN-NOM  &#124;pirat:pirate:NOUN&#124;a:NOM:NOM&#124;
barb-am  beard-ACC   NOUN-ACC  &#124;barb:beard:NOUN&#124;am:ACC:ACC&#124;
hab-et   have-3SG    VERB-3SG  &#124;hab:have:VERB&#124;et:3SG:3SG&#124;
&lt;/s&gt;</pre>

This would allow me to easily search for, say, a morpheme &#39;et&#39; that is a third person singular marker without having to specify its position in the glossed word form. I realize the third level is not very functional here, but it stands for the (real) possibility of multiple glosses that relate to each other.<br/>
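<br/>
A minimal sketch of how such a set-valued column could be generated from the aligned tiers. It assumes that morphemes are separated by hyphens only and that every tier has the same number of morphemes per word; the function names are made up for illustration and are not part of CWB.<br/>
<pre>
```python
def morpheme_set(form, gloss, category):
    """Align hyphen-separated morphemes across three tiers and return
    a set-style value such as |pirat:pirate:NOUN|a:NOM:NOM|."""
    tiers = [form.split("-"), gloss.split("-"), category.split("-")]
    # Refuse to guess when the tiers disagree on the morpheme count.
    if len(set(len(tier) for tier in tiers)) != 1:
        raise ValueError("tiers disagree on morpheme count: " + form)
    # One form:gloss:category triple per morpheme, in word order.
    triples = [":".join(parts) for parts in zip(*tiers)]
    return "|" + "|".join(triples) + "|"


def vertical_row(form, gloss, category):
    """Build one tab-separated input line with the set as a fourth column."""
    return "\t".join([form, gloss, category,
                      morpheme_set(form, gloss, category)])


row = vertical_row("hab-et", "have-3SG", "VERB-3SG")
print(row)
```
</pre>
If the fourth column is declared as a feature-set attribute when encoding (cwb-encode marks these with a trailing slash, e.g. -P morph/, where the name &quot;morph&quot; is only a placeholder), a CQP query along the lines of [morph contains &quot;et:3SG:.*&quot;] should find the marker regardless of its position in the word form. Note that cwb-encode may normalise the order of the set members.<br/>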

<br/>

None of these solutions is very elegant, it seems to me; they merely succeed in making searches possible, but I cannot think of a better way.<br/>

<br/>

Best wishes, thanks again!<br/>

Ruprecht

<div style="margin: 10.0px 5.0px 5.0px 10.0px;padding: 10.0px 0 10.0px 10.0px;border-left: 2.0px solid rgb(195,217,229);">

<div style="margin: 0 0 10.0px 0;"><b>Sent:</b>&nbsp;Thursday, 08 December 2016 at 06:50<br/>

<b>From:</b>&nbsp;&quot;Hardie, Andrew&quot; &lt;a.hardie@lancaster.ac.uk&gt;<br/>

<b>To:</b>&nbsp;&quot;Open source development of the Corpus WorkBench&quot; &lt;cwb@sslmit.unibo.it&gt;<br/>

<b>Subject:</b>&nbsp;Re: [CWB] Field Word Data (ELAN)</div>


<div>Hi Ruprecht,<br/>

<br/>

&gt;&gt; How did you approach the representation of these levels in the CWB format<br/>

<br/>

As p-attributes for layers whose tokenisation matches the word tier. As s-attributes for those that don&#39;t.<br/>

<br/>

E.g., starting with a tiered example like:<br/>

<br/>

Pirat-a barb-am hab-et<br/>

pirate-NOM beard-ACC have-3SG<br/>

&quot;The pirate has a beard&quot;<br/>

<br/>

(in whatever underlying format...)<br/>

<br/>

... I flip it horizontal -&gt; vertical to the following CWB input file (cols separated by tabs as usual)<br/>

<br/>

&lt;s trans=&quot;The pirate has a beard&quot;&gt;<br/>

pirat-a pirate-NOM<br/>

barb-am beard-ACC<br/>

hab-et have-3SG<br/>

&lt;/s&gt;<br/>

<br/>

If there are multiple layers of glossing, then I just add more p-attributes.<br/>
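<br/>
The horizontal-to-vertical flip described above can be sketched roughly as follows. This is only an illustration, assuming each tier is a plain string with whitespace-separated words and that all tiers contain the same number of words; the function name is hypothetical, and the surrounding s-attribute lines are left to the caller.<br/>
<pre>
```python
def flip_tiers(*tiers):
    """Turn horizontally aligned tiers (one string per tier) into
    vertical rows with one tab-separated column per tier."""
    columns = [tier.split() for tier in tiers]
    # Every tier must supply exactly one cell per word.
    if len(set(len(column) for column in columns)) != 1:
        raise ValueError("tiers have different numbers of words")
    return ["\t".join(cells) for cells in zip(*columns)]


rows = flip_tiers("pirat-a barb-am hab-et",
                  "pirate-NOM beard-ACC have-3SG")
print("\n".join(rows))
```
</pre>
Passing a further tier as an extra argument adds a further column, which corresponds to one more p-attribute.<br/>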

<br/>

In CQPweb I set the morpheme gloss as the primary annotation, so that it can be searched like a tag in the Simple Query language (CEQL), e.g. _*-NOM to find all nominatives.<br/>

<br/>

&gt;&gt; how did you end up displaying the output?<br/>

<br/>

I have added a special &quot;field mode&quot; to CQPweb for corpora like this. It switches the concordance display to a mode which re-builds the familiar 3-line-example format.<br/>

<br/>

See attached screenshot (from a small Bodo corpus).<br/>

<br/>

Field mode is, alas, not as well documented in the sysadmin manual as it ought to be... and it&#39;s not fully implemented for the extended-context view.<br/>

<br/>

best<br/>

<br/>

Andrew.<br/>

<br/>

<br/>

-----Original Message-----<br/>

From: cwb-bounces@sslmit.unibo.it [mailto:cwb-bounces@sslmit.unibo.it] On Behalf Of Ruprecht von Waldenfels<br/>

Sent: 07 December 2016 12:27<br/>

To: Open source development of the Corpus WorkBench<br/>

Subject: [CWB] Field Word Data (ELAN)<br/>

<br/>

<br/>

Hi,<br/>

<br/>

I was wondering about projects that use CWB to display field work data,<br/>

i.e., text with (multiple levels of) morpheme-level glossing. Could you<br/>

share your experiences? How did you approach the representation of these<br/>

levels in the CWB format, how did you end up displaying the output?<br/>

<br/>

I am planning to adapt our current spoken-data interface<br/>

(parasolcorpus.org/Pushkino) to handle glossed data and will write a<br/>

converter from the ELAN format to handle this. I would greatly<br/>

appreciate any comments on how this is best done, how to handle the<br/>

display, and whether there are any projects that already do this.<br/>

<br/>

Best wishes!<br/>

Ruprecht<br/>

<br/>

_______________________________________________<br/>

CWB mailing list<br/>

CWB@sslmit.unibo.it<br/>

<a href="http://liste.sslmit.unibo.it/mailman/listinfo/cwb" target="_blank">http://liste.sslmit.unibo.it/mailman/listinfo/cwb</a><br/>

</div>

</div>

</div>

</div>

</div>


<div>&nbsp;</div>


<div class="signature">Ruprecht von Waldenfels<br/>

Sedanstr. 3<br/>

D-93055 Regensburg<br/>

<br/>

Tel. +49 941 78 03 115<br/>

Mob. +49 163 230 34 23<br/>

Skype: rvwaldenfels</div></div></body></html>