<div dir="ltr"><div dir="ltr"><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div dir="ltr">Thank you, it works perfectly now. <br>On my first try, without the close tag, I had kept the :0 encoding, and it found only the first occurrence of <pause>.<br><br>Thank you so much for your help,<br>Stefania</div><div dir="ltr"><br><div>---</div><div><b>Prof. Stefania Spina</b><br>Università per Stranieri di Perugia<br>Delegata alla Ricerca </div><div><a href="mailto:stefania.spina@unistrapg.it" target="_blank">stefania.spina@unistrapg.it</a><br><a href="https://unistrapg.academia.edu/StefaniaSpina" target="_blank">https://www.researchgate.net/profile/Stefania_Spina2</a><br><br></div></div></div></div></div></div></div></div></div></div></div></div></div></div></div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue 12 May 2020 at 08:58, Hardie, Andrew <<a href="mailto:a.hardie@lancaster.ac.uk">a.hardie@lancaster.ac.uk</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">>>>>Do you remember whether cwb-encode will read <pause dur=short/> as an open tag?<br>
<br>
Yes. Line 1886ff:<br>
<br>
if (buf[j-1] == '/') {<br>
j--; /* empty tag: remove "/" from annotation string and handle as an open tag */<br>
/* Note that this implicitly closes the previous instance of the empty tag:<br>
* - this means that we can work with empty elements by looking just at the "open-point" of each range;<br>
* - it also means that empty tags with metadata at the start of each text will automatically extend over the full text.<br>
* However, the approach sketched here only works with "flat" s-attributes declared without recursion (even without :0). */<br>
}<br>
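<br>
To make the effect of that comment concrete, here is a minimal Python sketch of how the ranges come out when every open (or empty) tag implicitly closes the previous one. It is not the cwb-encode code: the function name flat_ranges and the simplified one-item-per-line input are assumptions for illustration, as is the choice to let the last range run to the final token and to skip ranges that contain no tokens.<br>

```python
import re


def flat_ranges(lines, tag="pause"):
    """Return (start, end) corpus positions for a flat s-attribute.

    Assumption: `lines` is a simplified vertical file with one token or
    one tag per line, and close tags have already been removed.
    """
    # Matches <pause>, <pause dur=short> and <pause dur=short/>.
    open_tag = re.compile(r"^<" + re.escape(tag) + r"(\s[^>]*)?/?>$")
    ranges = []
    start = None  # corpus position where the current range opened
    cpos = 0      # corpus position of the next token
    for line in lines:
        line = line.strip()
        if open_tag.match(line):
            # A new open tag closes the previous range, if it has tokens.
            if start is not None and cpos > start:
                ranges.append((start, cpos - 1))
            start = cpos
        elif line:
            cpos += 1  # any other non-empty line counts as one token
    # Assumption: the final range extends to the last token of the input.
    if start is not None and cpos > start:
        ranges.append((start, cpos - 1))
    return ranges
```

Under these assumptions, each range starts at the token that follows a pause tag, so a query only needs to look at the start of each range.<br>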
<br>
>>>> Perhaps worth a new flag (-E pause+dur) which only accepts empty elements and doesn't allow nesting?<br>
<br>
I think that sort of additional complication should be left till v4, no?<br>
<br>
Andrew.<br>
<br>
-----Original Message-----<br>
From: <a href="mailto:cwb-bounces@sslmit.unibo.it" target="_blank">cwb-bounces@sslmit.unibo.it</a> <<a href="mailto:cwb-bounces@sslmit.unibo.it" target="_blank">cwb-bounces@sslmit.unibo.it</a>> On Behalf Of Stefan Evert<br>
Sent: 12 May 2020 07:22<br>
To: CWBdev Mailing List <<a href="mailto:cwb@sslmit.unibo.it" target="_blank">cwb@sslmit.unibo.it</a>><br>
Subject: Re: [CWB] empty element<br>
<br>
<br>
> On 11 May 2020, at 23:04, Stefania Spina <<a href="mailto:stefania.spina@unistrapg.it" target="_blank">stefania.spina@unistrapg.it</a>> wrote:<br>
><br>
> Thank you Stefan and Andrew!<br>
> And will it also work if <pause> has an attribute?<br>
> <pause dur="short"></pause><br>
<br>
Yes. In the BNCweb solution, you would have to search the tags_before attribute for the full shape of the XML tag, e.g.<br>
<br>
[tags_before = '.*<pause\s+dur="short">.*']<br>
<br>
which becomes much more complicated if <pause> can carry multiple attributes in varying order. For the feature-set variant, we devised a special encoding, e.g. something like<br>
<br>
|pause|pause_dur=short|<br>
<br>
that you can search with<br>
<br>
[tags_before contains "pause_dur=short"]<br>
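<br>
As a rough illustration of that encoding, the following Python sketch builds such a feature-set value from the tags that precede a token. The helper name feature_set and the (name, attributes) input format are hypothetical; only the |pause|pause_dur=short| shape comes from the example above. It is an assumption here that items are sorted and that an empty set is written as a single "|".<br>

```python
def feature_set(tags):
    """Build a feature-set string such as |pause|pause_dur=short|.

    `tags` is a list of (name, attrs) pairs, where attrs is a dict of
    attribute names to values (a hypothetical input format).
    """
    items = set()
    for name, attrs in tags:
        items.add(name)  # the bare tag name, e.g. "pause"
        for key, value in attrs.items():
            # One item per attribute, independent of attribute order.
            items.add(f"{name}_{key}={value}")
    if not items:
        return "|"  # assumed notation for the empty set
    return "|" + "|".join(sorted(items)) + "|"


print(feature_set([("pause", {"dur": "short"})]))
```

Because every attribute becomes its own item, the order of attributes in the original XML tag no longer matters for the query.<br>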
<br>
In Andrew's solution, which I like better, you have to (i) remove any close tags, so that each range extends to the next <pause> item, and (ii) encode with the declaration<br>
<br>
-S pause+dur<br>
<br>
Omitting a nesting specifier (such as ":0") ensures that ranges are automatically closed when the next open tag is encountered.<br>
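<br>
Step (i) can be done with a small preprocessing pass over the input before encoding. The Python sketch below is one possible way, assuming the close tags stand on their own lines as in a vertical file; the function name strip_close_tags is made up for this example.<br>

```python
import re


def strip_close_tags(lines, tag="pause"):
    """Drop lines that consist only of the close tag, e.g. </pause>.

    Assumption: each tag stands on a line of its own. All other lines
    are returned unchanged and in their original order.
    """
    close_tag = re.compile(r"^\s*</" + re.escape(tag) + r">\s*$")
    return [line for line in lines if not close_tag.match(line)]


sample = ['<pause dur="short">', "</pause>", "word"]
print(strip_close_tags(sample))
```

The cleaned lines can then be written to a file and encoded with the -S pause+dur declaration given above.<br>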
<br>
@Andrew: Do you remember whether cwb-encode will read<br>
<br>
<pause dur=short/><br>
<br>
as an open tag? Perhaps worth a new flag (-E pause+dur) which only accepts empty elements and doesn't allow nesting?<br>
<br>
Best,<br>
Stefan<br>
<br>
_______________________________________________<br>
CWB mailing list<br>
<a href="mailto:CWB@sslmit.unibo.it" target="_blank">CWB@sslmit.unibo.it</a><br>
<a href="https://eur02.safelinks.protection.outlook.com/?url=http%3A%2F%2Fliste.sslmit.unibo.it%2Fmailman%2Flistinfo%2Fcwb&amp;data=02%7C01%7Ca.hardie%40lancaster.ac.uk%7C4968de9cadcc4dc3292d08d7f63ce27b%7C9c9bcd11977a4e9ca9a0bc734090164a%7C0%7C1%7C637248613674777211&amp;sdata=1g83Cyz45sFsDq8YCF%2F6Zc1XJvFYEMKNtZ8zmfCPI0U%3D&amp;reserved=0" rel="noreferrer" target="_blank">https://eur02.safelinks.protection.outlook.com/?url=http%3A%2F%2Fliste.sslmit.unibo.it%2Fmailman%2Flistinfo%2Fcwb&amp;data=02%7C01%7Ca.hardie%40lancaster.ac.uk%7C4968de9cadcc4dc3292d08d7f63ce27b%7C9c9bcd11977a4e9ca9a0bc734090164a%7C0%7C1%7C637248613674777211&amp;sdata=1g83Cyz45sFsDq8YCF%2F6Zc1XJvFYEMKNtZ8zmfCPI0U%3D&amp;reserved=0</a><br>
</blockquote></div>