<div dir="ltr">Oh, thank you Andrew! The "Manage annotations" menu helped :)<br></div><div class="gmail_extra"><br><div class="gmail_quote">On 9 March 2018 at 09:59, Hardie, Andrew <span dir="ltr"><<a href="mailto:a.hardie@lancaster.ac.uk" target="_blank">a.hardie@lancaster.ac.uk</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div link="blue" vlink="purple" lang="EN-GB">
<div class="m_-1132865526339610344WordSection1">
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#1f497d">Have you configured the pos attribute as your primary annotation (either by setting it as such when indexing, or via the “Manage annotation”
controls)?<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#1f497d">“Show tags” displays the primary annotation, but the system needs to know which that is in order to do so.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#1f497d">Andrew.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><b><span style="font-size:11.0pt;font-family:"Calibri",sans-serif" lang="EN-US">From:</span></b><span style="font-size:11.0pt;font-family:"Calibri",sans-serif" lang="EN-US"> <a href="mailto:cwb-bounces@sslmit.unibo.it" target="_blank">cwb-bounces@sslmit.unibo.it</a> [mailto:<a href="mailto:cwb-bounces@sslmit.unibo.it" target="_blank">cwb-bounces@sslmit.<wbr>unibo.it</a>]
<b>On Behalf Of </b>mansur<br>
<b>Sent:</b> 09 March 2018 06:48<br>
<b>To:</b> Open source development of the Corpus WorkBench <<a href="mailto:cwb@sslmit.unibo.it" target="_blank">cwb@sslmit.unibo.it</a>><br>
<b>Subject:</b> Re: [CWB] Escape "<" and ">" symbols<u></u><u></u></span></p>
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<p class="MsoNormal" style="margin-bottom:12.0pt">Hello!<u></u><u></u></p>
</div>
<p class="MsoNormal">According to your advice I'm using tags like this:<u></u><u></u></p>
</div>
<p class="MsoNormal" style="margin-bottom:12.0pt">n:nom:pl<u></u><u></u></p>
</div>
<p class="MsoNormal" style="margin-bottom:12.0pt">But when I press "Show tags" in the concordance, it still does not show any tags:<br>
Нурый_ -_ Биктимернең_ энесе_ ,_ комсомол_ ячейкасы_ секретаре_ ._ <br>
Мөршидә_ -_ Нурыйның_ йөри_ торгам_ кызы_ ._ <br>
Әпрэй_ -_ ялкау_ ._<u></u><u></u></p>
</div>
<p class="MsoNormal" style="margin-bottom:12.0pt">Maybe I need to configure it somewhere?<u></u><u></u></p>
</div>
<p class="MsoNormal">Columns in my vrt file:<u></u><u></u></p>
</div>
<p class="MsoNormal">word<u></u><u></u></p>
</div>
<p class="MsoNormal">lemma<u></u><u></u></p>
</div>
<p class="MsoNormal">pos<u></u><u></u></p>
</div>
<p class="MsoNormal" style="margin-bottom:12.0pt">tags<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">Thank you!<u></u><u></u></p>
</div>
<p class="MsoNormal">With best wishes,<u></u><u></u></p>
</div>
<p class="MsoNormal">Mansur<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<p class="MsoNormal">On 5 March 2018 at 15:53, Hardie, Andrew <<a href="mailto:a.hardie@lancaster.ac.uk" target="_blank">a.hardie@lancaster.ac.uk</a>> wrote:<u></u><u></u></p>
<blockquote style="border:none;border-left:solid #cccccc 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-right:0cm">
<div>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#1f497d">If you use | then you can treat the attribute as a feature set. This might be useful. You can see a
description of what feature sets allow you to do in the encoding tutorial. </span>
<u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#1f497d"> </span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#1f497d">If you don’t care about it being a feature set, then you can use any character. People often do use
: as a joiner, but there is no reason not to use ; or , instead if that makes more sense for your purposes. I’d suggest not using . because it is a regular expression metacharacter.</span><u></u><u></u></p>
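<p class="MsoNormal">The point about joiner characters can be illustrated with a minimal sketch. It uses Python's re module as a stand-in for CQP's regular expressions, and the tag values and variable names are hypothetical examples, not taken from any real corpus. The last line only mimics the idea of a feature set by splitting a "|"-delimited value; it does not use CWB itself.</p>
<pre>
```python
import re

# Hypothetical tag values, joined with ":" as in the thread.
tags = ["n:sg:nom", "n:pl:nom", "v:sg:pst"]

# With ":" as the joiner, the pattern needs no escaping.
colon_hits = [t for t in tags if re.fullmatch(r"n:.*:nom", t)]

# With "." as the joiner, an unescaped "." matches any character,
# so the pattern "n.sg" also matches the unrelated string "nxsg".
dot_unescaped = bool(re.fullmatch(r"n.sg", "nxsg"))
dot_escaped = bool(re.fullmatch(r"n\.sg", "nxsg"))

# A "|"-delimited value can be split into its individual features.
features = set(filter(None, "|n|sg|px3sp|nom|".split("|")))
```
</pre>
<p class="MsoNormal">Here the unescaped pattern matches "nxsg" while the escaped one does not, which is the practical cost of choosing "." as a joiner.</p>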
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#1f497d"> </span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#1f497d">best</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#1f497d"> </span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#1f497d">Andrew.</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#1f497d"> </span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#1f497d"> </span><u></u><u></u></p>
<p class="MsoNormal"><b><span style="font-size:11.0pt;font-family:"Calibri",sans-serif" lang="EN-US">From:</span></b><span style="font-size:11.0pt;font-family:"Calibri",sans-serif" lang="EN-US">
<a href="mailto:cwb-bounces@sslmit.unibo.it" target="_blank">cwb-bounces@sslmit.unibo.it</a> [mailto:<a href="mailto:cwb-bounces@sslmit.unibo.it" target="_blank">cwb-bounces@sslmit.<wbr>unibo.it</a>]
<b>On Behalf Of </b>mansur<br>
<b>Sent:</b> 05 March 2018 12:46<br>
<b>To:</b> Open source development of the Corpus WorkBench <<a href="mailto:cwb@sslmit.unibo.it" target="_blank">cwb@sslmit.unibo.it</a>><br>
<b>Subject:</b> Re: [CWB] Escape "<" and ">" symbols</span><u></u><u></u></p>
<p class="MsoNormal"> <u></u><u></u></p>
<div>
<div>
<div>
<p class="MsoNormal" style="margin-bottom:12.0pt">Hello, Stefan, Andrew and others!!!<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal" style="margin-bottom:12.0pt">You advised using a tagging style like:<br>
<br>
n:sg:px3sp:nom<br>
or<br>
n|sg|px3sp|nom<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">Is there any particular reason to use ":" or "|" instead of "<" or ">"? Is it possible to use "," (comma)? What do you usually use in your projects?<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">Thank you!<u></u><u></u></p>
</div>
<p class="MsoNormal">With best wishes,<u></u><u></u></p>
</div>
<p class="MsoNormal" style="margin-bottom:12.0pt">Mansur<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
<div>
<p class="MsoNormal">On 22 February 2018 at 11:52, Stefan Evert <<a href="mailto:stefanML@collocations.de" target="_blank">stefanML@collocations.de</a>> wrote:<u></u><u></u></p>
<blockquote style="border:none;border-left:solid #cccccc 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0cm;margin-bottom:5.0pt">
<p class="MsoNormal" style="margin-left:4.8pt">
Dear Mansur,<br>
<br>
most of the remaining issues are related to CQPweb, so Andrew will be in a much better position to answer them and help you with the debugging. Some of them are clearly (mis-)configuration issues, e.g. the failure to locate the CEQL backend that is part of
CQPweb or the failure to run CQP.<br>
<br>
Are you working with an up-to-date version of CQPweb checked out from the SVN repository?<br>
<br>
<br>
> 3) After rebooting computer any search does not work at all:<br>
> ERROR: CQP backend startup failed; the reported CQP version [] could not be parsed.<br>
> But from the command line I can perform a search with 'cqp -e' and it seems to be working, at least I can see search results.<br>
<br>
This suggests that you have CQP installed, but in a "private" path that's only visible to your user account and not to the Web server running CQPweb. You may also need to configure CQPweb and set appropriate paths there.<br>
<br>
> 4) Is it possible to choose ranges of periods in search according to the 'date'?<br>
> <text id="" date=?????><br>
<br>
I think Andrew is working on support for date attributes in CQPweb.<br>
<br>
In plain CQP, there are two ways of doing date searches:<br>
<br>
a) The reasonable way: Store your dates in a simple standard format – I prefer ISO YYYY-MM-DD, so alphabetical and chronological sort order are the same – and then construct regular expressions for your suitable date ranges, e.g. in the global constraint of
a CQP query:<br>
<br>
… :: match.text_date = "2011-03.*"; # anything in March 2011<br>
<br>
… :: match.text_date = "1990-(01-(1[2-9]|[23]\d)|02-.*|03-([0-1]\d|2[0-4]))"; # 12 Jan 1990 .. 24 Mar 1990<br>
<br>
b) The "I'm a Unix hacker" way: convert your dates to 32-bit integers and use numeric comparisons. The obvious choice would be consecutive numbers for days (or even seconds as in Unix timestamps), but conversion from/to human-readable dates will be complicated.
However, you could encode the ISO-format above _without_ the hyphens to get 8-digit numbers, e.g.<br>
<br>
<text id="…" date="20180222"><br>
<br>
and then cast to integers for numerical comparisons:<br>
<br>
… :: int(match.text_date) >= 19900112 & int(match.text_date) <= 19900324;<br>
<br>
Nice trick, isn't it?<br>
<br>
> 5) When I press 'Show tags' button I get<br>
> 2012_ нче_ елда_ республикада_ 55_ мең_ 839_ бала_ дөньяга_ килгән_ ._<br>
> but no tags.<br>
<br>
That's because CQPweb failed to do proper HTML-escaping for the annotation strings (which is not only inconvenient but also a security risk).<br>
<br>
@Andrew: has this bug been fixed in the latest CQPweb code?<br>
<br>
I've been bitten by similar issues before and would recommend avoiding HTML metacharacters (and other funny things) in annotation strings. Better recode to something like<br>
<br>
n:sg:px3sp:nom<br>
<br>
or even<br>
<br>
|n|sg|px3sp|nom|<br>
<br>
so you can use the "contains" operator in searches.<br>
<br>
> I think it is maybe because I didn't replace "<" and ">" in my morphological tags to their XML entities yet. Please, correct me if I'm wrong.<br>
<br>
That won't help! With -x, cwb-encode will decode the XML entities in your input file and you'll end up with < and > in the indexed corpus. You could encode without the -x flag, but then your annotation strings will be<br>
<br>
&lt;n&gt;&lt;sg&gt;&lt;px3sp&gt;&lt;nom&gt;<br>
<br>
which happens to display nicely only until HTML escaping in CQPweb is fixed – and you will have to search for<br>
<br>
[pos = ".*&lt;nom&gt;.*"]<br>
<br>
instead of<br>
<br>
[pos = ".*<nom>.*"]<br>
<br>
> 7) I also saw the button 'Export corpus -> Export whole corpus'. Does that mean that users can download the whole corpus? Is it possible to turn it off somehow?<br>
<br>
AFAIK, only users with the "full access privilege" are allowed to download a corpus. So if you want to disable downloads, simply keep to "normal access".<br>
<br>
<br>
Best,<br>
Stefan<br>
<br>
______________________________<wbr>_________________<br>
CWB mailing list<br>
<a href="mailto:CWB@sslmit.unibo.it" target="_blank">CWB@sslmit.unibo.it</a><br>
<a href="http://liste.sslmit.unibo.it/mailman/listinfo/cwb" target="_blank">http://liste.sslmit.unibo.it/<wbr>mailman/listinfo/cwb</a><u></u><u></u></p>
</blockquote>
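<div>
<p class="MsoNormal">The two date-range approaches in the message above can be checked against each other with a short sketch. It assumes Python's re module behaves like CQP's regular expressions for this pattern and uses fullmatch to mirror whole-value matching; the helper names in_range_by_regex and in_range_by_int are hypothetical and not part of CQP or CWB.</p>
<pre>
```python
import re
from datetime import date, timedelta

# Regular expression quoted in the message (12 Jan 1990 .. 24 Mar 1990).
PATTERN = re.compile(r"1990-(01-(1[2-9]|[23]\d)|02-.*|03-([0-1]\d|2[0-4]))")

START, END = date(1990, 1, 12), date(1990, 3, 24)


def in_range_by_regex(d):
    """Approach (a): match the ISO YYYY-MM-DD string against the pattern."""
    return PATTERN.fullmatch(d.isoformat()) is not None


def in_range_by_int(d):
    """Approach (b): drop the hyphens and compare 8-digit integers."""
    n = int(d.strftime("%Y%m%d"))
    return 19900112 <= n <= 19900324


# Compare the regex against a plain date comparison for every day of 1990.
days = [date(1990, 1, 1) + timedelta(days=i) for i in range(365)]
mismatches = [d for d in days if in_range_by_regex(d) != (START <= d <= END)]
```
</pre>
<p class="MsoNormal">An empty mismatches list means the pattern selects exactly the intended days of 1990. The integer version is easier to adapt to other ranges, since only the two bounds change.</p>
</div>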
</div>
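<div>
<p class="MsoNormal">The HTML-escaping issue discussed in the message above can be sketched as follows. This is only an illustration using Python's standard html module; CQPweb's own escaping code is not shown in this thread, and the variable names are assumptions made for the example.</p>
<pre>
```python
import html

# Build an angle-bracket tag string like the one discussed in the thread.
features = ["n", "sg", "px3sp", "nom"]
raw_tag = "".join("<" + f + ">" for f in features)

# Escape it before placing it in an HTML page, so that a browser
# shows the brackets as text instead of treating them as markup.
escaped = html.escape(raw_tag)

# Decoding the entities restores the original annotation string.
restored = html.unescape(escaped)
```
</pre>
<p class="MsoNormal">Without the escaping step a browser treats each bracketed feature as an unknown element and displays nothing, which matches the empty tags after the underscores in the concordance lines.</p>
</div>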
<p class="MsoNormal"> <u></u><u></u></p>
</div>
</div>
</div>
</blockquote>
</div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
</div>
</div>
<br></blockquote></div><br></div>