<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<!--[if !mso]><style>v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style><![endif]--><style><!--
/* Font Definitions */
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:Verdana;
        panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
        {font-family:Aptos;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0cm;
        font-size:12.0pt;
        font-family:"Aptos",sans-serif;}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
span.hyphen
        {mso-style-name:hyphen;}
span.EmailStyle22
        {mso-style-type:personal-reply;
        font-family:"Verdana",sans-serif;
        color:#156082;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;
        mso-ligatures:none;}
@page WordSection1
        {size:612.0pt 792.0pt;
        margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
        {page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-GB" link="blue" vlink="purple" style="word-wrap:break-word">
<div class="WordSection1">
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#156082;mso-fareast-language:EN-US">Hi Chelsey,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#156082;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#156082;mso-fareast-language:EN-US">Apologies, I overlooked this one because it wasn’t under the same subject line (the perils of replying to a list digest, I’m afraid).<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#156082;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#156082;mso-fareast-language:EN-US">If you are getting <b>"ora[i,]n" %cd</b> in Query history, that looks very much like a simple query parser misconfiguration. You
will need to contact the server administrator at Glasgow (I’m unaware who that is) to resolve the issue.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#156082;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#156082;mso-fareast-language:EN-US">For the record, if you use the following in simple query / ignore case<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#156082;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#156082;mso-fareast-language:EN-US"> ora[i,]n<o:p></o:p></span></b></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#156082;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#156082;mso-fareast-language:EN-US">then that is equivalent to the following in CQP syntax<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#156082;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#156082;mso-fareast-language:EN-US">
<b>"orai?n"%cd<o:p></o:p></b></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#156082;mso-fareast-language:EN-US">
<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#156082;mso-fareast-language:EN-US">Therefore, if simple query is not working, you can switch to CQP syntax and use the above query string to get both accented and unaccented
forms, with or without the “i”.<o:p></o:p></span></p>
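<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#156082;mso-fareast-language:EN-US">To illustrate what that CQP query string is expected to match, here is a rough Python sketch. It is not how CQP works internally: the <b>fold</b> helper is a hypothetical stand-in that approximates the <b>%cd</b> flags by lower-casing and stripping combining accents, and the token list is made up for the example.<o:p></o:p></span></p>

```python
import re
import unicodedata


def fold(token: str) -> str:
    """Approximate %cd: strip combining diacritics, then ignore case.

    Assumption: this only mimics CQP's folding for simple accented Latin
    letters; it is not CQP's actual implementation.
    """
    decomposed = unicodedata.normalize("NFD", token)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


# The regular expression inside the CQP query "orai?n":
# "ora", then an optional "i", then "n".
PATTERN = re.compile(r"orai?n")


def matches(token: str) -> bool:
    # Like a CQP token regex, the pattern must cover the whole token.
    return PATTERN.fullmatch(fold(token)) is not None


# Made-up sample tokens, not taken from the corpus.
tokens = ["oran", "orain", "òran", "Òrain", "óran", "orann"]
hits = [t for t in tokens if matches(t)]
print(hits)
```

<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#156082;mso-fareast-language:EN-US">In this sketch every form of "oran" and "orain" is kept, accented or not, while "orann" is rejected because the pattern has to match the whole token.<o:p></o:p></span></p>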
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#156082;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#156082;mso-fareast-language:EN-US">FYI: as you suspected, the
<b>%cd</b> means case-insensitive, diacritic-insensitive – but <i>only</i> in CQP syntax, not Simple Query. That is, it appears as a consequence of running the query in case-insensitive mode (in Simple Query, diacritic sensitivity defaults to “off” without you doing anything).
The presence of <b>%cd</b> alongside the Simple Query string <b>"ora[i,]n"</b> in your Query history is what suggests that the simple query is not being parsed correctly by the system.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#156082;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#156082;mso-fareast-language:EN-US">Best<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#156082;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#156082;mso-fareast-language:EN-US">Andrew.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#156082;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#156082;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#156082;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm;font-size:pt">
<p class="MsoNormal"><b><span style="font-family:"Calibri",sans-serif">From:</span></b><span style="font-family:"Calibri",sans-serif"> cwb-bounces@sslmit.unibo.it <cwb-bounces@sslmit.unibo.it>
<b>On Behalf Of </b>Chelsey MacPherson<br>
<b>Sent:</b> 19 July 2024 14:22<br>
<b>To:</b> cwb@sslmit.unibo.it<br>
<b>Subject:</b> [Re: [CWB] CWB Digest, Vol 207, Issue 7<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<div>
<p class="MsoNormal"><span style="color:black">Thank you everyone for your replies!<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span style="color:black"><o:p> </o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span style="color:black">Query issue: Corpus na Gàidhlig, CQPweb, simple query (ignore case)<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span style="color:black"><o:p> </o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span style="color:black">I see now that the issue persists when I select "accent insensitive". The query "ora[i,]n" then displays like this in the query history: ""ora[i,]n" %cd". I don't understand the "%cd" symbol. Does it mean ignore case? Either
way, the accent insensitive selection affects the query and I only get results with variants of "orain". As a workaround I did this "<a href="https://dasg.arts.gla.ac.uk/CQPweb/dasg/index.php?ui=search&insertString=%5B%C3%B2ra%2C%C3%B2rai%2Cora%2Corai%2C%C3%B3ra%2C%C3%B3rai%2Camhra%2Camhrai%5Dn&insertType=sq_nocase">[òra,òrai,ora,orai,óra,órai]n</a>"
and was able to get accented versions of "oran" and "orain".<o:p></o:p></span></p>
</div>
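<div>
<p class="MsoNormal"><span style="color:black">For reference, the workaround above relies on the bracketed alternatives of simple query syntax. A minimal Python sketch of how such a pattern spells out into individual word forms might look like the following. The <b>expand</b> function is hypothetical, handles only flat, comma-separated bracket groups (no nesting, wildcards or escapes), and is not CQPweb's actual parser.<o:p></o:p></span></p>
</div>

```python
import itertools
import re


def expand(query: str) -> list[str]:
    """Expand flat [a,b,c] alternatives in a simple-query-style pattern.

    Assumption: brackets are not nested and contain only literal,
    comma-separated alternatives; an empty alternative means "nothing".
    """
    # re.split with one capture group alternates literal text (even
    # positions) with the contents of each bracket group (odd positions).
    parts = re.split(r"\[([^\[\]]*)\]", query)
    choices = [p.split(",") if i % 2 else [p] for i, p in enumerate(parts)]
    return ["".join(combo) for combo in itertools.product(*choices)]


print(expand("ora[i,]n"))
print(expand("[òra,òrai,ora,orai,óra,órai]n"))
```

<div>
<p class="MsoNormal"><span style="color:black">Under these assumptions the first pattern spells out to "orain" and "oran", and the second to the six accented and unaccented forms listed explicitly in the workaround.<o:p></o:p></span></p>
</div>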
<div>
<p class="MsoNormal"><span style="color:black"><o:p> </o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span style="color:black">I don't know enough to verify the preprocessing or indexing question. I see in the corpus metadata that there is no word-level annotation and the STTR is not cached for the tokens<span class="hyphen">—</span>though
I don't comprehend those things or know if that answers that question.<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span style="color:black"><o:p> </o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span style="color:black">Thanks again,<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span style="color:black">Chelsey<o:p></o:p></span></p>
</div>
<div class="MsoNormal" align="center" style="text-align:center">
<hr size="2" width="98%" align="center">
</div>
<div id="divRplyFwdMsg">
<p class="MsoNormal"><b><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:black">From:</span></b><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:black">
<a href="mailto:cwb-bounces@sslmit.unibo.it">cwb-bounces@sslmit.unibo.it</a> <<a href="mailto:cwb-bounces@sslmit.unibo.it">cwb-bounces@sslmit.unibo.it</a>> on behalf of
<a href="mailto:cwb-request@sslmit.unibo.it">cwb-request@sslmit.unibo.it</a> <<a href="mailto:cwb-request@sslmit.unibo.it">cwb-request@sslmit.unibo.it</a>><br>
<b>Sent:</b> Thursday, July 18, 2024 12:39 PM<br>
<b>To:</b> <a href="mailto:cwb@sslmit.unibo.it">cwb@sslmit.unibo.it</a> <<a href="mailto:cwb@sslmit.unibo.it">cwb@sslmit.unibo.it</a>><br>
<b>Subject:</b> CWB Digest, Vol 207, Issue 7</span> <o:p></o:p></p>
<div>
<p class="MsoNormal"> <o:p></o:p></p>
</div>
</div>
<div>
<div>
<p class="MsoNormal"><span style="font-size:11.0pt">
Send CWB mailing list submissions to<br>
<a href="mailto:cwb@sslmit.unibo.it">cwb@sslmit.unibo.it</a><br>
<br>
To subscribe or unsubscribe via the World Wide Web, visit<br>
<a href="http://liste.sslmit.unibo.it/mailman/listinfo/cwb">http://liste.sslmit.unibo.it/mailman/listinfo/cwb</a><br>
or, via email, send a message with subject or body 'help' to<br>
<a href="mailto:cwb-request@sslmit.unibo.it">cwb-request@sslmit.unibo.it</a><br>
<br>
You can reach the person managing the list at<br>
<a href="mailto:cwb-owner@sslmit.unibo.it">cwb-owner@sslmit.unibo.it</a><br>
<br>
When replying, please edit your Subject line so it is more specific<br>
than "Re: Contents of CWB digest..."<br>
<br>
<br>
Today's Topics:<br>
<br>
1. Re: query efficiency issue (graham.ranger)<br>
2. Re: query efficiency issue (Hardie, Andrew)<br>
<br>
<br>
----------------------------------------------------------------------<br>
<br>
Message: 1<br>
Date: Thu, 18 Jul 2024 17:18:19 +0200<br>
From: "graham.ranger" <<a href="mailto:graham.ranger@univ-avignon.fr">graham.ranger@univ-avignon.fr</a>><br>
To: Open source development of the Corpus WorkBench<br>
<<a href="mailto:cwb@sslmit.unibo.it">cwb@sslmit.unibo.it</a>><br>
Subject: Re: [CWB] query efficiency issue<br>
Message-ID: <<a href="mailto:20240718151748.5460F200EE@zmtaauth05.partage.renater.fr">20240718151748.5460F200EE@zmtaauth05.partage.renater.fr</a>><br>
Content-Type: text/plain; charset="utf-8"<br>
<br>
Hello all,<br>
Oddly, and for what it's worth, I created an account, ran the same query, and got the intended answers, i.e. oran and orain.<br>
Best,<br>
Graham.<br>
<br>
Sent from my Galaxy device<br>
<br>
-------- Original message --------<br>
From: Stephanie Evert <<a href="mailto:stefanML@collocations.de">stefanML@collocations.de</a>><br>
Date: 17/07/2024 13:44 (GMT+01:00)<br>
To: CWBdev Mailing List <<a href="mailto:cwb@sslmit.unibo.it">cwb@sslmit.unibo.it</a>><br>
Subject: Re: [CWB] query efficiency issue<br>
<br>
> I'm having difficulties with a query in Corpus na Gàidhlig. When I search "ora[i,]n" it only retrieves "oran" instead of also retrieving "orain". Does anyone have any advice on this? Is this a bug?<br>
<br>
I suspect we'll only be able to help you if you tell us which Web interface you used to run the query. I suppose it is some CQPweb installation?<br>
<br>
Your query<br>
<br>
ora[i,]n<br>
<br>
should work as a simple query (CEQL syntax) and find both words. If it doesn't, there might be something wrong with corpus preprocessing or indexing – or the form simply doesn't exist in the corpus. Do you know it's actually there?<br>
<br>
You could also try different variants of the query or search for both forms separately.<br>
<br>
[oran,orain]<br>
oran<br>
orain<br>
<br>
Best,<br>
Stephanie<br>
_______________________________________________<br>
CWB mailing list<br>
CWB@sslmit.unibo.it<br>
http://liste.sslmit.unibo.it/mailman/listinfo/cwb<br>
-------------- next part --------------<br>
An HTML attachment was scrubbed...<br>
URL: <<a href="http://liste.sslmit.unibo.it/pipermail/cwb/attachments/20240718/2dc602ad/attachment-0001.html">http://liste.sslmit.unibo.it/pipermail/cwb/attachments/20240718/2dc602ad/attachment-0001.html</a>><br>
<br>
------------------------------<br>
<br>
Message: 2<br>
Date: Thu, 18 Jul 2024 15:37:23 +0000<br>
From: "Hardie, Andrew" <<a href="mailto:a.hardie@lancaster.ac.uk">a.hardie@lancaster.ac.uk</a>><br>
To: Open source development of the Corpus WorkBench<br>
<<a href="mailto:cwb@sslmit.unibo.it">cwb@sslmit.unibo.it</a>><br>
Subject: Re: [CWB] query efficiency issue<br>
Message-ID:<br>
<<a href="mailto:LO4P265MB34858A504F7884D46F9DB1D0CBAC2@LO4P265MB3485.GBRP265.PROD.OUTLOOK.COM">LO4P265MB34858A504F7884D46F9DB1D0CBAC2@LO4P265MB3485.GBRP265.PROD.OUTLOOK.COM</a>><br>
<br>
Content-Type: text/plain; charset="utf-8"<br>
<br>
For the record it’s this server: <a href="https://dasg.arts.gla.ac.uk/CQPweb/">https://dasg.arts.gla.ac.uk/CQPweb/</a><br>
<br>
Andrew<br>
<br>
From: <a href="mailto:cwb-bounces@sslmit.unibo.it">cwb-bounces@sslmit.unibo.it</a> <<a href="mailto:cwb-bounces@sslmit.unibo.it">cwb-bounces@sslmit.unibo.it</a>> On Behalf Of graham.ranger<br>
Sent: Thursday, July 18, 2024 4:18 PM<br>
To: Open source development of the Corpus WorkBench <<a href="mailto:cwb@sslmit.unibo.it">cwb@sslmit.unibo.it</a>><br>
Subject: [External] Re: [CWB] query efficiency issue<br>
<br>
Hello all,<br>
Oddly, and for what it's worth, I created an account, ran the same query, and got the intended answers, i.e. oran and orain.<br>
Best,<br>
Graham.<br>
<br>
<br>
Sent from my Galaxy device<br>
<br>
<br>
-------- Original message --------<br>
From: Stephanie Evert <<a href="mailto:stefanML@collocations.de">stefanML@collocations.de</a>><br>
Date: 17/07/2024 13:44 (GMT+01:00)<br>
To: CWBdev Mailing List <<a href="mailto:cwb@sslmit.unibo.it">cwb@sslmit.unibo.it</a>><br>
Subject: Re: [CWB] query efficiency issue<br>
<br>
> I'm having difficulties with a query in Corpus na Gàidhlig. When I search "ora[i,]n" it only retrieves "oran" instead of also retrieving "orain". Does anyone have any advice on this? Is this a bug?<br>
<br>
I suspect we'll only be able to help you if you tell us which Web interface you used to run the query. I suppose it is some CQPweb installation?<br>
<br>
Your query<br>
<br>
ora[i,]n<br>
<br>
should work as a simple query (CEQL syntax) and find both words. If it doesn't, there might be something wrong with corpus preprocessing or indexing – or the form simply doesn't exist in the corpus. Do you know it's actually there?<br>
<br>
You could also try different variants of the query or search for both forms separately.<br>
<br>
[oran,orain]<br>
oran<br>
orain<br>
<br>
Best,<br>
Stephanie<br>
_______________________________________________<br>
CWB mailing list<br>
<a href="mailto:CWB@sslmit.unibo.it">CWB@sslmit.unibo.it</a><br>
<a href="http://liste.sslmit.unibo.it/mailman/listinfo/cwb">http://liste.sslmit.unibo.it/mailman/listinfo/cwb</a><br>
-------------- next part --------------<br>
An HTML attachment was scrubbed...<br>
URL: <<a href="http://liste.sslmit.unibo.it/pipermail/cwb/attachments/20240718/2ad0f5ff/attachment.html">http://liste.sslmit.unibo.it/pipermail/cwb/attachments/20240718/2ad0f5ff/attachment.html</a>><br>
<br>
------------------------------<br>
<br>
<br>
<br>
End of CWB Digest, Vol 207, Issue 7<br>
***********************************<o:p></o:p></span></p>
</div>
</div>
</div>
</div>
</body>
</html>