<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=Windows-1252">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Thank you everyone for your replies!</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Query issue: Corpus na Gàidhlig, CQPweb, simple query (ignore case)</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
I see now that the issue persists when I select "accent insensitive". The query "ora[i,]n" then appears in the query history as ""ora[i,]n" %cd". I don't understand the "%cd" flag. Does it mean ignore case? Either way, the accent-insensitive selection affects
the query, and I only get results with variants of "orain". As a workaround I searched for "<a href="https://dasg.arts.gla.ac.uk/CQPweb/dasg/index.php?ui=search&insertString=%5B%C3%B2ra%2C%C3%B2rai%2Cora%2Corai%2C%C3%B3ra%2C%C3%B3rai%2Camhra%2Camhrai%5Dn&insertType=sq_nocase" id="OWAe0262c33-bf33-f2e9-9120-a3bc8e14f8b4" class="hasToolTip OWAAutoLink" data-tooltip="Insert query string into query window" style="text-align: left;">[òra,òrai,ora,orai,óra,órai]n</a>"
and was able to get accented versions of "oran" and "orain".</div>
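<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
For reference: in CQP query syntax, a trailing %c flag requests case-insensitive matching and %d requests diacritic-insensitive matching, so "%cd" should mean both at once. The Python sketch below only illustrates that kind of folding; it is not CQPweb's implementation. The helper names (fold, simple_alternation_to_regex, matches) are made up for this example, and it handles only literal text plus [a,b] alternations, not the full CEQL syntax.</div>

```python
import re
import unicodedata


def fold(text: str, ignore_case: bool = True, ignore_diacritics: bool = True) -> str:
    """Roughly imitate the effect of the %c / %d flags on one string."""
    if ignore_diacritics:
        # Decompose accented letters (NFD) and drop the combining marks,
        # so that "ò" and "ó" both become plain "o".
        decomposed = unicodedata.normalize("NFD", text)
        text = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    if ignore_case:
        text = text.casefold()
    return text


def simple_alternation_to_regex(query: str) -> str:
    """Turn a bracketed alternation such as ora[i,]n into ora(?:i|)n.

    Assumption: only literal text and [a,b,...] groups are supported.
    """
    parts = re.split(r"(\[[^\]]*\])", query)
    out = []
    for part in parts:
        if part.startswith("[") and part.endswith("]"):
            # An empty alternative (as in "[i,]") means "nothing here".
            alternatives = part[1:-1].split(",")
            out.append("(?:" + "|".join(re.escape(a) for a in alternatives) + ")")
        else:
            out.append(re.escape(part))
    return "".join(out)


def matches(query: str, word: str, flags: str = "") -> bool:
    """Check one word form against a query; flags may contain 'c' and/or 'd'."""
    ignore_case = "c" in flags
    ignore_diacritics = "d" in flags
    pattern = simple_alternation_to_regex(fold(query, ignore_case, ignore_diacritics))
    return re.fullmatch(pattern, fold(word, ignore_case, ignore_diacritics)) is not None
```

<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Under these assumptions, matches("ora[i,]n", "òran", "cd") compares the folded forms and succeeds, while the same call with no flags fails because "ò" is compared literally with "o".</div>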
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
I don't know enough to verify the preprocessing or indexing question. I see in the corpus metadata that there is no word-level annotation and the STTR is not cached for the tokens<span class="_Entity _EType_OWA_HYPHEN _EId_OWA_HYPHEN _EReadonly_1" style="display: inline-block;"><span id="hyphen1" class="hyphen">—</span></span>though
I don't comprehend those things or know if that answers that question.</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Thanks again,</div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Chelsey</div>
<div id="appendonsend"></div>
<hr style="display:inline-block;width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> cwb-bounces@sslmit.unibo.it <cwb-bounces@sslmit.unibo.it> on behalf of cwb-request@sslmit.unibo.it <cwb-request@sslmit.unibo.it><br>
<b>Sent:</b> Thursday, July 18, 2024 12:39 PM<br>
<b>To:</b> cwb@sslmit.unibo.it <cwb@sslmit.unibo.it><br>
<b>Subject:</b> CWB Digest, Vol 207, Issue 7</font>
<div> </div>
</div>
<div class="BodyFragment"><font size="2"><span style="font-size:11pt;">
<div class="PlainText">
<br>
Send CWB mailing list submissions to<br>
cwb@sslmit.unibo.it<br>
<br>
To subscribe or unsubscribe via the World Wide Web, visit<br>
<a href="http://liste.sslmit.unibo.it/mailman/listinfo/cwb">http://liste.sslmit.unibo.it/mailman/listinfo/cwb</a><br>
or, via email, send a message with subject or body 'help' to<br>
cwb-request@sslmit.unibo.it<br>
<br>
You can reach the person managing the list at<br>
cwb-owner@sslmit.unibo.it<br>
<br>
When replying, please edit your Subject line so it is more specific<br>
than "Re: Contents of CWB digest..."<br>
<br>
<br>
Today's Topics:<br>
<br>
1. Re: query efficiency issue (graham.ranger)<br>
2. Re: query efficiency issue (Hardie, Andrew)<br>
<br>
<br>
----------------------------------------------------------------------<br>
<br>
Message: 1<br>
Date: Thu, 18 Jul 2024 17:18:19 +0200<br>
From: "graham.ranger" <graham.ranger@univ-avignon.fr><br>
To: Open source development of the Corpus WorkBench<br>
<cwb@sslmit.unibo.it><br>
Subject: Re: [CWB] query efficiency issue<br>
Message-ID: <20240718151748.5460F200EE@zmtaauth05.partage.renater.fr><br>
Content-Type: text/plain; charset="utf-8"<br>
<br>
Hello all,<br>
Oddly, and for what it's worth, I created an account, ran the same query, and got the intended answers, i.e. oran and orain.<br>
Best,<br>
Graham.<br>
<br>
Sent from my Galaxy device<br>
<br>
-------- Original message --------<br>
From: Stephanie Evert <stefanML@collocations.de><br>
Date: 17/07/2024 13:44 (GMT+01:00)<br>
To: CWBdev Mailing List <cwb@sslmit.unibo.it><br>
Subject: Re: [CWB] query efficiency issue<br>
<br>
> I'm having difficulties with a query in Corpus na Gàidhlig. When I search "ora[i,]n" it only retrieves "oran" instead of also retrieving "orain". Does anyone have any advice on this? Is this a bug?<br>
<br>
I suspect we'll only be able to help you if you tell us which Web interface you used to run the query. I suppose it is some CQPweb installation?<br>
<br>
Your query<br>
<br>
ora[i,]n<br>
<br>
should work as a simple query (CEQL syntax) and find both words. If it doesn't, there might be something wrong with corpus preprocessing or indexing, or the form simply doesn't exist in the corpus. Do you know it's actually there?<br>
<br>
You could also try different variants of the query or search for both forms separately.<br>
<br>
[oran,orain]<br>
oran<br>
orain<br>
<br>
Best,<br>
Stephanie<br>
_______________________________________________<br>
CWB mailing list<br>
CWB@sslmit.unibo.it<br>
http://liste.sslmit.unibo.it/mailman/listinfo/cwb<br>
-------------- next part --------------<br>
An HTML attachment was scrubbed...<br>
URL: <<a href="http://liste.sslmit.unibo.it/pipermail/cwb/attachments/20240718/2dc602ad/attachment-0001.html">http://liste.sslmit.unibo.it/pipermail/cwb/attachments/20240718/2dc602ad/attachment-0001.html</a>><br>
<br>
------------------------------<br>
<br>
Message: 2<br>
Date: Thu, 18 Jul 2024 15:37:23 +0000<br>
From: "Hardie, Andrew" <a.hardie@lancaster.ac.uk><br>
To: Open source development of the Corpus WorkBench<br>
<cwb@sslmit.unibo.it><br>
Subject: Re: [CWB] query efficiency issue<br>
Message-ID:<br>
<LO4P265MB34858A504F7884D46F9DB1D0CBAC2@LO4P265MB3485.GBRP265.PROD.OUTLOOK.COM><br>
<br>
Content-Type: text/plain; charset="utf-8"<br>
<br>
For the record it's this server: <a href="https://dasg.arts.gla.ac.uk/CQPweb/">https://dasg.arts.gla.ac.uk/CQPweb/</a><br>
<br>
Andrew<br>
<br>
From: cwb-bounces@sslmit.unibo.it <cwb-bounces@sslmit.unibo.it> On Behalf Of graham.ranger<br>
Sent: Thursday, July 18, 2024 4:18 PM<br>
To: Open source development of the Corpus WorkBench <cwb@sslmit.unibo.it><br>
Subject: [External] Re: [CWB] query efficiency issue<br>
<br>
Hello all,<br>
Oddly, and for what it's worth, I created an account, ran the same query, and got the intended answers, i.e. oran and orain.<br>
Best,<br>
Graham.<br>
<br>
<br>
Sent from my Galaxy device<br>
<br>
<br>
-------- Original message --------<br>
From: Stephanie Evert <stefanML@collocations.de<mailto:stefanML@collocations.de>><br>
Date: 17/07/2024 13:44 (GMT+01:00)<br>
To: CWBdev Mailing List <cwb@sslmit.unibo.it<mailto:cwb@sslmit.unibo.it>><br>
Subject: Re: [CWB] query efficiency issue<br>
<br>
> I'm having difficulties with a query in Corpus na Gàidhlig. When I search "ora[i,]n" it only retrieves "oran" instead of also retrieving "orain". Does anyone have any advice on this? Is this a bug?<br>
<br>
I suspect we'll only be able to help you if you tell us which Web interface you used to run the query. I suppose it is some CQPweb installation?<br>
<br>
Your query<br>
<br>
ora[i,]n<br>
<br>
should work as a simple query (CEQL syntax) and find both words. If it doesn't, there might be something wrong with corpus preprocessing or indexing, or the form simply doesn't exist in the corpus. Do you know it's actually there?<br>
<br>
You could also try different variants of the query or search for both forms separately.<br>
<br>
[oran,orain]<br>
oran<br>
orain<br>
<br>
Best,<br>
Stephanie<br>
_______________________________________________<br>
CWB mailing list<br>
CWB@sslmit.unibo.it<<a href="mailto:CWB@sslmit.unibo.it">mailto:CWB@sslmit.unibo.it</a>><br>
<a href="http://liste.sslmit.unibo.it/mailman/listinfo/cwb">http://liste.sslmit.unibo.it/mailman/listinfo/cwb</a><br>
-------------- next part --------------<br>
An HTML attachment was scrubbed...<br>
URL: <<a href="http://liste.sslmit.unibo.it/pipermail/cwb/attachments/20240718/2ad0f5ff/attachment.html">http://liste.sslmit.unibo.it/pipermail/cwb/attachments/20240718/2ad0f5ff/attachment.html</a>><br>
<br>
------------------------------<br>
<br>
<br>
<br>
End of CWB Digest, Vol 207, Issue 7<br>
***********************************<br>
</div>
</span></font></div>
</body>
</html>