[CWB] launching searches in specific segments

Hardie, Andrew a.hardie at lancaster.ac.uk
Tue Jun 14 13:43:48 CEST 2016


Hi Giorgina,

That query should have worked. One possibility is that you did not declare the XML elements as S-attributes correctly when indexing, so the XML tags have been inserted into your index as literal tokens instead of S-attribute ranges.

You can test this by querying

[word="<.*"]

and seeing if you get any results. If you do, the XML tags have been indexed as tokens in the main p-attribute. Delete the corpus, and start over!
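
A minimal sketch of that check in a command-line CQP session, assuming the corpus handle is MYCORPUS (a placeholder, not a name from this thread). "show cd;" lists the attributes CQP knows about, so seg_lang should appear among the s-attributes if the XML was declared properly at indexing time:

```cqp
# activate the corpus (MYCORPUS is a hypothetical handle)
MYCORPUS;
# list the p-attributes and s-attributes of the active corpus
show cd;
# look for tokens that begin with "<", i.e. tags indexed as words
[word="<.*"];
# number of hits for the query just run
size Last;
```

A non-zero size here, together with seg_lang missing from the s-attribute list, would point to the tags having been indexed as ordinary tokens.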

Also: when you have it working, you might consider using a global constraint in your query instead of incorporating the S-attribute borders into the main body of the query. http://cwb.sourceforge.net/files/CQP_Tutorial/node25.html

i.e.

          a:[] :: a.seg_lang = "fr"
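
For instance, a search for particular words restricted to French segments might look like the sketch below. The words are taken from the sample segment in the original message, and the queries assume seg_lang has been indexed as an S-attribute. The constraint only checks the token carrying the label a, so the label should sit on the token whose segment decides the language:

```cqp
# every occurrence of "séance" inside a segment with lang="fr"
a:[word="séance"] :: a.seg_lang = "fr";

# a two-token sequence; the label is placed on the first token
a:[word="La"] [word="séance"] :: a.seg_lang = "fr";
```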

Also also: in CQPweb, you can change the datatype of seg_lang to “Classification”, and then use the Restricted Query interface to pick the language, whilst just doing queries as normal.

best

Andrew.

From: cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it] On Behalf Of Giorgina Cerutti Benitez
Sent: 14 June 2016 11:54
To: Open source development of the Corpus WorkBench
Subject: [CWB] launching searches in specific segments

Hello,

We are trying to index a bilingual corpus in CQPweb, and to look for certain words only in a specific language (e.g. only in French). Given the segment structure shown below, we have tried the following search in CQP syntax:

<seg_lang="fr">[]*</seg_lang>


Structure of the segment:


<s id="1">
<seg lang="fr">
La
séance
est
ouverte
à
10h05
.
</seg>

Nonetheless, I wonder whether it is actually possible, using CQP syntax and CQPweb, to specify the segment or segments in which we would like to search, as I have not found information about this in the Administrator’s Manual.

Thank you in advance for all your comments and help.

Best,

Giorgina