[CWB] launching searches in specific segments
Hardie, Andrew
a.hardie at lancaster.ac.uk
Tue Jun 14 13:43:48 CEST 2016
Hi Giorgina,
That query should have worked. One possibility is that you did not declare the XML / S-attributes correctly when indexing, and the XML tags have been inserted into your index as literal tokens instead of S-attribute ranges.
You can test this by querying
[word="<.*"]
and seeing if you get any results. If you do, the XML tags have been indexed as tokens in the main p-attribute. In that case, delete the corpus and start over, declaring the XML elements as S-attributes this time.
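Before re-indexing, it can help to list which XML elements and attributes your input file actually contains, so that none are missed in the declaration. Here is a minimal Python sketch, assuming one token or tag per line as in your example. The function name s_attribute_declarations is made up for illustration, and the "-S name:0+attr" strings follow the form I would expect cwb-encode to take (nesting depth 0 assumed); in CQPweb you would enter the same information in the indexing form instead.

```python
import re

# An opening tag alone on a line, e.g. <seg lang="fr">; closing tags do not match.
OPEN_TAG = re.compile(r'^<([A-Za-z_][\w-]*)((?:\s+[\w-]+="[^"]*")*)\s*>$')
ATTR = re.compile(r'([\w-]+)="[^"]*"')


def s_attribute_declarations(lines):
    """Return one declaration string per XML element found in the input lines.

    Hypothetical helper: it only reports what is in the file, it does not
    run any CWB tool.
    """
    found = {}  # element name -> list of attribute names, in order of first appearance
    for line in lines:
        match = OPEN_TAG.match(line.strip())
        if not match:
            continue  # ordinary token or closing tag
        name, attr_text = match.group(1), match.group(2)
        attrs = found.setdefault(name, [])
        for attr in ATTR.findall(attr_text):
            if attr not in attrs:
                attrs.append(attr)
    # Assumed declaration form: -S element:0+attr1+attr2
    return ["-S " + name + ":0" + "".join("+" + a for a in attrs)
            for name, attrs in found.items()]


sample = ['<s id="1">', '<seg lang="fr">', 'La', 'séance', '.', '</seg>']
for declaration in s_attribute_declarations(sample):
    print(declaration)
```

With the elements declared this way, the lang attribute of seg becomes the S-attribute seg_lang, which is the name your query uses.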
Also: once you have it working, you might consider using a global constraint in your query instead of incorporating the S-attribute boundaries into the main body of the query. See http://cwb.sourceforge.net/files/CQP_Tutorial/node25.html
i.e.
a:[] :: a.seg_lang = "fr"
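If you generate queries from a script, the same pattern can be put together mechanically. A small Python sketch, assuming the S-attribute is called seg_lang as above; restrict_to_language is a made-up helper name, not part of CWB or CQPweb.

```python
def restrict_to_language(token_pattern, lang, label="a", attribute="seg_lang"):
    """Build a CQP query that labels the first token of token_pattern and
    adds a global constraint on the S-attribute value at that token."""
    if '"' in lang:
        raise ValueError("language code must not contain a double quote")
    return f'{label}:{token_pattern} :: {label}.{attribute} = "{lang}";'


# Search for one word, in French segments only.
print(restrict_to_language('[word="séance"]', "fr"))
```

Note that the label sits on the first token only, so for a multi-token pattern the constraint checks the segment in which the match begins.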
Also also: in CQPweb, you can change the datatype of seg_lang to “Classification”, and then use the Restricted Query interface to pick the language, whilst just doing queries as normal.
best
Andrew.
From: cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it] On Behalf Of Giorgina Cerutti Benitez
Sent: 14 June 2016 11:54
To: Open source development of the Corpus WorkBench
Subject: [CWB] launching searches in specific segments
Hello,
We are trying to index a bilingual corpus in CQPweb, and to search for certain words in one language only (e.g. only in French). Given the segment structure shown below, we tried the following query in CQP syntax:
<seg_lang="fr">[]*</seg_lang>
Structure of the segment:
<s id="1">
<seg lang="fr">
La
séance
est
ouverte
à
10h05
.
</seg>
Nonetheless, I wonder whether it is actually possible to restrict a search to a particular segment or segments using CQP syntax and CQPweb, as I have not found any information about this in the Administrator’s Manual.
Thank you in advance for all your comments and help.
Best,
Giorgina