<html>
<head>
<meta content="text/html; charset=KOI8-R" http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
Hi Igor,<br>
<br>
One way to do this in CWB would be to split each of your<br>
set-valued attributes (lemma, pos, agr, sem) into several positional attributes.<br>
For example, like this:<br>
form lemma1 lemma2 pos1 pos2 agr_set1 agr_set2
sem_set1 sem_set2<br>
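<br>
To make the split concrete, here is a rough sketch of how one input line could be converted (not part of CWB).<br>
It assumes tab-separated input where every column after the form is a feature set in |a|b| notation,<br>
that each alternative is an atomic string, and that missing alternatives are padded with "_".<br>
The helper name, the padding value and the fixed number of slots are my assumptions:<br>

```python
def split_feature_sets(line, n_slots=2, pad="_"):
    """Turn 'form<TAB>|l1|l2|<TAB>|p1|p2|...' into 'form<TAB>l1<TAB>l2<TAB>p1<TAB>p2...'.

    Hypothetical helper: n_slots and pad are illustrative choices, not CWB settings.
    """
    form, *feature_sets = line.rstrip("\n").split("\t")
    columns = [form]
    for feature_set in feature_sets:
        # '|a|b|' -> ['a', 'b']; an empty set '|' -> []
        values = [v for v in feature_set.split("|") if v]
        if len(values) > n_slots:
            raise ValueError(f"more than {n_slots} alternatives in {feature_set!r}")
        # Pad so that every token has the same number of columns.
        columns.extend(values + [pad] * (n_slots - len(values)))
    return "\t".join(columns)


print(split_feature_sets("form\t|lemma1|lemma2|\t|pos1|pos2|"))
```

The resulting columns follow the order lemma1 lemma2 pos1 pos2 ..., so the n-th lemma stays aligned with the n-th pos.<br>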
<br>
Then write your queries so that every constraint refers to<br>
attributes with the same index (lemma1 with pos1, lemma2 with pos2).<br>
Your example query would become:<br>
<span style="white-space: pre;">[lemma1=".*valuelemma.*" &amp; pos1=".*valuepos.*"]</span><br>
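<br>
Since lemma1/pos1 and lemma2/pos2 are alternative readings, a complete query probably has to test each index in turn.<br>
A minimal sketch that builds such a query string (the helper name and the n_slots parameter are hypothetical;<br>
only the attribute naming lemma1, pos1, ... comes from the layout above, and values are inserted without any escaping):<br>

```python
def aligned_query(constraints, n_slots=2):
    """Build one CQP token expression in which all constraints share the same index.

    constraints maps a base attribute name (e.g. "lemma") to a regular expression.
    Hypothetical helper: it only formats a string and does not talk to CQP.
    """
    alternatives = []
    for index in range(1, n_slots + 1):
        # All constraints of one alternative use the same index suffix.
        conjunction = " & ".join(
            f'{attribute}{index}="{pattern}"'
            for attribute, pattern in constraints.items()
        )
        alternatives.append(f"({conjunction})")
    return "[" + " | ".join(alternatives) + "]"


print(aligned_query({"lemma": ".*valuelemma.*", "pos": ".*valuepos.*"}))
```

With two slots this yields one disjunct per index, so a token matches only if the lemma and the pos constraint hold for the same reading.<br>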
<br>
What do you think?<br>
<br>
Best,<br>
Serge<br>
<br>
<br>
On 10/07/2012 15:20, Igor Shalyminov wrote:<br>
<span style="white-space: pre;">> Hello!<br>
> <br>
> My name is Igor, I'm a developer of Russian National Corpus
search<br>
> engine, and I'm trying to get it working with CWB. The main
problem I<br>
> have is the following: RNC texts are annotated ambiguously
for the<br>
> most part, and each word has got sets of lemmas, grammar and
semantic<br>
> features, just as the GERMAN-LAW example in the tutorial.
Suppose we<br>
> have a word:<br>
> <br>
> word   lemma             pos           agr                   sem<br>
> --------------------------------------------------------------------------------<br>
> form   |lemma1|lemma2|   |pos1|pos2|   |agr_set1|agr_set2|   |sem_set1|sem_set2|</span><br>
<span style="white-space: pre;">> <br>
> And, if I type the query:<br>
> <br>
> [(lemma contains "lemma1") and (pos contains "pos2")]<br>
> <br>
> I will get that very word matched, and this will be a mistake
in my<br>
> case since there is only one strict correspondence: "lemma1
-> pos1<br>
> -> agr_set1 -> sem_set1", and the same for lemma2.<br>
> <br>
> So, my question, is there an out of the box possibility of
performing<br>
> such queries (i.e., controlling positions of corresponding
sets while<br>
> matching attribute sets with 'contains'), or it has to be<br>
> implemented?<br>
> <br>
> -- Best Regards, Igor Shalyminov <br>
> _______________________________________________ CWB mailing
list <br>
> <a class="moz-txt-link-abbreviated" href="mailto:CWB@sslmit.unibo.it">CWB@sslmit.unibo.it</a> <br>
> <a class="moz-txt-link-freetext" href="http://devel.sslmit.unibo.it/mailman/listinfo/cwb">http://devel.sslmit.unibo.it/mailman/listinfo/cwb</a></span><br>
<br>
-- <br>
Dr. Serge Heiden, <a class="moz-txt-link-abbreviated" href="mailto:slh@ens-lyon.fr">slh@ens-lyon.fr</a>, <a class="moz-txt-link-freetext" href="http://textometrie.ens-lyon.fr">http://textometrie.ens-lyon.fr</a><br>
ENS de Lyon/CNRS - ICAR UMR5191, Institut de Linguistique Française<br>
15, parvis René Descartes 69342 Lyon BP7000 Cedex, tél.
+33(0)622003883<br>
<br>
</body>
</html>