[CWB] A question on CQP attribute sets
Игорь Шалыминов
ishalyminov at yandex-team.ru
Tue Jul 10 15:20:32 CEST 2012
Hello!
My name is Igor. I'm a developer of the Russian National Corpus (RNC) search engine, and I'm trying to get it working with CWB.
The main problem I have is the following: RNC texts are, for the most part, annotated ambiguously, so each word carries sets of lemmas, grammatical features, and semantic features, just as in the GERMAN-LAW example in the tutorial. Suppose we have a word:
word  lemma            pos          agr                  sem
----------------------------------------------------------------------------
form  |lemma1|lemma2|  |pos1|pos2|  |agr_set1|agr_set2|  |sem_set1|sem_set2|
If I then type the query:
[(lemma contains "lemma1") and (pos contains "pos2")]
I get that very word as a match, which is wrong in my case, because the alternatives correspond strictly by position: "lemma1 -> pos1 -> agr_set1 -> sem_set1", and likewise for lemma2.
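
To make the mismatch concrete, here is a minimal sketch in plain Python (not CWB or CQP code) of the two behaviours: an independent per-attribute test, which is how the query above behaves, and the position-aware test I am after. The function names and the dictionary layout are made up for illustration, and the aligned version assumes that every attribute lists its readings in the same order.

```python
# Illustration only: plain Python, not CWB/CQP code.
# Each attribute value is a feature set written as "|item1|item2|".

def parse_set(value):
    """Split a feature-set string such as '|a|b|' into ['a', 'b']."""
    return [item for item in value.split("|") if item]

def contains_match(token, lemma, pos):
    """Independent test per attribute, like
    [(lemma contains X) and (pos contains Y)]."""
    return lemma in parse_set(token["lemma"]) and pos in parse_set(token["pos"])

def aligned_match(token, lemma, pos):
    """Position-aware test: the i-th lemma must pair with the i-th pos.
    Assumes both sets list the readings in the same order."""
    pairs = list(zip(parse_set(token["lemma"]), parse_set(token["pos"])))
    return (lemma, pos) in pairs

# Hypothetical token mirroring the example above.
token = {"word": "form", "lemma": "|lemma1|lemma2|", "pos": "|pos1|pos2|"}

print(contains_match(token, "lemma1", "pos2"))  # True: the unwanted match
print(aligned_match(token, "lemma1", "pos2"))   # False: lemma1 pairs only with pos1
print(aligned_match(token, "lemma1", "pos1"))   # True
```

The same pairing would extend to agr and sem by zipping all four parsed sets together.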
So my question is: is there an out-of-the-box way to perform such queries (i.e., to control the positions of corresponding elements when matching attribute sets with 'contains'), or does it have to be implemented?
--
Best Regards,
Igor Shalyminov