[CWB] Does CQPweb support dynamic attributes now?

(Ray) WU Liangping liangpingwu at 126.com
Tue Oct 6 11:05:20 CEST 2015


>> Well, to illustrate "not really executed", take the query [pos="N.*" & f(word)>100] on the Brown corpus, which returns 90,211 matches. Raising the threshold from 100 to 1000 in the same query returns 7,205 matches (both tested on BFSU CQPweb). This shows that the f() function does work. However, a subsequent "Frequency breakdown" reveals that even words with a single occurrence are in the final result set, which suggests that the f() function is not fully respected (at least in some operations within CQPweb).

>
>Those are probably word forms that occur more than 100 times in the corpus, but aren't always tagged as nouns.  When I try your query on the brown family, I find words like
>
>	perfect
>	unemployed
>

>at the end of the frequency ranking, which are infrequently used as nouns.


Hi Stefan,


Although I do not find "perfect" and "unemployed" in my result set (I used only Brown, not the whole Brown family), I do see the phenomenon you describe. For instance, the last word in the "Frequency breakdown" list is "over", which was incorrectly tagged as NN1 (once only), while its frequency in the whole corpus is 1,234 (ignoring case). That would explain its appearance in the result set.


But this raises another question. Given the "&" operator in the query [pos="N.*" & f(word)>100], I expected that both conditions have to be satisfied together. That is, the result should contain only words that are BOTH nouns AND have a frequency greater than 100 AS NOUNS. Instead, the query seems to check whether the token is a noun and then check the frequency of its word form in the whole corpus (rather than its frequency as a noun). This confuses me, as I understand the "&" operator to be non-directional, yet here it appears to be applied from left to right, one way only. Have I missed anything here?
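
To make the two readings concrete, here is a minimal Python sketch on an invented toy corpus. It only illustrates the two interpretations and says nothing about how CQP is implemented; the data and the names `form_frequency_matches` and `pair_frequency_matches` are made up for this example.

```python
import re
from collections import Counter

# Invented toy corpus of (word, pos) tokens; not taken from Brown.
CORPUS = (
    [("over", "II")] * 5      # "over" as a preposition, 5 times
    + [("over", "NN1")]       # "over" tagged as a noun once
    + [("time", "NN1")] * 4   # a genuinely frequent noun
    + [("dog", "NN1")]        # a rare noun
)

def is_noun(pos):
    # Mimics pos="N.*": the pattern has to match the whole tag.
    return re.fullmatch(r"N.*", pos) is not None

def form_frequency_matches(corpus, threshold):
    """Reading 1: frequency of the word form in the whole corpus, any tag."""
    freq = Counter(word for word, _ in corpus)
    return [(w, p) for w, p in corpus if is_noun(p) and freq[w] > threshold]

def pair_frequency_matches(corpus, threshold):
    """Reading 2: frequency of the word form counted only with this tag."""
    freq = Counter(corpus)
    return [(w, p) for w, p in corpus if is_noun(p) and freq[(w, p)] > threshold]

reading1 = form_frequency_matches(CORPUS, 3)
reading2 = pair_frequency_matches(CORPUS, 3)

print(Counter(reading1))
print(Counter(reading2))
```

Under reading 1, ("over", "NN1") appears in the result with a count of 1 although the threshold is 3, which is the pattern I see in the frequency breakdown. Under reading 2 it is excluded. In both functions the two conditions are tested on the same token, so swapping their order would not change the result.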


Since dynamic attributes are slow, I can now understand their removal. Thanks for your suggestion to implement the semantic restriction as a p-attribute.


Best,
Liangping