[CWB] Impossible query

Ruprecht von Waldenfels ruprecht.waldenfels at gmx.net
Sun Feb 28 08:32:46 CET 2016


Dear Stefan,

>r 4 items for each lemma from a contiguous chunk of lines.

> (ii) Are you really sure you want to do that?  What you get isn't a random sample in any sense that would allow you to draw statistical inferences.


Thanks for the comment. The goal is to restrict the influence of
high-frequency lemmata in the next step, which consists of observing the
overall behaviour in the translated word forms. Another thing I could
do, it seems to me, is give multiple occurrences of the same lemma less
weight, but I didn't go for that (I don't quite remember why now, but
the general point seemed to be that it didn't make a difference).  I
need to normalize for frequency in some way. Any other ideas?
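
To make the two options concrete, here is a minimal Python sketch, not the
code actually used: it assumes the hits have already been exported as
(lemma, word form) pairs, and the function names cap_per_lemma and
inverse_frequency_weights are made up for illustration. The first function
keeps at most k randomly chosen hits per lemma; the second keeps every hit
but weights it by 1/n, where n is the number of hits for its lemma, so that
each lemma contributes a total weight of 1.

```python
import random
from collections import defaultdict


def cap_per_lemma(hits, k=4, seed=0):
    """Keep at most k randomly chosen hits per lemma (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    by_lemma = defaultdict(list)
    for lemma, form in hits:
        by_lemma[lemma].append(form)
    sample = []
    for lemma, forms in by_lemma.items():
        # Lemmata with k or fewer hits are kept whole; others are subsampled.
        chosen = forms if len(forms) <= k else rng.sample(forms, k)
        sample.extend((lemma, form) for form in chosen)
    return sample


def inverse_frequency_weights(hits):
    """Keep every hit, weighted by 1/n for a lemma with n hits (hypothetical helper)."""
    counts = defaultdict(int)
    for lemma, _ in hits:
        counts[lemma] += 1
    return [(lemma, form, 1.0 / counts[lemma]) for lemma, form in hits]


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    hits = [("be", "is")] * 10 + [("walk", "walked")] * 2
    print(len(cap_per_lemma(hits)))
    print(inverse_frequency_weights(hits)[0])
```

Unlike taking the hits from a contiguous chunk of lines, the capped version
draws the k hits at random from all occurrences of the lemma.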


Thanks also for pointing me to the Perl/CWB API.

Best,
Ruprecht

