[CWB] bugs

Lars Nygaard lars.nygaard at iln.uio.no
Tue May 16 14:38:57 CEST 2006


Hi all,

Have we decided which bug tracking system to use? I've registered two bug
reports on SourceForge, but I guess we might want to use something
different. The text of the bug reports is reproduced below.

regards,
lars nygaard



** "cut" applies too early **

When using CWB for parallel corpora, the "cut" keyword
does not give the correct results: it is applied to the
first corpus, and does not take into account that there
can be restrictions on the aligned regions as well,
thus returning too few hits.
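
A toy sketch of the suspected ordering problem, written in Python rather
than CQP. The function names, the hit list and the has_aligned_match
predicate are all made up for illustration; this is not how CWB is
implemented, only a model of "cut before the alignment filter" versus
"cut after it".

```python
def cut_too_early(hits, has_aligned_match, n):
    # Observed behaviour (assumed): truncate the hits from the first
    # corpus to n, then apply the restriction on the aligned region.
    return [h for h in hits[:n] if has_aligned_match(h)]


def cut_after_alignment(hits, has_aligned_match, n):
    # Expected behaviour: apply the restriction on the aligned region
    # first, then truncate the surviving hits to n.
    return [h for h in hits if has_aligned_match(h)][:n]


# Hypothetical data: ten hits in the first corpus, of which only every
# third one also satisfies the restriction on the aligned region.
hits = list(range(10))
aligned_ok = lambda h: h % 3 == 0

early = cut_too_early(hits, aligned_ok, 4)
late = cut_after_alignment(hits, aligned_ok, 4)
print(len(early), len(late))
```

With a cut of 4, the first variant can return fewer than 4 hits even
though enough matching hits exist, which is the symptom described above.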



** WebCqp::Query fails on long sentences **

The combination of long sentences and many positional attributes seems
to cause WebCqp::Query to fail: the process hangs at 99% CPU usage, but
nothing happens.

In my particular case, it was 16 attributes (a detailed morphological
and syntactic analysis of Norwegian) and some sentences of more than
100 words. If necessary, I can provide some exact numbers here.

With 15 attributes, the query works, but I suspect there will be
problems with queries returning even longer sentences (and there are
quite a few, since the corpus contains literary text, and some authors
produce sentences of many hundreds of words).

