[CWB] Performance of "expand to"

Stefan Evert stefanML at collocations.de
Mon Feb 13 21:10:50 CET 2012


Thanks for the additional information.  The bug has been fixed (in 3.0 and trunk), so the query should work now if you upgrade to the latest SVN code.

I don't know how you ran into this problem, but if this is a query run from some front end (e.g. a Web server), expect your front end to become confused -- your KWIC lines are far too long and will be truncated at a random point.  Earlier versions of CQP just used to crash in this case ...

> cat A
> 
> top shows CQP taking up 100% CPU time, while memory usage is negligible (1.4% on an 8 GB machine and plenty of free memory, so I don't think it's swapping to disk). I actually timed the query this time: it took 27.5 seconds on my fastest server.

For the record: the problem only occurs if left context + match overflow the KWIC line buffer, and right context is set to a fixed number of characters.  I couldn't reproduce the problem on my machine because my context setting defaults to "1 s".

Also for the record: CQP would actually KWIC-format everything from the match to the _end of the corpus_, throwing away each token after formatting ...
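
To illustrate the behaviour described above, here is a toy model in Python.  It is not the CQP source code: the function name, the per-token cost and the stop_when_full flag are all hypothetical, chosen only to show why formatting every token up to the end of the corpus is slow when the line buffer is already full.

```python
def format_right_context(tokens, start, buffer_free, stop_when_full):
    """Toy model of right-context formatting (hypothetical, not CQP code).

    Returns (kept_tokens, tokens_formatted).
    """
    kept = []
    formatted = 0
    for tok in tokens[start:]:
        # Intended behaviour: stop as soon as the line buffer is full.
        if stop_when_full and buffer_free <= 0:
            break
        formatted += 1           # the token is formatted in any case
        cost = len(tok) + 1      # token plus one separating space
        if cost <= buffer_free:
            kept.append(tok)
            buffer_free -= cost
        else:
            buffer_free = 0      # no room left: formatted text is thrown away
    return kept, formatted


corpus = ["w%d" % i for i in range(100_000)]

# Left context + match have already filled the buffer, so nothing is free.
slow = format_right_context(corpus, 10, 0, stop_when_full=False)
fast = format_right_context(corpus, 10, 0, stop_when_full=True)
print(slow[1], fast[1])
```

In this model both calls keep the same (empty) right context, but the first one formats every remaining token of the corpus, while the second one stops immediately.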

> BTW: in case you want to try it for yourself, I'm using mrscoulter.sslmit.unibo.it (you still have an account there).

Thanks, that helped me pinpoint the problem and identify the bug.

Best,
Stefan

