[CWB] Difference in token number between CQP and CQPweb

Stefan Evert stefanML at collocations.de
Fri Feb 21 13:14:59 CET 2014


I'm also still having problems, but different ones.  After shortening the text IDs to < 50 chars and completely re-installing the corpus in CQPweb, I get the correct corpus size and match counts.

However, frequency distributions still omit some matches.  Some digging revealed a simple cause: in the freq distribution MySQL tables, the ID column is declared as VARCHAR(40)!

Is this intentional, or a bug that has been fixed in the meantime?
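
To illustrate the kind of failure I suspect, here is a minimal Python sketch.  It assumes that MySQL runs in non-strict mode (where over-long values are silently cut to the column width) and that matches are counted by looking up their full text ID among the stored IDs.  I have not checked how CQPweb actually does the lookup, so the function names and the example IDs below are purely hypothetical.

```python
ID_COLUMN_WIDTH = 40  # width of the ID column as declared in the freq tables


def store_in_varchar(value, width=ID_COLUMN_WIDTH):
    """Mimic a non-strict MySQL insert into VARCHAR(width): excess chars are cut off."""
    return value[:width]


def count_matches_by_text(match_text_ids, stored_ids):
    """Count matches per text; matches whose text ID is not among stored_ids are dropped."""
    counts = {}
    dropped = 0
    for text_id in match_text_ids:
        if text_id in stored_ids:
            counts[text_id] = counts.get(text_id, 0) + 1
        else:
            dropped += 1
    return counts, dropped


# Hypothetical text IDs: one short, one longer than 40 but shorter than 50 chars.
short_id = "text_0001"
long_id = "corpus_section_subsection_document_0000000042"

# IDs as they would end up in the truncating column.
stored_ids = {store_in_varchar(i) for i in (short_id, long_id)}

# One match in the short-ID text, two matches in the long-ID text.
matches = [short_id, long_id, long_id]
counts, dropped = count_matches_by_text(matches, stored_ids)
print(counts, dropped)
```

Under these assumptions, the matches in the text with the long ID vanish from the distribution, while the overall match count (which does not go through this column) stays correct.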

On a related note: When is it safe to upgrade to CQPweb 3.1?

Best,
Stefan

On 19 Feb 2014, at 11:10, Hannah Kermes <h.kermes at mx.uni-saarland.de> wrote:

> I hate to spoil the party, but I shortened the text_ids (to a max of 20 chars) of one of the problematic corpora (in the metadata table and in the CQP corpus), re-installed the corpus, and
> the problem stayed the same: still the same wrong token numbers.


