[CWB] Difference in token number between CQP and CQPweb
Stefan Evert
stefanML at collocations.de
Fri Feb 21 13:14:59 CET 2014
I'm also still having problems, but different ones. After shortening the text IDs to < 50 chars and completely re-installing the corpus in CQPweb, I get the correct corpus size and match counts.
However, frequency distributions still omit some matches. Some digging revealed a simple cause: in the frequency distribution MySQL tables, the ID column is declared as VARCHAR(40), so any text ID longer than 40 characters cannot be stored in full!
Is this intentional, or a bug that has been fixed in the meantime?
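For anyone who wants to check their own corpus, here is a rough Python sketch of the kind of test that would flag affected text IDs. It is only an illustration, not anything taken from the CQPweb code: the function names are made up, the width of 40 is the value I found in the table definition, and it assumes that MySQL cuts over-long values off silently (as it does outside strict SQL mode) rather than rejecting them.

```python
from collections import defaultdict

# Width of the ID column as reported above; adjust if your tables differ.
ID_COLUMN_WIDTH = 40


def find_truncated_ids(text_ids, width=ID_COLUMN_WIDTH):
    """Return the IDs that are longer than `width` characters.

    These are the IDs a VARCHAR(width) column cannot hold in full.
    """
    return [tid for tid in text_ids if len(tid) > width]


def find_collisions(text_ids, width=ID_COLUMN_WIDTH):
    """Group IDs that become identical once cut to `width` characters.

    Returns a dict mapping each shared prefix to the list of original IDs.
    """
    groups = defaultdict(list)
    for tid in text_ids:
        groups[tid[:width]].append(tid)
    return {prefix: ids for prefix, ids in groups.items() if len(ids) > 1}


if __name__ == "__main__":
    # Hypothetical IDs, for illustration only.
    ids = ["a" * 45, "a" * 40 + "b", "short_id"]
    print(find_truncated_ids(ids))
    print(find_collisions(ids))
```

Feeding it the list of text IDs from the metadata table should show which IDs exceed the column width and which of them would end up indistinguishable after truncation.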
On a related note: When is it safe to upgrade to CQPweb 3.1?
Best,
Stefan
On 19 Feb 2014, at 11:10, Hannah Kermes <h.kermes at mx.uni-saarland.de> wrote:
> I hate to spoil the party, but I shortened the text_ids (to a max of 20 chars) of one of the problematic corpora (in the metadatatable and in the cqpcorpus), re-installed the corpus, and
> the problem stayed the same, still the same wrong token numbers.