[CWB] [CQPWeb] problem of memory

Stefan Evert stefanML at collocations.de
Wed Feb 8 12:00:10 CET 2012


> There are two possible causes of this error message. One is that, as it says, you have run out of memory. This is unlikely to be the case, but maybe (I'm speculating here) if your webserver runs under a username with restrictions on how much RAM it can use at once, you might get this - I assume a file called le_monde/word.corpus.rev is going to be rather big!

Or is it possible that you're using a 32-bit version of the CWB, which is likely to run out of address space for very large corpora?

This explanation is quite plausible because you don't seem to have compressed the index files (cwb-huffcode and cwb-compress-rdx; or simply run cwb-make from the CWB/Perl package).  Even with a 64-bit CWB, compression is highly recommended!
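
As a rough illustration of the address-space point, here is a minimal C sketch. It is not CWB code; the helper names pointer_bits and fits_in_address_space are made up for this example. It only reports how wide pointers are in the build it is compiled into, and whether a file of a given size could be mapped in one piece at all.

```c
#include <limits.h>
#include <stdint.h>

/* Width of a pointer in bits: 32 in a 32-bit build, 64 in a 64-bit build. */
static unsigned pointer_bits(void)
{
  return (unsigned) (sizeof(void *) * CHAR_BIT);
}

/* Hypothetical helper: can a file of file_bytes be addressed in one piece?
 * This is only an upper bound; in a real 32-bit process the usable space
 * is smaller, because code, heap, stack and other mappings share it. */
static int fits_in_address_space(unsigned long long file_bytes)
{
  return file_bytes <= (unsigned long long) SIZE_MAX;
}
```

Note that this describes the program it is compiled into, not an already installed CWB binary; for those, a tool such as file(1) will normally tell you whether the executable is 32-bit or 64-bit.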

> The other possible cause is that the file exists, but is empty (or, perhaps, is not readable by the webserver's username??). So you should check that out as well.

I don't think this is possible, because then the previous open() [line 301] would already have failed. Also note that empty files are virtually mapped beyond the end of file (MMAP_EMPTY_LEN bytes) in order to avoid throwing spurious errors.
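
The order of checks described here can be sketched roughly as follows. This is a simplified illustration under assumptions, not the actual code in cl/storage.c: the function name map_readonly and the value of SKETCH_EMPTY_LEN are invented for the example (the real MMAP_EMPTY_LEN may well differ), and the portable MAP_FAILED test is used throughout.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Assumed value for illustration only; not taken from the CWB sources. */
#define SKETCH_EMPTY_LEN 8

/* Map a file read-only. Returns the mapping, or NULL on failure.
 * On success *len_out receives the mapped length (never 0). */
static void *map_readonly(const char *path, size_t *len_out)
{
  struct stat st;
  void *addr;
  size_t len;
  int fd;

  assert(path != NULL && len_out != NULL);

  fd = open(path, O_RDONLY);
  if (fd < 0)
    return NULL;               /* missing or unreadable file fails here */

  if (fstat(fd, &st) != 0) {
    close(fd);
    return NULL;
  }

  /* mmap() rejects a zero length, so an empty file gets a small nominal
   * length; the bytes beyond end of file must never be accessed. */
  len = (st.st_size > 0) ? (size_t) st.st_size : SKETCH_EMPTY_LEN;

  addr = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
  close(fd);                   /* the mapping stays valid after close() */

  if (addr == MAP_FAILED)
    return NULL;               /* e.g. ENOMEM if address space is exhausted */

  *len_out = len;
  return addr;
}
```

With this ordering, an unreadable file is reported by open() and an empty file still maps successfully, so a failure at the mmap() step points towards a genuine lack of memory or address space.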
 
> (NB to self (and Stefan), this is in cl/storage.c, see lines 316 and 350 to 354 - and I am not sure why the __svr4__ macro is used at the latter point; POSIX-compliant systems should have MAP_FAILED defined, and therefore the presence of the #ifdef seems pointless.)

Probably remnants from the good old time when POSIX could still have been the name of a gentlemen's magazine.

I guess it's about time that we required full ANSI C + POSIX compliance and threw out all the #ifdefs that work around bugs in other platforms.

Cheers,
Stefan


