[CWB] Failure of offline-freqlists.php
Stefan Fischer
stefan.fischer at uni-saarland.de
Wed Dec 5 15:36:12 CET 2018
Hello Andrew and Mansur,
Thank you very much for all of your help.
It was indeed a memory problem. We have recently added new positional attributes and as a consequence memory consumption increased significantly.
We were able to 'fix' the problem by excluding a few attributes, and it is good to know what caused it, as the error message was not very helpful.
I will also try again with more swap memory.
Best,
Stefan
From: "mansur" <6688000 at gmail.com>
To: "cwb" <cwb at sslmit.unibo.it>
Sent: Monday, 3 December 2018 07:46:34
Subject: Re: [CWB] Failure of offline-freqlists.php
Hello!
Stefan, you can temporarily increase your swap space to check whether this issue is RAM-related. For example, to add 10 GB of swap, use:
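# Run as root. Creates a 10 GB swap file (replace USER with your account name).
dd if=/dev/zero of=/home/USER/swap bs=1G count=10
chmod 600 /home/USER/swap
mkswap /home/USER/swap
swapon /home/USER/swap
# When finished testing: swapoff /home/USER/swap && rm /home/USER/swap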
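To confirm that the extra swap is actually active before rerunning the script, one option is to sum the sizes listed in /proc/swaps. The following is only a sketch: the function name total_swap_kib is made up for illustration, and it assumes the usual Linux /proc/swaps layout (one header line, size in KiB in the third column).

```shell
# Sum the "Size" column (KiB) of text in /proc/swaps format read from stdin.
# total_swap_kib is a hypothetical helper name, not part of CWB or CQPweb.
total_swap_kib() {
  # Skip the header line, add up the third field, print 0 if nothing is listed.
  awk 'NR > 1 { sum += $3 } END { print sum + 0 }'
}

# Usage on a Linux machine:
#   total_swap_kib < /proc/swaps
```

If the number does not grow by roughly 10 GB (about 10485760 KiB) after swapon, the swap file was probably not activated.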
Best,
Mansur
On Mon, 3 Dec 2018 at 07:25, Hardie, Andrew <a.hardie at lancaster.ac.uk> wrote:
I'm afraid it is next to impossible to diagnose this at a distance. The most likely culprit is the OOM killer, if you're on Linux. There might be something in an error log (or via dmesg?)
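A minimal sketch of how one might look for traces of the OOM killer in the kernel log. The helper name find_oom_events is made up for illustration, and the search patterns are an assumption about typical Linux kernel wording, which can vary between kernel versions.

```shell
# Filter kernel log lines that typically accompany an OOM kill.
# Reads log text on stdin, so it works with dmesg output or a saved log file.
# find_oom_events is a hypothetical helper name.
find_oom_events() {
  # "|| true" keeps the exit status at 0 when nothing matches.
  grep -i -E 'out of memory|oom-killer|killed process' || true
}

# Usage (may require root on systems that restrict dmesg):
#   dmesg -T | find_oom_events
```

If a matching line names cwb-scan-corpus as the killed process, that would support the out-of-memory explanation.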
However, there is a decent chance it will just tell you that the system was out of memory. Why this would be, I don't know. Corpus size is not the issue; rather, the number of distinct type tuples is (where a type tuple is every distinct combination of word form plus the forms of all the other attributes at any given corpus position).
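To illustrate how adding attributes can raise the number of distinct type tuples while the corpus size stays the same, here is a toy sketch. The sample data and its column layout (word, part of speech, lemma) are invented for illustration; this only counts distinct combinations with standard Unix tools and is not how cwb-scan-corpus itself works.

```shell
# Toy sample: one token per line, columns are word <TAB> pos <TAB> lemma.
# The data is hypothetical; printf reuses the format for each group of three arguments.
sample=$(printf '%s\t%s\t%s\n' \
  run NN run \
  run VB run \
  runs NNS run \
  runs VBZ run \
  run NN run)

# Distinct word forms only (first column).
words_only=$(printf '%s\n' "$sample" | cut -f1 | sort -u | wc -l | tr -d ' ')

# Distinct (word, pos, lemma) combinations, i.e. type tuples over all columns.
tuples=$(printf '%s\n' "$sample" | sort -u | wc -l | tr -d ' ')

echo "distinct words: $words_only, distinct tuples: $tuples"
```

The five tokens contain 2 distinct word forms but 4 distinct tuples, so each extra attribute can multiply the number of entries that have to be counted.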
best
Andrew.
-----Original Message-----
From: cwb-bounces at sslmit.unibo.it <cwb-bounces at sslmit.unibo.it> On Behalf Of Stefan Fischer
Sent: 28 November 2018 18:17
To: cwb at sslmit.unibo.it
Subject: [CWB] Failure of offline-freqlists.php
Hello everyone,
I would like to import a corpus (300M words) into CQPweb. The corpus is already indexed in CWB and the import into CQPweb worked well. As the corpus is rather large, I ran "php offline-freqlists.php my_corpus" in the terminal. Unfortunately, the script fails after several hours and I get the following error message:
----
cwb-scan-corpus error!
Killed
PHP debugging backtrace
=======================
array(2) {
[1]=>
array(4) {
["file"]=>
string(42) "/var/www/html/cqpweb/lib/freqtable.inc.php"
["line"]=>
int(99)
["function"]=>
string(17) "exiterror_general"
["args"]=>
array(1) {
[0]=>
&string(29) "cwb-scan-corpus error!
Killed"
}
}
[2]=>
array(4) {
["file"]=>
string(46) "/var/www/html/cqpweb/bin/offline-freqlists.php"
["line"]=>
int(136)
["function"]=>
string(22) "corpus_make_freqtables"
["args"]=>
array(1) {
[0]=>
&string(10) "test_corpus"
}
}
}
----
I have already imported corpora larger than this one. So I guess corpus size is not the issue. What else could cause a failure of cwb-scan-corpus?
Best,
Stefan
_______________________________________________
CWB mailing list
CWB at sslmit.unibo.it
http://liste.sslmit.unibo.it/mailman/listinfo/cwb