[CWB] Failure of offline-freqlists.php
Hardie, Andrew
a.hardie at lancaster.ac.uk
Mon Dec 3 05:24:58 CET 2018
I'm afraid it is next to impossible to diagnose this at a distance. The most likely culprit is the OOM killer, if you're on Linux (the bare "Killed" in your output is consistent with that). There might be something in a system error log, or in the output of dmesg.
However, there is a decent chance it will just tell you that the system was out of memory. Why this would be, I don't know. Corpus size is not the issue; rather, the number of distinct type tuples is, where a type tuple is each distinct combination of the word form plus the values of all the other attributes at any given corpus position.
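To make the notion of a type tuple concrete, here is a minimal sketch that counts distinct tuples in verticalised text. It is not part of CWB or CQPweb and only approximates what cwb-scan-corpus tallies. It assumes one token per line with tab-separated attribute columns, and it treats lines starting with "<" as XML tags; the function name and the sample data are made up for illustration.

```python
from typing import Iterable, Tuple


def count_type_tuples(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (token_count, distinct_type_tuple_count) for vertical-format lines.

    Hypothetical helper for illustration only; the input layout is an assumption.
    """
    seen = set()
    tokens = 0
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and structural markup such as <s> or </text>.
        if not line or line.startswith("<"):
            continue
        tokens += 1
        # One tuple per corpus position: word form plus every other attribute.
        seen.add(tuple(line.split("\t")))
    return tokens, len(seen)


if __name__ == "__main__":
    # Invented sample: columns are word, part of speech, lemma.
    sample = [
        "<s>",
        "the\tDT\tthe",
        "run\tNN\trun",
        "the\tDT\tthe",
        "run\tVB\trun",
        "</s>",
    ]
    tokens, types = count_type_tuples(sample)
    print(f"{tokens} tokens, {types} distinct type tuples")
```

If one attribute carries near-unique values at each position, the tuple count approaches the token count, however modest the corpus size. Comparing the two numbers for this corpus and for one that imported cleanly may show whether that is what is happening here.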
best
Andrew.
-----Original Message-----
From: cwb-bounces at sslmit.unibo.it <cwb-bounces at sslmit.unibo.it> On Behalf Of Stefan Fischer
Sent: 28 November 2018 18:17
To: cwb at sslmit.unibo.it
Subject: [CWB] Failure of offline-freqlists.php
Hello everyone,
I would like to import a corpus (300M words) into CQPweb. The corpus is already indexed in CWB and the import into CQPweb worked well. As the corpus is rather large, I ran "php offline-freqlists.php my_corpus" in the terminal. Unfortunately, the script fails after several hours and I get the following error message:
----
cwb-scan-corpus error!
Killed
PHP debugging backtrace
=======================
array(2) {
  [1]=>
  array(4) {
    ["file"]=>
    string(42) "/var/www/html/cqpweb/lib/freqtable.inc.php"
    ["line"]=>
    int(99)
    ["function"]=>
    string(17) "exiterror_general"
    ["args"]=>
    array(1) {
      [0]=>
      &string(29) "cwb-scan-corpus error!
Killed"
    }
  }
  [2]=>
  array(4) {
    ["file"]=>
    string(46) "/var/www/html/cqpweb/bin/offline-freqlists.php"
    ["line"]=>
    int(136)
    ["function"]=>
    string(22) "corpus_make_freqtables"
    ["args"]=>
    array(1) {
      [0]=>
      &string(10) "test_corpus"
    }
  }
}
----
I have already imported corpora larger than this one. So I guess corpus size is not the issue. What else could cause a failure of cwb-scan-corpus?
Best,
Stefan