[CWB] Failure of offline-freqlists.php

Stefan Fischer stefan.fischer at uni-saarland.de
Wed Dec 5 15:36:12 CET 2018


Hello Andrew and Mansur, 

Thank you very much for all of your help. 

It was indeed a memory problem. We recently added new positional attributes, and memory consumption increased significantly as a consequence. 
We could 'fix' the problem by excluding a few attributes, but it is already good to know what caused it, since the error message was not very helpful. 
I will also try again with more swap space. 

Best, 
Stefan 


From: "mansur" <6688000 at gmail.com> 
To: "cwb" <cwb at sslmit.unibo.it> 
Sent: Monday, 3 December 2018 07:46:34 
Subject: Re: [CWB] Failure of offline-freqlists.php 

Hello! 

Stefan, you can temporarily increase your swap space just to check whether this issue is RAM-related. For example, to add 10 GB of swap (run as root): 
dd if=/dev/zero of=/home/USER/swap bs=1G count=10 
chmod 600 /home/USER/swap 
mkswap /home/USER/swap 
swapon /home/USER/swap 
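
Since the extra swap is only meant as a temporary test, it may help to have the matching cleanup step at hand. The following is a minimal sketch, not part of the original suggestion: the helper name remove_temp_swap is hypothetical, and the path is the same /home/USER/swap placeholder as above.

```shell
# Deactivate and delete a temporary swap file (must be run as root).
# $1 is the path to the swap file, e.g. the /home/USER/swap placeholder.
# remove_temp_swap is a hypothetical helper name, not a standard command.
remove_temp_swap() {
    swapoff "$1" && rm -- "$1"
}

# To confirm the swap file is active during the test run
# (recent util-linux versions; older ones use "swapon -s"):
#   swapon --show
# Once the test run is finished:
#   remove_temp_swap /home/USER/swap
```

The file is only deleted if swapoff succeeds, so a swap file that is still in use is left in place.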

Best, 
Mansur 


On Mon, 3 Dec 2018 at 07:25, Hardie, Andrew <a.hardie at lancaster.ac.uk> wrote: 


I'm afraid it is next to impossible to diagnose this at a distance. The most likely culprit is the OOM killer, if you're on Linux. There might be something in an error log (or via dmesg?) 
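
One way to look for traces of the OOM killer is to filter the kernel log for its usual messages. This is a rough sketch: the helper name find_oom_lines is hypothetical, and the search patterns are assumptions based on common kernel wording, which varies between kernel versions.

```shell
# Keep only the lines that typically indicate an OOM kill.
# The patterns are assumptions; kernel message wording differs by version.
# "|| true" keeps the exit status at 0 when nothing matches.
find_oom_lines() {
    grep -i -E 'out of memory|oom-killer|killed process' || true
}

# Typical use (reading the kernel log may require root):
#   dmesg -T | find_oom_lines
#   journalctl -k | find_oom_lines
```

If a matching line names cwb-scan-corpus as the killed process, that would support the out-of-memory explanation.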

However, there is a decent chance it will just tell you that the system was out of memory. Why this would be, I don't know. Corpus size is not the issue; rather, it is the number of distinct type tuples (where a type tuple is each distinct combination of the word form plus the values of all the other attributes at any given corpus position). 
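
To illustrate the point about type tuples, here is a toy sketch that counts distinct attribute combinations in a tiny tab-separated sample. The sample data, the column layout (word, POS, one extra attribute) and the helper name count_tuples are all made up for illustration; this is not how cwb-scan-corpus works internally.

```shell
# Hypothetical sample, one token per line: word, POS, extra attribute.
sample=$(printf '%s\t%s\t%s\n' \
    the  DET  a \
    the  DET  b \
    run  NOUN a \
    run  VERB a \
    runs VERB b)

# Count distinct combinations of the given columns.
# $1 is a field list as accepted by cut -f, e.g. "1,2".
count_tuples() {
    printf '%s\n' "$sample" | cut -f "$1" | sort -u | wc -l | tr -d ' '
}

echo "word only:          $(count_tuples 1)"
echo "word + POS:         $(count_tuples 1,2)"
echo "word + POS + extra: $(count_tuples 1,2,3)"
```

On this sample the counts are 3, 4 and 5: the number of tokens stays the same, but each added attribute can only keep or increase the number of distinct tuples that have to be tracked.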

best 

Andrew. 

-----Original Message----- 
From: cwb-bounces at sslmit.unibo.it <cwb-bounces at sslmit.unibo.it> On Behalf Of Stefan Fischer 
Sent: 28 November 2018 18:17 
To: cwb at sslmit.unibo.it 
Subject: [CWB] Failure of offline-freqlists.php 

Hello everyone, 

I would like to import a corpus (300M words) into CQPweb. The corpus is already indexed in CWB and the import into CQPweb worked well. As the corpus is rather large, I ran "php offline-freqlists.php my_corpus" in the terminal. Unfortunately, the script fails after several hours and I get the following error message: 

---- 

cwb-scan-corpus error! 
Killed 



PHP debugging backtrace 
======================= 
array(2) { 
[1]=> 
array(4) { 
["file"]=> 
string(42) "/var/www/html/cqpweb/lib/freqtable.inc.php" 
["line"]=> 
int(99) 
["function"]=> 
string(17) "exiterror_general" 
["args"]=> 
array(1) { 
[0]=> 
&string(29) "cwb-scan-corpus error! 
Killed" 
} 
} 
[2]=> 
array(4) { 
["file"]=> 
string(46) "/var/www/html/cqpweb/bin/offline-freqlists.php" 
["line"]=> 
int(136) 
["function"]=> 
string(22) "corpus_make_freqtables" 
["args"]=> 
array(1) { 
[0]=> 
&string(10) "test_corpus" 
} 
} 
} 

---- 

I have already imported corpora larger than this one. So I guess corpus size is not the issue. What else could cause a failure of cwb-scan-corpus? 

Best, 
Stefan 
_______________________________________________ 
CWB mailing list 
CWB at sslmit.unibo.it 
http://liste.sslmit.unibo.it/mailman/listinfo/cwb 