[CWB] CQPweb offline-freqlists.php problems
Uhrig, Peter
peter.uhrig at fau.de
Mon Apr 4 22:14:02 CEST 2022
Dear all,
The script offline-freqlists.php causes me some loss of sleep:
First, it throws the following error in my setup:
PHP Fatal error: Uncaught TypeError: Argument 1 passed to drop_unneeded_corpus_freqtable_components() must be of the type int, string given, called in /var/www/html/web/bin/offline-freqlists.php on line 137 and defined in /var/www/html/web/lib/freqtable-lib.php:345
Stack trace:
#0 /var/www/html/web/bin/offline-freqlists.php(137): drop_unneeded_corpus_freqtable_components()
#1 {main}
thrown in /var/www/html/web/lib/freqtable-lib.php on line 345
This seems to be a bug in the code: the function drop_unneeded_corpus_freqtable_components requires an integer corpus ID, but offline-freqlists.php calls it with the corpus name (a string):
drop_unneeded_corpus_freqtable_components($corpus);
I have thus replaced the line with
$corpus_id = corpus_name_to_id($corpus);
drop_unneeded_corpus_freqtable_components($corpus_id);
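For reference, here is a minimal self-contained sketch of the mismatch and the workaround. The two functions below are hypothetical stand-ins that only share their names with the CQPweb ones; the real implementations query the database. The corpus name, the ID 42 and the return value are assumptions made purely for illustration.

```php
<?php
// Hypothetical stand-in: maps a corpus name to its integer ID.
// The lookup table is an assumption; CQPweb reads this from the database.
function corpus_name_to_id(string $corpus): int
{
    $ids = ['my_corpus' => 42];
    if (!isset($ids[$corpus])) {
        throw new InvalidArgumentException("Unknown corpus: $corpus");
    }
    return $ids[$corpus];
}

// Hypothetical stand-in with the same int-typed first parameter
// that the error message reports for the real function.
function drop_unneeded_corpus_freqtable_components(int $corpus_id): string
{
    return "dropped components for corpus #$corpus_id";
}

$corpus = 'my_corpus';

// Passing the name: a non-numeric string cannot be coerced to int,
// so PHP throws a TypeError, as in the log above.
$error = null;
try {
    drop_unneeded_corpus_freqtable_components($corpus);
} catch (TypeError $e) {
    $error = get_class($e);
}

// Converting the name to an ID first makes the call succeed.
$corpus_id = corpus_name_to_id($corpus);
$result = drop_unneeded_corpus_freqtable_components($corpus_id);

echo $error, "\n", $result, "\n";
```

The same conversion applies to any other library function in the script that is declared with an int corpus ID but handed the corpus name.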
With this change the script got past the first error, but I was greeted with a similar one:
About to run the function populating corpus CQP positions...
PHP Fatal error: Uncaught TypeError: Argument 1 passed to populate_corpus_cqp_positions() must be of the type int, string given, called in /var/www/html/web/bin/offline-freqlists.php on line 150 and defined in /var/www/html/web/lib/corpus-lib.php:1098
Stack trace:
#0 /var/www/html/web/bin/offline-freqlists.php(150): populate_corpus_cqp_positions()
#1 {main}
thrown in /var/www/html/web/lib/corpus-lib.php on line 1098
OK, same thing: I replaced $corpus with $corpus_id in the call to populate_corpus_cqp_positions() and tried again. This time it gets further:
About to run the function populating corpus CQP positions...
Done populating corpus CQP positions.
Function calculating category sizes was not run because there aren't any text classifications.
According to my corpus metadata table, there ARE text classifications. Why does it say there aren't?
And finally:
About to run the function making the CWB text-by-text frequency index...
Beginning to filter data from decode to encode to build the frequency-by-text CWB index...
Segmentation fault
Encoding of the by-text CWB frequency index is now complete.
That segmentation fault most likely comes from CQP, but unfortunately it does not say what exactly was going on at the time. I noticed that the __freq folder contains indexes even for p-attributes for which I specifically selected "N" in the "Needs FT" column of the "Manage Annotation" dialogue. Is this expected?
It is still running, just not saying what it is doing, but I can see that "cwb-makeall -M 1000 -r /data/corpora/cqpweb/registry -V MY_CORPUS_NAME__FREQ" is currently running, so I guess the segmentation fault may not have been critical. I'll probably find out soon...
Any help would be greatly appreciated!
Thanks and all the best!
Peter