[CWB] Unable to index a corpus

VIVALDI PALATRESI, JORGE jorge.vivaldi at upf.edu
Wed Jul 26 13:22:52 CEST 2017


Hi Andrew,

I made the suggested modifications to the file lib/admin-install.inc.php,
without any positive result: the browser always goes blank, and similar
messages appear in the file error.log.

Then I reduced the size of the corpus to 20% of the original. This time some
messages appeared in the browser, but after a while the browser crashed.
Anyway, I captured the attached screenshot.
It seems to be a problem related to cwb-encode and the POS tags (N5-FS,
JQ--FS, P, ...).
As I mentioned in a previous message, the corpus (and its POS tags) is the
same one used with a previous version of CQP and CQPweb.

Bests,

Jorge

2017-07-26 10:22 GMT+02:00 Hardie, Andrew <a.hardie at lancaster.ac.uk>:

> As I noted before, the problem is actually error messages. Line 644 simply
> collects error messages – so an out-of-memory error here indicates you have
> generated > 4GB of error messages.
>
>
>
> I suggested increasing the memory previously because it would let you see
> the problem – but actually, with 4GB of error messages, I’d suggest that
> doing that is not likely to help much.
>
>
>
> So what I would suggest instead is hacking the code to find out the error
> message.
>
>
>
> Open admin-lib.inc.php
>
> Go to line 644
>
> Find the line nearby that says
> $output_lines_from_cwb = array($encode_command);
>
>
>
> AFTER THAT LINE, but before the line that says
> exec($encode_command, $output_lines_from_cwb, $exit_status_from_cwb);
> add the following:
>
>
>
> if (count($output_lines_from_cwb) > 1000)
> {show_var($output_lines_from_cwb); exiterror("abort"); }
>
>
>
> What this line does is make things abort if it detects too many error
> messages.
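
A minimal stand-alone sketch of the same idea, for trying out the threshold check outside CQPweb. It is not CQPweb code: the function name summarize_if_flooded and the simulated messages are hypothetical, and show_var() and exiterror() from the quoted hack are only mentioned in a comment because they are not available outside CQPweb.

```php
<?php
// Hypothetical stand-alone sketch, not CQPweb code.

/**
 * If more than $limit lines were collected, return the first $limit of them
 * as one printable string; otherwise return null.
 */
function summarize_if_flooded(array $lines, int $limit = 1000): ?string
{
    if (count($lines) <= $limit) {
        return null;
    }
    return implode("\n", array_slice($lines, 0, $limit));
}

// Simulated data standing in for $output_lines_from_cwb after exec() returns.
$simulated = array('(encode command placeholder)');
for ($i = 1; $i <= 5000; $i++) {
    $simulated[] = "simulated message $i";
}

$summary = summarize_if_flooded($simulated);
if ($summary !== null) {
    // In CQPweb the hack calls show_var() and exiterror() at this point.
    echo substr_count($summary, "\n") + 1, " lines kept for inspection\n";
}
```

Note that PHP's exec() only appends the command's output lines to the array once the command has finished, so in this sketch the check is applied to the array as it would look after the call.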
>
>
>
> If you then get a readable error message, that might give you a hint what
> the real problem is. If not, try again, moving the hack line further down
> the file, to just before each of the following lines in turn:
>
>
>
> before exec($makeall_command, $output_lines_from_cwb,
> $exit_status_from_cwb);
>
> before exec($compress_command, $compression_output, $exit_status_from_cwb);
>
> before the second instance of exec($makeall_command, $output_lines_from_cwb,
> $exit_status_from_cwb);
>
> before } /* end else (from if cwb index already exists) */
>
>
>
> Hopefully, as I say, doing this will get you a glimpse of the first 1,000
> lines of errors, which may tell you what the underlying problem is.
>
>
>
> Hope this helps
>
>
>
> best
>
>
>
> Andrew.
>
> --
Jorge Vivaldi Palatresi
Institut Universitari de Lingüística Aplicada
Universitat Pompeu Fabra
C/ Roc Boronat, 138
08018 Barcelona
Espanya

+34 93 542 2332
-------------- next part --------------
A non-text attachment was scrubbed...
Name: CQPwebscreenshot_1.jpg
Type: image/jpeg
Size: 267182 bytes
Desc: not available
URL: <http://liste.sslmit.unibo.it/pipermail/cwb/attachments/20170726/1d3feea4/attachment-0001.jpg>

