[CWB] CQPweb: error with subcorpus creation

José Manuel Martínez Martínez chozelinek at gmail.com
Thu Jun 14 16:30:18 CEST 2018


Hi Andrew,

thank you very much for your answer. I've allocated more RAM to PHP by
raising the memory limit, but I was still getting errors with bigger
subcorpora (46,564 's' structure units and 2,413,480 words). So I looked
into the log file while keeping an eye on the system with top.

Now the creation of the subcorpus works. What is failing is the
compilation of the frequency list.

It seems that my CQPweb has enough RAM, but it is failing because it hits
the maximum execution time. I've modified the PHP variable
max_execution_time: I started with 60 seconds, then 120, and it still fails
with 600.
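
For reference, here is a minimal sketch of how one could check which values
a php.ini file actually sets for the two directives discussed in this
thread. The helper name read_php_settings and the sample text (including
the 2048M value) are made up for illustration. It also assumes that the
last uncommented assignment is the effective one, and that you already know
which php.ini Apache loads (phpinfo() reports it).

```python
import re

# Hypothetical helper, not part of CQPweb or PHP.
# Matches uncommented assignments of the two directives discussed here.
SETTING_RE = re.compile(
    r"^\s*(memory_limit|max_execution_time)\s*=\s*([^;\s]+)"
)


def read_php_settings(ini_text):
    """Return the last uncommented value found for each directive."""
    settings = {}
    for line in ini_text.splitlines():
        match = SETTING_RE.match(line)
        if match:
            settings[match.group(1)] = match.group(2)
    return settings


# Illustrative sample only; these are not the values from my server.
sample = """\
; memory_limit = 128M
memory_limit = 2048M
max_execution_time = 600 ; seconds
"""

print(read_php_settings(sample))
```

Lines starting with ";" are comments in php.ini, so the first line of the
sample is ignored.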

This is the error in the log

[pid 1579] PHP Fatal error:  Maximum execution time of 600 seconds exceeded
in /var/www/html/cqpweb/lib/subcorpus.inc.php on line 4037
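
To tell a timeout apart from the out-of-memory error Andrew mentioned, the
log can be scanned for the two message patterns. This is only a sketch: the
function name classify_fatal is made up, and the patterns are based on the
message above and on the memory message quoted below, so they may need
adjusting for other PHP versions or log formats.

```python
import re

# Patterns based on the two PHP fatal error messages in this thread.
TIMEOUT_RE = re.compile(r"Maximum execution time of (\d+) seconds? exceeded")
MEMORY_RE = re.compile(r"Allowed memory size of (\d+) bytes exhausted")


def classify_fatal(line):
    """Return ("timeout", seconds), ("memory", bytes), or None."""
    match = TIMEOUT_RE.search(line)
    if match:
        return ("timeout", int(match.group(1)))
    match = MEMORY_RE.search(line)
    if match:
        return ("memory", int(match.group(1)))
    return None


# The log entry from this message, joined back into a single line.
line = (
    "[pid 1579] PHP Fatal error:  Maximum execution time of 600 seconds "
    "exceeded in /var/www/html/cqpweb/lib/subcorpus.inc.php on line 4037"
)

print(classify_fatal(line))
```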

This is some additional information on the PID 1579

ps -fp 1579

UID        PID  PPID  C STIME TTY          TIME CMD
www-data  1579  1431  9 12:03 ?        00:11:58 /usr/sbin/apache2 -k start

When I recreate the frequency lists for the whole corpus, it takes a fairly
long time, but it normally does not fail. Could there be something
different in the way the frequency list is compiled for a subcorpus,
compared with the creation of frequency lists for the whole corpus?

Cheers,


--
José Manuel Martínez Martínez
https://chozelinek.github.io

On Thu, May 31, 2018 at 12:44 PM, Hardie, Andrew <a.hardie at lancaster.ac.uk>
wrote:

> You’re probably running out of RAM. Wrangling subcorpora that use sub-text
> regions is very memory-intensive (I have some ideas in the works to make it
> less-so).  The way to check this is (a) look in php.ini to find out how
> much RAM each PHP process is allowed (the memory_limit setting) and (b)
> watch in “top” on your server as it runs, and note that it will probably
> time out when the CQPweb process hits that amount of allocated memory.
>
>
>
> (Your httpd error log may also contain a note of this error, something
> like “Allowed memory size of BIGNUMBER bytes exhausted (tried to allocate
> BIGNUMBER bytes) in php”. Any http 500 error should leave an error message
> in the log!)
>
>
>
> The fix is to let PHP use more RAM (at least for CQPweb processes). I
> would not worry about over-allocating RAM as long as you have an adequate
> swap disk on your server for virtual memory when needed!
>
>
>
> best
>
>
>
> Andrew.
>
>
>
> *From:* cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it] *On
> Behalf Of *José Manuel Martínez Martínez
> *Sent:* 30 May 2018 10:48
> *To:* Open source development of the Corpus WorkBench <cwb at sslmit.unibo.it>
> *Subject:* [CWB] CQPweb: error with subcorpus creation
>
>
>
> Dear all,
>
>
>
> I'm getting an internal server error when I try to create a subcorpus from
> a saved query.
>
>
>
> The saved query has 58000 hits. I try to define the new subcorpus via
> "partial-text regions found in a saved query". I select the saved query and
> I use as sub-text region the structural attribute 's' that in my case
> denotes sentences.
>
>
>
> After a few minutes I get an HTTP 500 ERROR.
>
>
>
> However, if I try it with the same query but on a smaller set of hits
> (9615) the process is successful (the size of the resulting subcorpus
> is 402,802 tokens and 7700 sentences). However, sometimes I get an error
> when I try to generate the frequency list. I tried with a saved query
> slightly bigger (11600 hits) and it fails too.
>
>
>
> Is there a way to know what's going wrong?
>
>
>
>
>
> Cheers,
>
> --
>
> José Manuel Martínez Martínez
>
> https://chozelinek.github.io
>
> _______________________________________________
> CWB mailing list
> CWB at sslmit.unibo.it
> http://liste.sslmit.unibo.it/mailman/listinfo/cwb
>
>