[CWB] CQPweb: error with subcorpus creation

Hardie, Andrew a.hardie at lancaster.ac.uk
Thu May 31 12:44:48 CEST 2018


You’re probably running out of RAM. Wrangling subcorpora that use sub-text regions is very memory-intensive (I have some ideas in the works to make it less so). To check this: (a) look in php.ini to find out how much RAM each PHP process is allowed (the memory_limit setting); (b) watch “top” on your server while the request runs. You will probably see it fail when the CQPweb process hits that amount of allocated memory.
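For step (a), a minimal Python sketch that pulls memory_limit out of php.ini text and converts PHP's shorthand sizes (K, M, G suffixes, with -1 meaning no limit) into bytes, so the figure can be compared with what “top” reports. The function names are illustrative assumptions and not part of CQPweb; where php.ini lives depends on the installation.

```python
import re

# PHP shorthand byte suffixes (case-insensitive): K, M, G are powers of 1024.
_MULTIPLIERS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def php_size_to_bytes(value):
    """Convert a PHP size such as '128M' to bytes; return None for -1 (no limit)."""
    value = value.strip()
    if value == "-1":
        return None
    match = re.fullmatch(r"(\d+)\s*([kmgKMG]?)", value)
    if match is None:
        raise ValueError(f"unrecognised PHP size: {value!r}")
    number, suffix = match.groups()
    return int(number) * _MULTIPLIERS.get(suffix.lower(), 1)


def find_memory_limit(ini_text):
    """Return the last active memory_limit value in php.ini text, or None if unset."""
    found = None
    for line in ini_text.splitlines():
        line = line.split(";", 1)[0].strip()  # drop ';' comments
        match = re.fullmatch(r"memory_limit\s*=\s*(\S+)", line)
        if match:
            found = match.group(1)
    return found
```

Reading the file and calling php_size_to_bytes(find_memory_limit(text)) gives the per-process ceiling in bytes. Note that a web server configuration can override the php.ini value, so treat the result as a starting point.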

(Your httpd error log may also contain a note of this error, something like “Allowed memory size of BIGNUMBER bytes exhausted (tried to allocate BIGNUMBER bytes) in php”. Any HTTP 500 error should leave an error message in the log!)
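A short Python sketch for finding that message in a log, assuming the usual wording of PHP's memory-exhaustion error. The function name is hypothetical, and the path to the httpd error log depends on the server setup.

```python
import re

# Wording of PHP's memory-exhaustion fatal error as it appears in server logs.
EXHAUSTED = re.compile(
    r"Allowed memory size of (\d+) bytes exhausted \(tried to allocate (\d+) bytes\)"
)


def find_memory_errors(log_lines):
    """Yield (limit_bytes, requested_bytes, line) for each matching log line."""
    for line in log_lines:
        match = EXHAUSTED.search(line)
        if match:
            yield int(match.group(1)), int(match.group(2)), line.rstrip("\n")
```

The function accepts any iterable of lines, so an open file object for the error log can be passed in directly. The first number in each result is the memory_limit that was in force when the process died.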

The fix is to let PHP use more RAM (at least for CQPweb processes). I would not worry about over-allocating RAM as long as you have an adequate swap disk on your server for virtual memory when needed!

best

Andrew.

From: cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it] On Behalf Of José Manuel Martínez Martínez
Sent: 30 May 2018 10:48
To: Open source development of the Corpus WorkBench <cwb at sslmit.unibo.it>
Subject: [CWB] CQPweb: error with subcorpus creation

Dear all,

I'm getting an internal server error when I try to create a subcorpus from a saved query.

The saved query has 58000 hits. I try to define the new subcorpus via "partial-text regions found in a saved query". I select the saved query and I use as sub-text region the structural attribute 's' that in my case denotes sentences.

After a few minutes I get an HTTP 500 ERROR.

However, if I try it with the same query but on a smaller set of hits (9615), the process is successful (the size of the resulting subcorpus is 402,802 tokens and 7700 sentences). Even then, I sometimes get an error when I try to generate the frequency list. I also tried with a slightly bigger saved query (11600 hits) and it fails too.

Is there a way to know what's going wrong?


Cheers,
--
José Manuel Martínez Martínez
https://chozelinek.github.io