[CWB] cqp and very large corpora

Nikola Tulechki nikola.tulechki at gmail.com
Wed Nov 14 11:25:58 CET 2012


I understand.
I just wanted to know if there were any available solutions.

As for better hardware, I am currently running cqp on pretty high-end
hardware (Xeon, 12 GB RAM) and, except for investing in SSDs to speed it
up some more, there is not much room for improvement.

But still, I am aware that querying 1.5G words in 3-4 minutes is already
pretty cool, and I thank you for making this tool.

regards
NT



On Sun, Nov 11, 2012 at 10:16 PM, Hardie, Andrew
<a.hardie at lancaster.ac.uk> wrote:

> Better hardware?
>
> I know this sounds glib, but re-engineering CWB to make it multithreaded
> or to use ancillary database indexes would be a huge undertaking. Throwing
> better hardware at the problem will almost certainly cost you less than the
> programmer time to rewrite large chunks of CWB from the ground up.
>
> best
>
> Andrew.
>
> From: cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it]
> On Behalf Of Nikola Tulechki
> Sent: 11 November 2012 07:58
> To: Open source development of the Corpus WorkBench
> Subject: [CWB] cqp and very large corpora
>
> Hello
>
> I am using cqp with the *WAC corpora (1.5G words) and, while not
> prohibitive, response times are still in the minutes range.
>
> Are there any ways to further speed up the tool?
>
> Multithreading? Indexes stored in RAM, in DB?
>
> Thanks
>
> NT
>
> _______________________________________________
> CWB mailing list
> CWB at sslmit.unibo.it
> http://devel.sslmit.unibo.it/mailman/listinfo/cwb