[CWB] How to remove the corpus data files from cache?

Stefan Evert stefan.evert at uos.de
Mon Jan 21 22:31:11 CET 2008


On 21 Jan 2008, at 17:51, Petrakis Stefanos wrote:

> Any idea how I can un-cache/remove the corpus data files from memory?
> I want to run some tests on a "cold" cache to check time performance
> to compare the timing differences on my server between the cqp  
> client and a simple perl script running the same queries.

There's no standard way of clearing the disk cache, but it may be  
possible using system-specific commands or special software.  I'm not  
enough of a Linux expert to help you on this point (you're using  
Linux, aren't you?), but perhaps someone else on the list has a good  
idea.
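
A minimal sketch of one Linux-specific option, assuming Python 3.3 or
later: posix_fadvise() with POSIX_FADV_DONTNEED asks the kernel to drop
its cached pages for a given file. The function name
evict_from_page_cache and the example path are hypothetical, and the
call is only a hint to the kernel, so pages that are dirty or mapped by
a running process may stay cached. As root, running sync and then
writing 3 to /proc/sys/vm/drop_caches should empty the whole page cache
instead (Linux 2.6.16 and later).

```python
import os


def evict_from_page_cache(root):
    """Ask the kernel to drop cached pages for every regular file under root.

    Returns the number of files for which the hint was issued.
    """
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue  # skip broken symlinks, sockets, etc.
            fd = os.open(path, os.O_RDONLY)
            try:
                # offset 0 and length 0 together mean "the whole file"
                os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
            finally:
                os.close(fd)
            count += 1
    return count


# Hypothetical usage; replace the path with the corpus data directory:
# evict_from_page_cache("/path/to/corpus/data")
```

This touches only the files under the given directory, so the rest of
the system's cache is left alone.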

>> How large are the corpora on which you've observed this behaviour?
>> There is absolutely no reason why CQP should take that long
>> on a 5-million-word corpus.
>>
> The size is about 100M.

At that size, cache warming may have a substantial effect.  For  
instance, on our BNCweb installation, the first queries can take a  
minute or longer, but once most of the data are in the cache, the  
results become available within seconds.
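
To see the warming effect in numbers, the same query can be timed
several times in a row. The following is a rough sketch, not part of
CWB: time_runs is a hypothetical helper, and run_query stands for any
callable that executes one query (for example by invoking the cqp
client or the Perl script). If the cache was cold before the first
call, the first timing approximates the cold-cache cost and the later
ones the warm-cache cost.

```python
import time


def time_runs(run_query, repeats=3):
    """Call run_query() `repeats` times; return wall-clock seconds per call."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return timings


# Hypothetical usage with a placeholder workload:
# print(time_runs(lambda: sum(range(1_000_000))))
```

Wall-clock time is used on purpose: time spent waiting for the disk
does not show up in CPU time.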

Best,
Stefan

