[CWB] cwb testing

Marco Baroni baroni at sslmit.unibo.it
Wed Jul 26 23:01:30 CEST 2006


Hi Lars!

Thanks for your reply and offer of help!

> I'll be happy to write a perl script that does the actual testing; I 
> guess each query should be run directly through cqp (through a shell 
> command) and through the perl modules (since there are different things 
> that might go wrong).

How about indexing (which is one of the most likely things to go wrong, 
in my past experience)? Should that also be run via a shell command?
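
To make the "run it via a shell command" idea concrete, here is a minimal 
sketch in Python rather than Perl. It assumes each test case is simply a 
command line plus the output we expect from it; the names run_case and 
run_suite are made up for illustration, and the commands in the demo are 
stand-ins, not real indexing or cqp invocations.

```python
import subprocess
import sys


def run_case(command, expected_stdout):
    """Run one command and check both its exit status and its stdout."""
    result = subprocess.run(command, capture_output=True, text=True)
    return {
        "command": command,
        "exit_ok": result.returncode == 0,
        "output_ok": result.stdout == expected_stdout,
    }


def run_suite(cases):
    """Run every (command, expected_stdout) pair; return only the failures."""
    reports = [run_case(command, expected) for command, expected in cases]
    return [r for r in reports if not (r["exit_ok"] and r["output_ok"])]


if __name__ == "__main__":
    # Stand-in commands (assumption): a real suite would list the indexing
    # and query command lines here, with their stored reference output.
    demo_cases = [
        ([sys.executable, "-c", "print('3 matches')"], "3 matches\n"),
    ]
    for failure in run_suite(demo_cases):
        print("FAILED:", failure["command"])
```

Checking the exit status separately from the output would let a crashed 
indexing step show up as a failure even when it prints nothing at all.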

> As for the testing corpus, I suggest that we use Dickens or German Law. 
> If we want to automatically generate larger corpora, we could just 
> duplicate the text in the smaller corpus. 

I thought about that, but it would give us a weird, non-Zipfian 
distribution unlike any real corpus, wouldn't it?
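
To illustrate the worry, here is a small Python sketch on a toy sentence 
(not Dickens or German Law; the text and the helper names are made up). 
Copying a text k times multiplies every word frequency by k, so the 
vocabulary stops growing and no word occurs exactly once any more, whereas 
a real corpus keeps gaining new, rare words as it gets larger.

```python
from collections import Counter


def frequency_spectrum(tokens):
    """Map each frequency f to the number of word types occurring f times."""
    type_counts = Counter(tokens)
    return Counter(type_counts.values())


def duplicate(tokens, copies):
    """Build a larger 'corpus' by concatenating copies of the same text."""
    return list(tokens) * copies


# Toy stand-in for a small corpus (assumption, for illustration only).
toy = "the cat sat on the mat and the dog sat too".split()
tripled = duplicate(toy, 3)

for name, corpus in (("original", toy), ("tripled", tripled)):
    spectrum = frequency_spectrum(corpus)
    print(name,
          "tokens:", len(corpus),
          "types:", len(set(corpus)),
          "hapaxes:", spectrum[1])
```

The token count triples while the type count stays the same and the number 
of hapaxes drops to zero, which is the non-Zipfian shape I mean.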

'Nyway, as you rightly say, the testing scripts and the test corpus/corpora 
are different issues and if you can develop the former that would be great!

Test queries: is it OK if I provide them to you in September? Realistically 
speaking, I don't think I can devote any serious thought to them before then...

Good night,

Marco

