[CWB] cwb testing
Marco Baroni
baroni at sslmit.unibo.it
Wed Jul 26 23:01:30 CEST 2006
Hi Lars!
Thanks for your reply and your offer of help!
> I'll be happy to write a perl script that does the actual testing; I
> guess each query should be run directly through cqp (through a shell
> command) and through the perl modules (since there are different things
> that might go wrong).
How about indexing (which is one of the most likely things to go wrong,
in my past experience)? Should that also be run via a shell command?
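
To make the idea concrete, here is a rough sketch (in Python rather than Perl, purely for illustration) of running the same input through two paths and comparing the results. The names run_step and compare_paths are hypothetical, and the two commands are placeholder one-liners, not real cqp or indexing invocations; the actual command lines would have to be filled in.

```python
import subprocess
import sys


def run_step(argv, stdin_text=""):
    """Run one command, feeding it stdin_text, and return (exit code, stdout)."""
    result = subprocess.run(argv, input=stdin_text, capture_output=True, text=True)
    return result.returncode, result.stdout


def compare_paths(direct_argv, module_argv, query):
    """Send the same query down both paths and report whether they agree."""
    direct = run_step(direct_argv, query)
    via_module = run_step(module_argv, query)
    return {"direct": direct, "module": via_module, "agree": direct == via_module}


# Placeholder commands (assumptions): stand-ins for "direct shell call" and
# "call through the modules". They only echo or upper-case their input.
echo = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"]
upper = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read().upper())"]

if __name__ == "__main__":
    print(compare_paths(echo, echo, "some query")["agree"])
    print(compare_paths(echo, upper, "some query")["agree"])
```

An indexing step could presumably be checked the same way, by treating it as one more command whose exit code and output are recorded before any query is run.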
> As for the testing corpus, I suggest that we use Dickens or German Law.
> If we want to automatically generate larger corpora, we could just
> duplicate the text in the smaller corpus.
I thought about that, but it would give us a weird, non-Zipfian
distribution unlike any real corpus, wouldn't it?
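
A quick toy illustration of what I mean (Python, with a made-up sentence standing in for the corpus; frequency_spectrum is just a hypothetical helper): copying a text k times multiplies every word frequency by k, so the vocabulary stops growing and the words that occur only once disappear, whereas in a real corpus of that size we would expect many new types and plenty of hapax legomena.

```python
from collections import Counter


def frequency_spectrum(tokens):
    """Return (vocabulary size, number of hapax legomena, token count)."""
    counts = Counter(tokens)
    hapaxes = sum(1 for c in counts.values() if c == 1)
    return len(counts), hapaxes, len(tokens)


# Toy stand-in for a small corpus (an assumption, not Dickens or German Law).
base = "the cat sat on the mat and the dog sat on the log".split()

# "Larger corpus" built by plain duplication, as in the suggestion above.
copies = 10
duplicated = base * copies

print(frequency_spectrum(base))        # (8, 5, 13)
print(frequency_spectrum(duplicated))  # (8, 0, 130)
```

Ten times the tokens, the same 8 types, and not a single hapax left.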
'Nyway, as you rightly say, the testing scripts and the test corpus/corpora
are different issues and if you can develop the former that would be great!
Test queries: is it OK if I provide them to you in September? Realistically
speaking, I don't think I can devote any serious thought to them before then...
Good night,
Marco