[CWB] cwb testing

Thu Jul 27 01:03:31 CEST 2006

Hi Lars, hi Marco!

>
> Thanks for your reply and offer for help!
>

I'd also like to say thanks very much for your offer to take care of  
the test suite implementation.

I attach the Perl source code of a CQP test suite that a student  
developed at the IMS several years ago.  Although we weren't  
completely happy with it (that's why we didn't build a more  
comprehensive testing environment around this test suite), it might  
be a useful starting point - or it might at least give you some ideas  
on how to tackle the problem.

The test suite also includes a number of test queries and expected  
output (if you have the right corpus ...). I'm not sure whether it's  
in a working state or whether you first have to fix incompatibilities  
with current Perl versions and install a number of add-on modules ...

>> I'll be happy to write a Perl script that does the actual testing;  
>> I guess each query should be run directly through cqp (through a  
>> shell command) and through the Perl modules (since there are  
>> different things that might go wrong).
>

I think that in most cases we could just run CQP through the Perl  
modules.  We should discover errors regardless of whether they're  
caused by CQP or by the Perl interface, and I don't think it's  
essential for the test suite to make that distinction.  On the other  
hand, an option to run the entire test suite with shell commands  
instead of the Perl interface would definitely be neat, if that is  
feasible.
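
To make the idea of a switchable backend concrete, here is a rough sketch (in Python rather than Perl, purely for illustration). The runner only sees a function that maps a query to result lines, so the same test cases could be run through the Perl modules or through shell commands. All names are hypothetical, and the two backends are toy stand-ins that search a small token list; neither of them calls CQP.

```python
from typing import Callable, List, Tuple

# A backend maps a query string to a list of result lines.
Backend = Callable[[str], List[str]]

# Toy "corpus" used by the stand-in backends below (assumption, not real data).
TOKENS = ["the", "cat", "saw", "the", "dog"]


def good_backend(query: str) -> List[str]:
    """Stand-in for one interface: positions of tokens equal to the query."""
    return [str(pos) for pos, tok in enumerate(TOKENS) if tok == query]


def buggy_backend(query: str) -> List[str]:
    """Stand-in for a second interface with a deliberate off-by-one bug."""
    return [str(pos + 1) for pos, tok in enumerate(TOKENS) if tok == query]


def run_suite(tests: List[Tuple[str, List[str]]], backend: Backend) -> List[str]:
    """Run every (query, expected) pair; return the queries that failed."""
    failures = []
    for query, expected in tests:
        try:
            actual = backend(query)
        except Exception:
            # A crash in the backend counts as a failed test, not a failed run.
            failures.append(query)
            continue
        if actual != expected:
            failures.append(query)
    return failures


TESTS = [("the", ["0", "3"]), ("dog", ["4"]), ("fish", [])]
```

Running `run_suite(TESTS, good_backend)` and `run_suite(TESTS, buggy_backend)` side by side would show whether a failure is specific to one interface, without the test cases themselves having to know about it.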

> How about indexing (which is one of the most likely things to go  
> wrong, in my past experience)? Should that also be run via a shell  
> command?
>

I suppose this would be an entirely different test suite, which needs  
much less flexibility in terms of the features that it can test.   
Basically, all you have to do is encode an enclosed or automatically  
generated test corpus, then decode everything to compare with the  
original text, and decode the complete index to ensure that it's  
correct.
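
The encode / decode / compare cycle described above might look roughly like the following sketch. It is written in Python for illustration only, and `toy_encode`, `toy_decode` and `toy_index` are hypothetical in-memory stand-ins for the real encoding and indexing tools, which a real test script would have to call instead.

```python
from typing import Dict, List, Sequence, Tuple


def toy_encode(tokens: Sequence[str]) -> Tuple[List[str], List[int]]:
    """Stand-in encoder: build a lexicon and a stream of lexicon IDs."""
    lexicon: List[str] = []
    ids: Dict[str, int] = {}
    stream: List[int] = []
    for tok in tokens:
        if tok not in ids:
            ids[tok] = len(lexicon)
            lexicon.append(tok)
        stream.append(ids[tok])
    return lexicon, stream


def toy_decode(lexicon: List[str], stream: List[int]) -> List[str]:
    """Stand-in decoder: map the ID stream back to word forms."""
    return [lexicon[i] for i in stream]


def toy_index(stream: List[int]) -> Dict[int, List[int]]:
    """Stand-in index: lexicon ID -> list of corpus positions."""
    index: Dict[int, List[int]] = {}
    for pos, wid in enumerate(stream):
        index.setdefault(wid, []).append(pos)
    return index


def check_round_trip(tokens: Sequence[str]) -> List[str]:
    """Return a list of problems; an empty list means the test passed."""
    problems = []
    lexicon, stream = toy_encode(tokens)
    # Step 1: decoded text must equal the original input.
    if toy_decode(lexicon, stream) != list(tokens):
        problems.append("decoded text differs from original")
    # Step 2: decode the complete index and rebuild the text from it.
    rebuilt = [None] * len(tokens)
    for wid, positions in toy_index(stream).items():
        for pos in positions:
            rebuilt[pos] = lexicon[wid]
    if rebuilt != list(tokens):
        problems.append("index does not reproduce original text")
    return problems
```

For example, `check_round_trip("a b a c".split())` returns an empty list, since both the decoded text and the text rebuilt from the index match the input.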

Would it make sense to implement this part of the test suite first -  
because it's easier and a more delimited task - and then extend the  
script to handle CQP tests as well?

>> As for the testing corpus, I suggest that we use Dickens or German  
>> Law. If we want to automatically generate larger corpora, we could  
>> just duplicate the text in the smaller corpus.
>
> I thought about that, but it would give us a weird, non-Zipfian  
> distribution unlike any real corpus, wouldn't it?
>

I'm not too happy with the two demo corpora, but they're a natural  
choice for testing (so we don't have to ship a large corpus together  
with the test suite).  At least, we should update the demo corpora  
with improved linguistic annotations + some cleanup, which we could  
release as version 1.0 before the full test suite becomes public. As  
long as there is no hand-coded gold standard for the test suite,  
changing the corpus used for evaluation shouldn't be a problem.

One alternative would be to use the Brown corpus, which is very clean  
and can always be reconstructed from the original sources. What do  
you think about that? We'll still need the demo corpora to test  
advanced features like recursive XML attributes (<np>, <np1>, ...),  
feature set attributes, etc.

My idea of an automatically generated corpus was that it should be  
created algorithmically in such a way that the precise results of  
(certain) queries can be predicted.  This wouldn't be the case if we  
just replicate DICKENS or GERMAN-LAW, and it also wouldn't work for  
Marco's suggestion of generating word forms randomly.  An algorithmic  
corpus will be particularly useful for testing the encoding and  
indexing tools, because we could then test decoding and the indices  
without having to store the automatically generated input text for  
the corpus on disk.
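
As a minimal illustration of such an algorithmic corpus (a Python sketch, with a made-up generation scheme and hypothetical function names): the generator below emits word forms `w1`, `w2`, ... so that `w<k>` occurs exactly k times. The frequency of any word form and the corpus size then follow from a formula, and the text can be regenerated on the fly for comparison instead of being stored.

```python
from typing import Iterator


def generate_corpus(n_types: int) -> Iterator[str]:
    """Yield tokens such that 'w<k>' occurs exactly k times, k = 1..n_types.

    In round r (1-based) every word w<k> with k >= r is emitted once,
    so w<k> appears in rounds 1..k and nowhere else.
    """
    for r in range(1, n_types + 1):
        for k in range(r, n_types + 1):
            yield f"w{k}"


def expected_frequency(k: int, n_types: int) -> int:
    """Predicted number of matches for a query on the word form 'w<k>'."""
    return k if 1 <= k <= n_types else 0


def expected_size(n_types: int) -> int:
    """Predicted total number of tokens: 1 + 2 + ... + n_types."""
    return n_types * (n_types + 1) // 2
```

A test could then compare the frequency reported for `w<k>` against `expected_frequency(k, n_types)`, and compare decoded output token by token against a fresh run of `generate_corpus(n_types)`. Note that this particular scheme gives a linear rather than Zipfian frequency distribution; a different formula could be substituted as long as it stays predictable.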

> 'Nyway, as you rightly say, the testing scripts and the test corpus/ 
> corpora are different issues and if you can develop the former that  
> would be great!

Yes, that would definitely be great!

Please have a look at the enclosed CQP test suite and let us know  
what you think about it, especially whether we can reuse some code or  
ideas.

Best wishes,
Stefan.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: CQP-Test.tgz
Type: application/octet-stream
Size: 26419 bytes
Desc: not available
Url : http://devel.sslmit.unibo.it/pipermail/cwb/attachments/20060727/15f41190/CQP-Test-0001.obj