[CWB] cwb testing
Stefan Evert
stefan.evert at osnanet.de
Thu Jul 27 01:03:31 CEST 2006
Hi Lars, hi Marco!
>
> Thanks for your reply and offer for help!
>
I'd also like to say thanks very much for your offer to take care of
the test suite implementation.
I attach the Perl source code of a CQP test suite that a student
developed at the IMS several years ago. Although we weren't
completely happy with it (that's why we didn't build a more
comprehensive testing environment around this test suite), it might
be a useful starting point - or it might at least give you some ideas
on how to tackle the problem.
The test suite also includes a number of test queries and expected
output (if you have the right corpus ...). I'm not sure whether it's
in a working state or whether you first have to fix incompatibilities
with current Perl versions and install a number of add-on modules ...
>> I'll be happy to write a perl script that does the actual testing;
>> I guess each query should be run directly through cqp (through a
>> shell command) and through the perl modules (since there are
>> different things that might go wrong).
>
I think that in most cases we could just run CQP through the Perl
modules. We should discover errors regardless of whether they're
caused by CQP or by the Perl interface, and I don't think it's
essential for the test suite to make that distinction. On the other
hand, an option to run the entire test suite with shell commands
instead of the Perl interface would definitely be neat, if that is
feasible.
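To make the idea of one suite with interchangeable back ends concrete, here is a rough sketch. It is written in Python purely for illustration, not in the Perl of the attached suite, and every name in it is hypothetical: `via_modules` and `via_shell` are canned stand-ins for "CQP through the Perl interface" and "CQP through shell commands", and the queries and counts are invented.

```python
def run_suite(cases, backend):
    """Run every (query, expected) pair through one backend; return failures."""
    failures = []
    for query, expected in cases:
        try:
            actual = backend(query)
        except Exception as exc:  # a crashing backend counts as a failed test
            failures.append((query, f"error: {exc}"))
            continue
        if actual != expected:
            failures.append((query, f"expected {expected!r}, got {actual!r}"))
    return failures


# Invented test cases: query string -> expected number of matches.
CASES = [('"dog";', 2), ('"cat";', 1)]


def via_modules(query):
    """Hypothetical stand-in for running a query through the Perl interface."""
    return {'"dog";': 2, '"cat";': 1}[query]


def via_shell(query):
    """Hypothetical stand-in for the shell route, with one simulated bug."""
    return {'"dog";': 2, '"cat";': 0}[query]


if __name__ == "__main__":
    for name, backend in [("modules", via_modules), ("shell", via_shell)]:
        print(name, run_suite(CASES, backend))
```

The same test cases are used for both routes, so the suite itself never has to decide where an error comes from; running it twice with different back ends only helps to narrow a failure down afterwards.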
> How about indexing (which is one of the most likely things to go
> wrong, in my past experience)? Should that also be run via a shell
> command?
>
I suppose this would be an entirely different test suite, which needs
much less flexibility in terms of the features that it can test.
Basically, all you have to do is encode an enclosed or automatically
generated test corpus, then decode everything to compare with the
original text, and decode the complete index to ensure that it's
correct.
Would it make sense to implement this part of the test suite first -
because it's easier and a more delimited task - and then extend the
script to handle CQP tests as well?
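A minimal sketch of the round-trip check described above, in Python for illustration only. The functions `toy_encode` and `toy_decode` are hypothetical stand-ins for the real encoding and decoding tools, which this sketch does not call; it only shows the shape of the comparison (decoded text against the original, and the index against a naive scan).

```python
from itertools import zip_longest


def toy_encode(tokens):
    """Stand-in encoder: build a lexicon (token -> ID) and an ID stream."""
    lexicon, ids = {}, []
    for tok in tokens:
        ids.append(lexicon.setdefault(tok, len(lexicon)))
    return lexicon, ids


def toy_decode(lexicon, ids):
    """Stand-in decoder: turn the ID stream back into tokens."""
    by_id = {i: tok for tok, i in lexicon.items()}
    return [by_id[i] for i in ids]


def build_index(ids):
    """Stand-in index: for each ID, the ascending list of corpus positions."""
    index = {}
    for pos, i in enumerate(ids):
        index.setdefault(i, []).append(pos)
    return index


def first_mismatch(original, decoded):
    """Return the first position where the two token lists differ, or None."""
    for pos, (a, b) in enumerate(zip_longest(original, decoded)):
        if a != b:
            return pos
    return None


def index_is_correct(tokens, lexicon, index):
    """Check every index entry against a naive scan of the original text."""
    return all(
        index.get(i, []) == [p for p, t in enumerate(tokens) if t == tok]
        for tok, i in lexicon.items()
    )
```

A length difference is also reported as a mismatch, so a truncated decode does not pass silently.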
>> As for the testing corpus, I suggest that we use Dickens or German
>> Law. If we want to automatically generate larger corpora, we could
>> just duplicate the text in the smaller corpus.
>
> I thought about that, but it would give us a weird, non-Zipfian
> distribution unlike any real corpus, wouldn't it?
>
I'm not too happy with the two demo corpora, but they're a natural
choice for testing (so we don't have to ship a large corpus together
with the test suite). At least, we should update the demo corpora
with improved linguistic annotations + some cleanup, which we could
release as version 1.0 before the full test suite becomes public. As
long as there is no hand-coded gold standard for the test suite,
changing the corpus used for evaluation shouldn't be a problem.
One alternative would be to use the Brown corpus, which is very clean
and can always be reconstructed from the original sources. What do
you think about that? We'll still need the demo corpora to test
advanced features like recursive XML attributes (<np>, <np1>, ...),
feature set attributes, etc.
My idea of an automatically generated corpus was that it should be
created algorithmically in such a way that the precise results of
(certain) queries can be predicted. This wouldn't be the case if we
just replicate DICKENS or GERMAN-LAW, and it also wouldn't work for
Marco's suggestion of generating word forms randomly. An algorithmic
corpus will be particularly useful for testing the encoding and
indexing tools, because we could then test decoding and the indices
without having to store the automatically generated input text for
the corpus on disk.
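One very simple way such an algorithmic corpus could look, sketched in Python for illustration. The scheme is an assumption of this sketch, not a proposal from the thread: the token at position `pos` is the artificial word form `w<pos mod n_types>`, so the frequency and positions of every word form follow from arithmetic and the text can be regenerated on demand instead of being stored. The resulting distribution is uniform, not Zipfian.

```python
def generate_corpus(n_tokens, n_types):
    """Yield the corpus token by token; nothing is stored on disk."""
    for pos in range(n_tokens):
        yield f"w{pos % n_types}"


def expected_positions(k, n_tokens, n_types):
    """Corpus positions where word form w<k> must occur."""
    if k >= n_types:
        return []
    return list(range(k, n_tokens, n_types))


def expected_frequency(k, n_tokens, n_types):
    """Predicted number of occurrences of word form w<k>."""
    return len(expected_positions(k, n_tokens, n_types))


if __name__ == "__main__":
    n_tokens, n_types = 10, 3
    print(" ".join(generate_corpus(n_tokens, n_types)))
    for k in range(n_types):
        print(f"w{k}", expected_frequency(k, n_tokens, n_types))
```

A test would then compare the result of a single-word query for `w<k>` against `expected_positions`, and the decoded text against a fresh run of `generate_corpus` with the same parameters.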
> 'Nyway, as you rightly say, the testing scripts and the test corpus/
> corpora are different issues and if you can develop the former that
> would be great!
Yes, that would definitely be great!
Please have a look at the enclosed CQP test suite and let us know
what you think about it, especially whether we can reuse some code or
ideas.
Best wishes,
Stefan.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: CQP-Test.tgz
Type: application/octet-stream
Size: 26419 bytes
Desc: not available
Url : http://devel.sslmit.unibo.it/pipermail/cwb/attachments/20060727/15f41190/CQP-Test-0001.obj