[CWB] CWB::CQP - adding easy of use

Stefan Evert stefanML at collocations.de
Mon Apr 5 13:21:47 CEST 2010


Dear Alberto,

thanks for your enthusiasm, and for sending us a first version of your  
CWB::CQP::More module!

I'm going the opposite way with the basic CWB/Perl interface, i.e. I'm  
trying to streamline the interface using as few general-purpose  
methods as possible (easier to maintain, test, optimise, and also to  
learn -- if you know the CQP commands and output well enough, that  
is), and I'm trying to avoid dependencies on other Perl modules.  The  
latter is also the reason why I've broken CWB/Perl into several  
completely separate packages: the base CWB package has minimal  
dependencies and provides basic functionality that you can rely on  
practically everywhere the CWB runs; CWB-CL requires a working C  
compiler; future versions of CWB-Web will have many external  
dependencies such as DBD::SQLite; and CWB-CQI can be installed  
anywhere even without a working CWB installation (if you just want to  
use it as a client that connects to a remote server).

I believe that convenience functions like those suggested by you (and  
I can think of many other examples) should be packaged in a separate  
distribution.  So as not to complicate downloads, how about putting  
everything in a common package called CWB-Contrib?  I'll be happy to  
set up such a package on cwb.sf.net, so everyone can add their  
favourite convenience methods (including tests!) and send me the diffs  
-- regular contributors can also have write access to the SVN if they  
want.

The CWB-Contrib package would not be restricted in the range of  
methods provided, and is allowed to depend on all sorts of other Perl  
modules.  The understanding is that the basic CWB/Perl interface  
should be relatively easy to install, but if you want the bells and  
whistles, you may have to put in some extra work.  That said, we  
should still make an effort to avoid unnecessary dependencies (see  
specific comments on your code below).

The cleanest solution would be to keep all the extra modules in a  
CWB::Contrib namespace, e.g. CWB::Contrib::CQPplus or so for Alberto's  
module.  I might be persuaded to allow more natural names like  
CWB::CQP::More, though ...

> Enough of talk. Code!

Righto! :-)

> 1) while exec allows to do anything, shortcuts will be great:
>     @corpora = $cqp->show_corpora()  # or list_corpora
>     $cqp->set(LeftKWICDelim => '<b>', RightKWICDelim => '</b>');

Neat idea.  Actually, this method could be made aware of available  
options (and perhaps even their accepted values), so it could produce  
clearer error messages if misused.  Or at least include a list of  
options with short descriptions in the documentation of the set()  
method.
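
A minimal sketch of what such an option-aware set() might look like.  
Everything here is an assumption for illustration: the helper name  
build_set_commands, the two-entry option table (taken from the example  
above) and the naive quoting of values.  It only builds the command  
strings; a real method would hand each one to $cqp->exec.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical option table: just the two options from the example above.
# A real module would list all CQP options with short descriptions.
my %KNOWN_OPTIONS = (
    LeftKWICDelim  => 'string inserted before the match in KWIC lines',
    RightKWICDelim => 'string inserted after the match in KWIC lines',
);

# Turn (Option => value, ...) pairs into CQP "set" command strings,
# rejecting unknown option names with a message that lists the known ones.
sub build_set_commands {
    my @pairs = @_;
    croak "set() expects Option => value pairs" if @pairs % 2;
    my @commands;
    while (my ($name, $value) = splice(@pairs, 0, 2)) {
        croak "unknown CQP option '$name' (known: "
            . join(', ', sort keys %KNOWN_OPTIONS) . ")"
            unless exists $KNOWN_OPTIONS{$name};
        # Naive quoting (assumption): refuse values we cannot quote safely.
        croak "value for '$name' must not contain a double quote"
            if $value =~ /"/;
        push @commands, qq{set $name "$value"};
    }
    return @commands;
}

# Usage: show the commands that would be sent to CQP.
print "$_\n"
    for build_set_commands(LeftKWICDelim => '<b>', RightKWICDelim => '</b>');
```

The option table doubles as documentation: the same hash could be used  
to generate the list of options in the POD of the set() method.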

> 2) while the $cqp->ok method is enough, with the Try::Tiny module
> things could get a lot cleaner.

Agreed (I hadn't seen this module before), but I think this makes more  
sense at an "end-user" level, i.e. for people writing Perl scripts  
that use the CWB modules.  In a module, the extra work to write

   $cqp->exec(...);
   if (not $cqp->ok) {
     ...
   }

or using the standard eval mechanism instead of

   try {
     $cqp->exec();
   } catch {
     ...
   };

isn't too bad -- after all, this is "write once, use often" code.  In  
this way, you could avoid the dependency of your CWB::CQP::More on  
Try::Tiny; but you might still want to recommend that users install  
Try::Tiny and use it in their own scripts (e.g. in the documentation  
of your module, or on the CWB wiki).
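
For illustration, the eval variant inside a module could be sketched as  
follows.  My::FakeCQP is a hypothetical stand-in that only mimics the  
exec, ok and error_message methods mentioned in this thread, so that  
the example runs without a CWB installation; safe_exec is likewise an  
invented helper name, not part of CWB::CQP.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a CQP object (NOT the real CWB::CQP class).
# Any command containing "BOGUS" is treated as a failed command.
package My::FakeCQP {
    sub new {
        my $class = shift;
        return bless({ ok => 1, msg => '' }, $class);
    }
    sub exec {
        my ($self, $cmd) = @_;
        $self->{ok}  = ($cmd =~ /BOGUS/) ? 0 : 1;
        $self->{msg} = $self->{ok} ? '' : "CQP error in '$cmd'";
        return $self->{ok} ? ("result of $cmd") : ();
    }
    sub ok            { return $_[0]{ok} }
    sub error_message { return $_[0]{msg} }
}

# Run a command with plain eval instead of Try::Tiny.
# Returns (1, @output_lines) on success and (0, $error_text) on failure.
sub safe_exec {
    my ($cqp, $cmd) = @_;
    my @lines;
    my $success = eval {
        @lines = $cqp->exec($cmd);
        die $cqp->error_message . "\n" unless $cqp->ok;
        1;    # make sure a successful eval returns a true value
    };
    return $success ? (1, @lines) : (0, $@ || "unknown error\n");
}

# Usage: one command that succeeds and one that fails.
my $demo = My::FakeCQP->new;
for my $cmd ('show corpora', 'BOGUS command') {
    my ($ok, @rest) = safe_exec($demo, $cmd);
    print $ok ? "ok: @rest\n" : "failed: $rest[0]";
}
```

The "eval { ...; 1 }" idiom checks the return value of eval rather than  
$@ alone, which is the usual way to get reliable error detection  
without an extra module.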

By the way, your suggested wrapper

>      sub exec {
>          my ($self, @args) = @_;
>          my $answer = $self->old_exec(@args);
>          die ($self->error_message) unless $self->ok;
>          return $answer;
>      }

is quite unnecessary.  The standard way of achieving this behaviour is

   $cqp->set_error_handler('die');

That's the setting I typically use in my one-off Perl scripts;  
together with CWB::CL::Strict it means that I don't have to do any  
error checking, as the script will simply abort if anything unexpected  
happens.

All the best,
Stefan


