[CWB] Question about CWB::CQP::More

Stefan Evert stefanML at collocations.de
Sun Dec 12 18:26:15 CET 2010


Hi Scott & Alberto!

In case this hasn't been answered yet -- I'm travelling and haven't been able to check e-mail for a day or two -- here's an explanation of Scott's mistake.  Don't worry, the mistake was very hard to spot and the question had me stumped at first.

Scott, you've mixed up the name of a query result (which is something you "cat" in CQP) and the name of a p-attribute (which you use in query expressions). This may in part be due to the fact that the Perl variable for the latter is somewhat counterintuitively named $query_type.

If you ran _all_ the commands the CWB::CQP::More interface executes in an interactive CQP session, you would notice that the whole thing misbehaves in the same way there.

> my $query_type = "lemma";
> my $query_item = "importante";
> my $query_to_send_to_cqp = "\[$query_type = \"$query_item\"\];";

This is correct, but you don't need to escape the square brackets in double-quoted strings (no harm done, though, as you checked yourself).

BTW, this is going to blow up in your face if $query_item ever contains double quotes or a single trailing backslash.  It's safer to quote the string properly through CWB::CQP:

  my $query_to_send_to_cqp = "[$query_type = ".$cqp->quote($query_item)."]";
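To see why the naive interpolation is fragile, here is a minimal sketch that needs no CQP connection at all; it only builds the query strings. The helper name naive_query and the sample input are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: builds the query the way the original script does,
# by interpolating the item directly between double quotes.
sub naive_query {
    my ($attribute, $item) = @_;
    return "[$attribute = \"$item\"]";
}

# Harmless input: exactly one opening and one closing quote.
my $ok = naive_query("lemma", "importante");

# Input that itself contains double quotes: the string literal now ends
# early, so CQP would not see the query you intended.
my $broken = naive_query("lemma", 'say "hi"');

# Count the double-quote characters in each query string.
my $ok_quotes     = () = $ok     =~ /"/g;
my $broken_quotes = () = $broken =~ /"/g;

print "$ok\n";
print "$broken\n";
print "double quotes: $ok_quotes vs. $broken_quotes\n";
```

The second query carries four quote characters instead of two, which is exactly the situation $cqp->quote() is meant to protect you from.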

> $cqp->exec($query_to_send_to_cqp);

This executes the query, e.g. [lemma = "importante"];.  In interactive CQP, it automatically displays the results; in CWB::CQP, it silently assigns the results to the named query "Last".

> my $result_size = $cqp->size($query_type);
> my @lines = $cqp->cat($query_type);

These lines correspond to the interactive CQP commands

  size lemma;
  cat lemma;

i.e. you're trying to size and cat a corpus attribute!  So no wonder this fails. Actually, CQP should complain with a big error message like

> CQP Error:
> 	Corpus ``lemma'' is undefined

Either Alberto's wrapper is suppressing these error messages or you've run into a bug in the error handler of recent CWB::CQP releases, which I only fixed a short time ago.

So, the correct sequence of commands is

  my $query_name = "Q1";
  my $query_to_send_to_cqp = "$query_name = [$query_type = ".$cqp->quote($query_item)."]";
  $cqp->exec($query_to_send_to_cqp);
  my $result_size = $cqp->size($query_name);
  my @lines = $cqp->cat($query_name);
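
If it helps to see what ends up on the CQP side, the following sketch spells out the interactive commands that such a sequence corresponds to. It only builds strings and does not talk to CQP; the helper name cqp_commands_for is hypothetical, and the item is assumed to be quoted already.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the interactive CQP commands that correspond
# to running a named query and then inspecting its result.
# $item_quoted must already be a properly quoted CQP string.
sub cqp_commands_for {
    my ($query_name, $attribute, $item_quoted) = @_;
    return (
        "$query_name = [$attribute = $item_quoted];",  # run query, store result
        "size $query_name;",                           # number of matches
        "cat $query_name;",                            # print the matches
    );
}

my @commands = cqp_commands_for("Q1", "lemma", '"importante"');
print "$_\n" for @commands;
```

Note that the query name (Q1) appears in size and cat, while the attribute name (lemma) appears only inside the square brackets.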

> But this is what happens when I do the searches through the CWB::CQP::More module:
> 
> word = "importante";    =>    43 matches
> word = "importantes";    =>    18 matches
> lemma = "importante";    =>    43 matches

In case you haven't guessed already from the explanation above: these invalid queries are misinterpreted by CQP, which executes -- in the last case -- the query

  "importante";  -- i.e. [word = "importante"]

and assigns it to the named result "lemma".  So your following commands _seem_ to work because you've created this spurious named query result.

CQP shouldn't really allow lowercase query names in the first place, of course, but here we haven't put in strict checks yet.
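
Until such checks exist, you could guard against this kind of mix-up on the Perl side. Here is a minimal sketch; the helper name looks_like_query_name is hypothetical, and the pattern (an uppercase letter followed by letters, digits or underscores) is my assumption for illustration, not a rule that CQP enforces.

```perl
use strict;
use warnings;

# Hypothetical check: accept only names that start with an uppercase letter.
# The exact pattern is an assumption, not CQP's own rule.
sub looks_like_query_name {
    my ($name) = @_;
    return $name =~ /\A[A-Z][A-Za-z0-9_]*\z/ ? 1 : 0;
}

# Q1 and Last pass; attribute names such as lemma and word are rejected.
for my $name (qw(Q1 Last lemma word)) {
    printf "%-6s %s\n", $name, looks_like_query_name($name) ? "ok" : "rejected";
}
```

Calling such a check before size or cat would have flagged "lemma" immediately.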


Cheers,
Stefan

