[CWB] Re: Using labels from within Perl?

Stefan Evert stefanML at collocations.de
Sun Apr 3 22:25:55 CEST 2011


> When reading the "new" version of the CQP manual, I found there are labels, but as far as I can understand they are not accessible from within the result set?

Correct.  Labels can be used internally within the query, but are not stored as part of the query result.  

In your example, of course, the tokens of interest are the first and last tokens of the match, so you could simply use the match and matchend anchors to access them (but this, admittedly, is a special case).

> [POS="V"].lemma [POS="N"].genre

If you want lists of lemmas / genre attributes, your dump/undump approach seems rather inefficient.  Why not just

	group Last match lemma;
	count Last by lemma on match;

and for s-attributes

	group Last matchend genre;

?  If you want more control, you can just tabulate the relevant information

	tabulate Last match lemma, matchend genre > "/tmp/metadata_table.txt";

and read the resulting table into a relational database.
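
As a rough illustration of that last step, here is a minimal Python sketch that loads such a tab-separated table into SQLite and counts matches per genre. It assumes the file has exactly two columns per line (lemma, then genre), as the tabulate command above would produce; the table name `hits` and the function names are made up for this example.

```python
import csv
import sqlite3


def load_table(lines, conn):
    """Insert (lemma, genre) rows from tab-separated lines into a 'hits' table."""
    conn.execute("CREATE TABLE IF NOT EXISTS hits (lemma TEXT, genre TEXT)")
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    # Skip blank or malformed lines that do not have exactly two columns.
    rows = [tuple(row) for row in reader if len(row) == 2]
    conn.executemany("INSERT INTO hits VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


def genre_counts(conn):
    """Return (genre, frequency) pairs, most frequent first."""
    return conn.execute(
        "SELECT genre, COUNT(*) FROM hits "
        "GROUP BY genre ORDER BY COUNT(*) DESC, genre"
    ).fetchall()


if __name__ == "__main__":
    # Hypothetical stand-in for the lines of /tmp/metadata_table.txt.
    sample = ["go\tnews\n", "see\tfiction\n", "go\tnews\n"]
    db = sqlite3.connect(":memory:")
    load_table(sample, db)
    print(genre_counts(db))
```

To use the real file, pass an open file handle instead of the sample list, e.g. `load_table(open("/tmp/metadata_table.txt", encoding="utf-8"), db)`; the encoding is an assumption and should match your corpus.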


In addition, you can mark a single query token as the "target" by prefixing it with @.  This allows you to extract information for a single internal position.  Consider this query:

> [POS="V"].lemma ... @[POS="N"].genre ... ;

You can now get the genre distribution with

	group Last target genre;

even if this token isn't at the end of the query. If you don't mind the inefficiency, you can just re-run the query multiple times and apply the @ marker to each "annotated" token in turn.
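
The re-run idea can be sketched as a small Python helper that only generates the CQP command text, one query variant per annotated token, each followed by the matching group command. The function name and the (pattern, attribute) input format are invented for this sketch; sending the commands to CQP (e.g. from a Perl or shell wrapper) is left out.

```python
def target_variants(tokens):
    """Build one (query, group command) pair per token.

    tokens is a list of (pattern, attribute) pairs, e.g.
    [('[POS="V"]', "lemma"), ('[POS="N"]', "genre")].
    In each variant, exactly one pattern carries the @ target marker.
    """
    scripts = []
    for i, (_, attribute) in enumerate(tokens):
        parts = [
            ("@" if j == i else "") + pattern
            for j, (pattern, _) in enumerate(tokens)
        ]
        query = " ".join(parts) + ";"
        scripts.append((query, f"group Last target {attribute};"))
    return scripts


if __name__ == "__main__":
    for query, group_cmd in target_variants(
        [('[POS="V"]', "lemma"), ('[POS="N"]', "genre")]
    ):
        print(query)
        print(group_cmd)
```

Each pair has to be executed in order, since the group command operates on Last, i.e. the result of the query that was run immediately before it.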

Best,
Stefan

