[CWB] A question on CQP attribute sets

Stefan Evert stefanML at collocations.de
Tue Jul 10 19:34:17 CEST 2012


On 10 Jul 2012, at 15:50, Ruprecht von Waldenfels wrote:

> Therefore, I feel one must go for a rather complex annotation, which can then be queried by using a regular expression. The machinery is
> 
> FORM ANNOTATION
> dam  1:SG:PF-dat::GEN:PL-dama-
> 
> and then you can query for, say, Genitive by searching for 
> 
> [annot=".*:GEN:.*"]
> 
> for dama 'lady'
> 
> [annot=".*-dama-.*"]
> 
> for the combination (genitives of dama)
> 
> [annot=".*GEN[^-]*-dama-.*"]

Or, if you want to use the standard feature sets, format your annotation like this:

dam		|:1:SG:PF:-dat-|:GEN:PL:-dama-|
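If the annotation column is generated by a script, a value in this format could be assembled roughly as follows. This is only a sketch in Python: feature_set is a hypothetical helper (not part of CWB), and it assumes each analysis is available as a list of tags plus a lemma. Sorting and de-duplicating the alternatives is a precaution; as far as I know, cwb-encode normalises the set itself when the attribute is declared as a feature set (e.g. -P annot/).

```python
def feature_set(analyses):
    """Join (tags, lemma) pairs into a |-delimited feature-set value.

    Hypothetical helper for illustration; not part of CWB.
    """
    elements = set()
    for tags, lemma in analyses:
        # Tags are delimited by colons and the lemma by hyphens,
        # mirroring the example value :GEN:PL:-dama-
        elements.add(":" + ":".join(tags) + ":-" + lemma + "-")
    if not elements:
        return "|"  # a single bar denotes the empty set
    # Sort and de-duplicate the alternatives (assumption: harmless even
    # if the encoder would normalise the value anyway)
    return "|" + "|".join(sorted(elements)) + "|"


print(feature_set([(["1", "SG", "PF"], "dat"), (["GEN", "PL"], "dama")]))
```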

Then you can write the queries above using the "contains" and "matches" operators, which ensure that your regexp only matches within one alternative:

	[annot contains ".*:GEN:.*"]

	[annot contains ".*-dama-"]

	[annot contains ".*:GEN:.*-dama-"]

	[annot matches ".*-dat-"]  # ensures that all analyses are forms of "dat" => no lemma ambiguity
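To make the difference from the plain regexp approach concrete, here is a rough Python emulation of the two operators. It is a sketch of the semantics only, not how CQP implements them: the function names are made up, Python's re module stands in for CQP's regexp engine, and the regexp is assumed to be anchored to a whole set element. "contains" succeeds if at least one alternative matches, "matches" only if every alternative does.

```python
import re


def elements(value):
    """Split a feature-set value such as |a|b| into its elements."""
    return [e for e in value.strip("|").split("|") if e]


def contains(value, pattern):
    """True if at least one element matches the whole pattern."""
    rx = re.compile(pattern)
    return any(rx.fullmatch(e) for e in elements(value))


def matches(value, pattern):
    """True if every element matches the whole pattern.

    Note: in this sketch an empty set trivially satisfies the test.
    """
    rx = re.compile(pattern)
    return all(rx.fullmatch(e) for e in elements(value))


annot = "|:1:SG:PF:-dat-|:GEN:PL:-dama-|"
flat = "1:SG:PF-dat::GEN:PL-dama-"  # the original single-string annotation

print(contains(annot, ".*:GEN:.*"))         # genitive reading present
print(contains(annot, ".*:GEN:.*-dama-"))   # genitive of dama
print(matches(annot, ".*-dat-"))            # False: one reading is dama

# SG belongs to the dat reading, so "singular of dama" must not match.
print(contains(annot, ".*:SG:.*-dama-"))                  # False
# A plain regexp over the flat string matches across the two analyses.
print(bool(re.fullmatch(".*:SG:.*-dama-.*", flat)))       # True
```

The last two lines show the point of the set format: the plain regexp finds SG in one analysis and dama in the other, while the per-element test does not.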

Best,
Stefan

