[CWB] Display options for structural attributes

Stefan Evert stefanML at collocations.de
Fri Sep 17 22:18:44 CEST 2010


Hi Lukas!

> As far as I know [1],  when you display S-attributes, they are displayed in
> the position in which they actually appear in the corpus [2].
> 
> I'd like to be able to say something like "show +story:num" and then get the value
> of the num attribute of the story tag for each hit.  

Are you looking for 

	set PrintStructures "story_num";

?

> This could be useful for
> computing tf-idf weights, for example.  E.g. the query
> 
>> "A"
> 
> would yield the result
> 
> 2: A/DT/a/1
> 11: A/DT/a/2

For automatic processing, "tabulate" is often more convenient than tweaking the output of "cat".  For instance, you can get the same information (apart from the third field of your example, which would need one more column) in TAB-delimited form with

	tabulate Last match, match .. matchend word, match .. matchend pos, match story_num;
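
Once the TAB-delimited output has been saved somewhere, the counting needed for tf-idf is a few lines of script. What follows is only a rough sketch in Python, not part of CWB: it assumes the four columns produced by the command above (corpus position, word, pos, story_num), the function names are made up for illustration, the sample rows are copied from your example, and the total number of stories is a value you would have to supply yourself.

```python
import math
from collections import Counter


def hits_per_story(tsv_text):
    """Count hits per story in TAB-delimited tabulate output.

    Assumed column order (hypothetical, matching the tabulate command
    above): corpus position, word(s), POS tag(s), story_num.
    """
    counts = Counter()
    for line in tsv_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        counts[fields[3]] += 1
    return counts


def idf(stories_with_hit, stories_total):
    """Plain logarithmic inverse document frequency."""
    return math.log(stories_total / stories_with_hit)


# Illustrative rows only, taken from the two hits quoted in this thread.
sample = "2\tA\tDT\t1\n11\tA\tDT\t2\n"

tf = hits_per_story(sample)
# stories_total=2 is an assumption made for this toy sample.
weights = {story: n * idf(len(tf), 2) for story, n in tf.items()}
```

With a term that occurs in every story, as in this toy sample, all weights come out as zero, which is the expected behaviour of tf-idf for such a term.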

> Otherwise, I'd have to encode the story number as a P-attribute for each
> token, which would store redundant information and require more annoying
> preprocessing ;).

Yes, I used to do that back when many CQP commands (such as "group") would only work on p-attributes ...

Cheers,
Stefan

