[CWB] count right or left occurrences

Stefan Evert stefanML at collocations.de
Thu Feb 25 12:39:49 CET 2010


> I need to show results with these numbers of occurrences
>
> Is there a way to get this type of result?

Yes, that's easy, though I'm not sure how well known this
functionality is. So perhaps this is also a useful hint for other
people on the list.

> word: go

Go = "go"%c;

>
> result of 1 word right:
> word \t nboccurrence
> and  2873
> back  1982
> that 1893
> on  923
> ..
>

group Go match[1] word;

> result of 2 words right:
> word \t nboccurrence
> to  3889
> back  1998
> ..

group Go match[2] word;


You can get 1 word left with

group Go match[-1] word;

If you want the results to look more like your tables above, switch
off pretty-printing first:

set PrettyPrint off;

> result of 1 word left or right:
> and  2873
> to 2399
> I 2300
> ...

This one is difficult to do directly in CQP (yet).  My recommendation
would be to save the 1-word-left and 1-word-right tables above to text
files and then sum them up with Perl, R or SQLite -- it's very easy in
any of these languages.
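
As an illustration of the summing step, here is a minimal sketch in Python (swapped in for the Perl, R or SQLite mentioned above). It assumes that each saved table has one entry per line in the form word, TAB, count, which is what the tables should look like with PrettyPrint off. The file names left1.txt and right1.txt and the helper names read_counts and sum_tables are hypothetical.

```python
import sys
from collections import Counter


def read_counts(lines):
    """Parse 'word<TAB>count' lines into a Counter (assumed table format)."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        # Split on the last TAB: everything before it is the word.
        word, _, freq = line.rpartition("\t")
        counts[word] += int(freq)
    return counts


def sum_tables(*tables):
    """Add up the counts of several tables, word by word."""
    total = Counter()
    for table in tables:
        total.update(table)
    return total


if __name__ == "__main__":
    # Hypothetical usage: python sum_tables.py left1.txt right1.txt
    tables = []
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as handle:
            tables.append(read_counts(handle))
    # Print the combined table, most frequent word first.
    for word, freq in sum_tables(*tables).most_common():
        print(f"{word}\t{freq}")
```

Run on the two saved files, this prints one combined "1 word left or right" table sorted by descending frequency. A word that occurs in only one of the input tables simply keeps its count from that table.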

If you go through a few extra hoops and don't worry about efficiency,  
you can get these co-occurrence counts in CQP with a subquery trick  
that doesn't scale well -- I wonder if anyone on the list can figure  
out how to do it?

Should we start a monthly "CQP golf" contest? :-)

Cheers,
Stefan


