[CWB] Any way to filter out some preceding tokens in a CQP query?

Stefan Evert stefanML at collocations.de
Wed Oct 17 00:34:13 CEST 2012


> CQP's zero-width assertions are quite handy when we want to filter out matches based on the tokens that follow a CQP query. A typical example, for finding clausal verb complements, is given in Stefan's "Inside the IMS Corpus Workbench" presentation:
> [pos="VB.*"] "that" [:pos != "JJ.*|N.*":]
> 
> My question is: is there a way in CQP to limit the content BEFORE a CQP query? I guess that mechanism might be equally handy in other situations.

No, unfortunately not.  These "zero-width assertions" were implemented for an entirely different purpose (being able to make assertions similar to the global constraint at arbitrary places within a CQP query), and the ability to check one additional token after the end of the match is just a useful side-effect.

If you're working directly with CQP or with an interface that's under your control, you can easily filter out unwanted matches after running the query, like so:

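	# run the query and store the matches as named query result A
	A = [pos="VB.*"] "that" [:pos != "JJ.*|N.*":];
	# set the keyword anchor on the token directly before the match if it is a noun
	set A keyword nearest [pos="N.*"] within left 1 word;
	# discard every match for which a keyword anchor was set
	delete A with keyword;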

etc.  This mechanism is much more flexible than zero-width assertions, of course, but it cannot be used through a Web interface (such as BNCweb or CQPweb) that only allows users to enter plain CQP queries.
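
For readers who prefer to see the logic spelled out, here is a rough sketch in Python of what the three commands above amount to. It does not talk to CQP at all: the toy corpus, its tags, and the helper names find_matches and drop_if_preceded_by are made up for illustration, and the handling of matches at the very start or end of the corpus is an assumption of this sketch rather than documented CQP behaviour.

```python
import re

# Toy corpus of (word, pos) pairs; words and tags are invented for illustration.
CORPUS = [
    ("I", "PP"), ("think", "VBP"), ("that", "IN"), ("he", "PP"), ("left", "VBD"),
    ("people", "NNS"), ("say", "VBP"), ("that", "IN"), ("it", "PP"), ("works", "VBZ"),
]

def find_matches(corpus):
    """Return (start, end) spans for: a verb, then "that", then a token
    whose tag is neither JJ.* nor N.* (the lookahead token is not part
    of the span)."""
    spans = []
    for i in range(len(corpus) - 2):
        verb, that, nxt = corpus[i], corpus[i + 1], corpus[i + 2]
        if (re.fullmatch(r"VB.*", verb[1])
                and that[0] == "that"
                and not re.fullmatch(r"JJ.*|N.*", nxt[1])):
            spans.append((i, i + 1))
    return spans

def drop_if_preceded_by(corpus, spans, pos_pattern):
    """Keep only spans whose immediately preceding token does NOT have a
    tag matching pos_pattern (mirrors 'set keyword' + 'delete with keyword')."""
    kept = []
    for start, end in spans:
        has_keyword = start > 0 and re.fullmatch(pos_pattern, corpus[start - 1][1])
        if not has_keyword:
            kept.append((start, end))
    return kept

matches = find_matches(CORPUS)                          # both "VERB that" spans
filtered = drop_if_preceded_by(CORPUS, matches, r"N.*")  # noun-preceded span removed
```

In this toy data, "think that" survives because it is preceded by a pronoun, while "say that" is dropped because the token before it is tagged NNS.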

Hope this helps
Stefan
