[CWB] Alternative and patterns order in queries

Stefan Evert stefanML at collocations.de
Wed Apr 8 20:46:49 CEST 2015


> Thanks again for all this information and advice.
> About point 1 just below and the other query:
> 
> </q>[!q]+</text> | <text>[!q]+<q> | </q>[!q]+<q>;
> 
> What do you think about this query? Could it produce "wrong" results in some cases?

This should work, but to be sure of it you really need to be aware of the subtleties of how CQP matches XML tags.  I'd strongly recommend setting StrictRegions to off for this query (then you don't have to worry about the ordering of the alternatives any more).

> Besides that, could it avoid crossing the </text> boundary? (I guess so.)

That's why I suggested adding "within text" to the query.  Without it, the third alternative may still cross <text> boundaries.
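
A minimal sketch of both suggestions together, run through the CWB::CQP Perl interface.  The corpus name MYCORPUS and the query name Gaps are placeholders, and the parentheses around the alternatives are only added as a precaution so that "within text" clearly applies to the whole query:

```perl
use strict;
use warnings;
use CWB::CQP;

my $cqp = CWB::CQP->new;
$cqp->set_error_handler('die');        # abort the script on any CQP error

$cqp->exec('MYCORPUS');                # hypothetical corpus name
$cqp->exec('set StrictRegions off');   # ordering of alternatives no longer matters

# The query from this thread, restricted so that no match spans two texts.
$cqp->exec('Gaps = ( </q>[!q]+</text> | <text>[!q]+<q> | </q>[!q]+<q> ) within text');

# "size" reports the number of matches, not the number of tokens.
my ($matches) = $cqp->exec('size Gaps');
print "Gaps: $matches matches\n";
```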

> I have a related question that I couldn't find covered in the CQPTutorial.pdf (November 2009) file on SourceForge. Is there any way to carry out and display arithmetic operations in CQP? The purpose here would be to compute:
> size rootCorporaAllTokens - (size subcorpora1Tokens + size subcorpora2Tokens), which should be equal to 0 (given the structure and queries in this discussion).

No.  CQP isn't a programming language, just a corpus query tool.

I usually run CQP from Perl (using the CWB::CQP package) for such purposes.  It's very easy to obtain the counts and carry out any necessary calculations in Perl.
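
As a rough sketch of what this can look like: the script below counts the tokens covered by two named queries and subtracts them from the corpus size.  The corpus name MYCORPUS and the query names All, Sub1 and Sub2 are placeholders, and the two subcorpus queries are only stand-ins for the real ones.  It assumes that dump() returns one array reference per match with the match and matchend positions first, and that matches within a query do not overlap (otherwise tokens are counted twice).

```perl
use strict;
use warnings;
use CWB::CQP;

# Tokens covered by a named query result:
# sum of (matchend - match + 1) over all matches.
sub token_count {
    my ($cqp, $name) = @_;
    my $total = 0;
    for my $row ($cqp->dump($name)) {
        my ($match, $matchend) = @$row;
        $total += $matchend - $match + 1;
    }
    return $total;
}

my $cqp = CWB::CQP->new;
$cqp->set_error_handler('die');    # abort the script on any CQP error
$cqp->exec('MYCORPUS');            # hypothetical corpus name

# Every match of [] is a single token, so "size" gives the token count.
$cqp->exec('All = []');
my ($all) = $cqp->exec('size All');

# Placeholder queries: substitute the two subcorpus queries here.
$cqp->exec('Sub1 = [q]');          # tokens inside a <q> region
$cqp->exec('Sub2 = [!q]');         # tokens outside any <q> region

my $diff = $all - (token_count($cqp, 'Sub1') + token_count($cqp, 'Sub2'));
print "All - (Sub1 + Sub2) = $diff\n";
```

Note that dump() loads all matches into memory, so for very large results it may be better to fetch them in slices.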

Best,
Stefan


