[CWB] s-attributes

Pavel Vondřička Pavel.Vondricka at ff.cuni.cz
Thu Apr 11 18:49:15 CEST 2013


The 'within' clause is demonstrated here (though not in much detail):

http://cwb.sourceforge.net/files/CQP_Tutorial/node26.html
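
As a rough sketch only: the token pattern below is a made-up placeholder
(it assumes your corpus has a "pos" attribute, which I don't know), while
the attribute p_monthstudy and the numeric filter are taken from Stefan's
message quoted below. With a 'within' clause appended, his second version
might look something like this:

        [pos="JJ"] [pos="NN"] :: int(match.p_monthstudy) >= 1 & int(match.p_monthstudy) <= 7 within p_monthstudy;

As far as I know, the clause goes at the end of the query, after the
global constraint. It should discard any match that crosses a region
boundary, so that the p_monthstudy value read at the match position
applies to the whole match.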

Best,
Pavel

> Dear Andrew and Stefan,
> thank you very much for your helpful answers.
> Stefan, both of your solutions work well. What exactly do you mean by
> the "within" clause to add to the query?
> Thank you very much,
> Leontyna
> 
> 
> 2013/4/11 Stefan Evert <stefanML at collocations.de>
> 
> > On 11 Apr 2013, at 16:18, "Hardie, Andrew" <a.hardie at lancaster.ac.uk> wrote:
> > > Subcorpus = <p_monthstudy="[1-7]">[]*</p_monthstudy>;
> > > Subcorpus;
> > 
> > For technical reasons, it's better to use this form:
> >         Subcorpus = <p_monthstudy="[1-7]">[] expand to p_monthstudy;
> >         Subcorpus;
> > 
> > otherwise you'll lose all longer paragraphs (containing more than 100
> > tokens); on a large corpus, this form will also be substantially faster.
> > 
> > If you don't mind a loss of efficiency, you can run the query on the full
> > corpus and post-filter your results with a global constraint.  Note that
> > if you're not confident about working out the correct regular expressions
> > to match single- and double-digit months correctly, you can use numeric
> > comparisons in this second version.  Perform this without activating a
> > subcorpus:
> >         ... your query ... :: int(match.p_monthstudy) >= 1 & int(match.p_monthstudy) <= 7;
> > 
> > You should perhaps add a "within" clause to the query to make sure that
> > the entire match is within a single paragraph, otherwise it's not very
> > sensible to filter on the p_monthstudy attribute.
> > 
> > Hope this hilft,
> > Stefan
> > 
> > 
> > 

