[CWB] s-attributes

leontyna bratankova leontyna.b at gmail.com
Thu Apr 11 18:39:47 CEST 2013


Dear Andrew and Stefan,
thank you very much for your helpful answers.
Stefan, both of your solutions work well. What exactly do you mean by
the "within" clause to add to the query?
Thank you very much,
Leontyna


2013/4/11 Stefan Evert <stefanML at collocations.de>

> On 11 Apr 2013, at 16:18, "Hardie, Andrew" <a.hardie at lancaster.ac.uk>
> wrote:
>
> > Subcorpus = <p_monthstudy="[1-7]">[]*</p_monthstudy>;
> > Subcorpus;
>
> For technical reasons, it's better to use this form:
>
>         Subcorpus = <p_monthstudy="[1-7]">[] expand to p_monthstudy;
>         Subcorpus;
>
> otherwise you'll lose all longer paragraphs (containing more than 100
> tokens); on a large corpus, this form will also be substantially faster.
>
> If you don't mind a loss of efficiency, you can run the query on the full
> corpus and post-filter your results with a global constraint.  Note that if
> you're not confident about working out the correct regular expressions to
> match single- and double-digit months correctly, you can use numeric
> comparisons in this second version.  Perform this without activating a
> subcorpus:
>
>         ... your query ... :: int(match.p_monthstudy) >= 1 & int(match.p_monthstudy) <= 7;
>
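To see how the regular-expression filter and the numeric filter relate, here is a small sketch in Python (not CQP). It assumes that the p_monthstudy values are plain decimal strings, and it relies on CQP matching a regular expression against the whole attribute value, which re.fullmatch approximates. The helper names and the sample values are made up for illustration.

```python
import re

def regex_filter(value, pattern="[1-7]"):
    """Anchored regex test: the pattern must match the entire value."""
    return re.fullmatch(pattern, value) is not None

def numeric_filter(value, low=1, high=7):
    """Numeric range test, analogous to int(...) >= 1 & int(...) <= 7."""
    return low <= int(value) <= high

# Hypothetical month values, not taken from any real corpus.
sample_values = ["1", "7", "8", "10", "17"]

regex_hits = [v for v in sample_values if regex_filter(v)]
numeric_hits = [v for v in sample_values if numeric_filter(v)]

# Widening the range to months 1-12 needs a more careful regex,
# whereas the numeric test only needs a different upper bound.
wide_regex_hits = [v for v in sample_values if regex_filter(v, "[1-9]|1[0-2]")]
wide_numeric_hits = [v for v in sample_values if numeric_filter(v, 1, 12)]
```

For months 1 to 7 the two tests agree. Once double-digit months are involved, the numeric comparison is harder to get wrong than the regular expression.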
> You should perhaps add a "within" clause to the query to make sure that
> the entire match is within a single paragraph, otherwise it's not very
> sensible to filter on the p_monthstudy attribute.
>
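One way to picture what such a "within" restriction checks, as a rough conceptual sketch in Python rather than a description of how CQP implements it: a match is kept only if its first and last token fall inside the same paragraph region. The function name, the region list and the token positions below are all hypothetical.

```python
from bisect import bisect_right

def within_single_region(match_start, match_end, regions):
    """Return True if tokens match_start..match_end (inclusive) lie in one region.

    regions must be a sorted list of non-overlapping (start, end) token
    positions, both inclusive.
    """
    starts = [start for start, _ in regions]
    # Index of the last region that begins at or before the match start.
    i = bisect_right(starts, match_start) - 1
    if i < 0:
        return False
    region_start, region_end = regions[i]
    return region_start <= match_start and match_end <= region_end

# Hypothetical paragraph boundaries as token positions.
paragraphs = [(0, 9), (10, 24), (25, 40)]

inside = within_single_region(12, 15, paragraphs)    # stays in the second paragraph
crossing = within_single_region(8, 11, paragraphs)   # spans a paragraph boundary
outside = within_single_region(41, 42, paragraphs)   # not in any paragraph
```

A match that crosses a boundary has no single p_monthstudy value, which is why filtering on that attribute only makes sense for matches contained in one paragraph.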
> Hope this hilft,
> Stefan
>
>
>
> _______________________________________________
> CWB mailing list
> CWB at sslmit.unibo.it
> http://devel.sslmit.unibo.it/mailman/listinfo/cwb
>



-- 
Leontyna Bratankova