[CWB] Multi-word units
Stefan Evert
stefanML at collocations.de
Fri Feb 15 18:02:37 CET 2013
On 14 Feb 2013, at 23:12, "Hardie, Andrew" <a.hardie at lancaster.ac.uk> wrote:
> I was thinking of this kind of arrangement:
>
> apressurada apressuradamientre
> mientre {some kind of ditto mark or just __NULL__}
>
> .... so that subsequent tokens on the two attributes stay in sync.
That's neat, but it doesn't work in (naive) queries, especially if users are not aware of which words are multi-word tokens. They'd have to write something like:
[pos = "adverb"] [word = "__NULL__"]? [ ... ] [word = "__NULL__"]? ...
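To illustrate how tedious this gets, here is a rough sketch of a helper that pads a query automatically. It is not part of CWB or CQP; the function name pad_query and the NULL_FILLER constant are made up for this example, and it assumes a multi-word unit is followed by at most one __NULL__ token, as in the arrangement quoted above.

```python
# Hypothetical helper, not part of CWB/CQP: it only builds a query string.
# Assumption: padding tokens carry the word form "__NULL__" and at most one
# of them follows any real token.
NULL_FILLER = '[word = "__NULL__"]?'


def pad_query(token_patterns, filler=NULL_FILLER):
    """Return a CQP query with an optional filler after every token pattern."""
    parts = []
    for pattern in token_patterns:
        parts.append(pattern)
        # Any token might be the first half of a multi-word unit,
        # so each one has to be followed by an optional filler.
        parts.append(filler)
    return " ".join(parts)


if __name__ == "__main__":
    print(pad_query(['[pos = "adverb"]', '[pos = "verb"]']))
```

Every pattern in the query doubles in size, which is exactly what makes the hand-written version error-prone for users.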
Would an option to automatically ignore certain tokens (e.g. __NULL__ tokens, or all punctuation marks) in CQP queries be something useful for the wishlist and worth giving a try at the hackathon?
Cheers,
Stefan