[CWB] UNREADABLE

Stefan Evert stefanML at collocations.de
Fri Mar 9 10:29:58 CET 2018


> On 9 Mar 2018, at 09:37, Hardie, Andrew <a.hardie at lancaster.ac.uk> wrote:
> 
> If you’ve got multiword tokens with spaces, then the first element and second element will be treated as separate tokens because the CQP concordance line uses space as its token delimiter. But this means the first element will have no tag… which is why a word-and-tag combination is not read.

You also won't be able to find such multiword tokens with simple (CEQL) queries:

	кеше генә

searches for a sequence of two separate tokens "кеше" and "генә".

I would recommend writing multiword tokens with an underscore, i.e. "кеше_генә" etc. You'll just have to document this so users know to specify the underscore in searches.
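
If the corpus is already in verticalized form (one token per line, tab-separated annotation columns), the conversion could be scripted along the following lines. This is only a rough Python sketch, not part of CWB: the function name join_multiword is made up for illustration, and it assumes that the word form is in the first column and that any line starting with "<" is an XML structural tag.

```python
def join_multiword(line: str, sep: str = "_") -> str:
    """Join a multiword word form in one token line (no trailing newline).

    Assumptions (not taken from CWB itself): columns are tab-separated,
    the word form is the first column, and lines starting with "<" are
    XML structural tags that must be passed through unchanged.
    """
    if not line.strip() or line.lstrip().startswith("<"):
        return line
    columns = line.split("\t")
    # Split the word form on whitespace and re-join its parts with `sep`,
    # so "кеше генә" becomes "кеше_генә"; other columns are left as they are.
    columns[0] = sep.join(columns[0].split())
    return "\t".join(columns)
```

Applied line by line to the input file (stripping the newline before the call and adding it back afterwards), this leaves single-word tokens and structural tags untouched and only rewrites word forms that contain spaces.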

Best,
Stefan




