[CWB] UNREADABLE

Hardie, Andrew a.hardie at lancaster.ac.uk
Fri Mar 9 09:37:54 CET 2018


“Unreadable” means that the regular expression that reads the words from the concordance line couldn’t match a word-and-tag combination for that word.

If you’ve got multiword units that contain spaces, then the first element and second element will be treated as separate tokens, because the CQP concordance line uses space as its token delimiter. But this means the first element will have no tag, which is why no word-and-tag combination can be read for it.
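
To illustrate the mechanism just described, here is a minimal sketch. It is not CQPweb's actual code: the "word/TAG" line format, the regular expression, the tag names and the function name parseConcordanceLine are all assumptions made up for the example.

```javascript
// Hypothetical illustration only; CQPweb's real parsing code differs.
// Assumption: each token in the concordance line is printed as "word/TAG".
const WORD_TAG = /^(.+)\/([^\/]+)$/;

function parseConcordanceLine(line) {
  // A space is the token delimiter, so a multiword unit such as
  // "ярдәм итәргә" is cut into two chunks before any matching happens.
  return line.trim().split(/\s+/).map((chunk) => {
    const match = WORD_TAG.exec(chunk);
    // A chunk without a tag cannot match, so it becomes the placeholder.
    return match
      ? { word: match[1], tag: match[2] }
      : { word: "[UNREADABLE]", tag: null };
  });
}

// The tags here (ADV, V, ADJ) are invented for the example.
const tokens = parseConcordanceLine("янәшәдә/ADV ярдәм итәргә/V тиеш/ADJ");
console.log(tokens.map((t) => t.word).join(" "));
```

Under these assumptions the first half of the multiword unit ("ярдәм") carries no tag and is replaced by the placeholder, while the second half ("итәргә/V") is read normally.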

The easiest way round this is to use “extra JS” files, and then use JavaScript code to delete [UNREADABLE] in your corpus visualisation for concordance, context, and download.
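
A rough sketch of what such an extra JS file might contain, assuming the placeholder appears as plain text in the rendered page. The function names stripUnreadable and cleanTextNodes are made up for this example, and I have not checked it against the actual CQPweb page markup, so treat it as a starting point only.

```javascript
// Hypothetical helper: remove the placeholder and one following space.
function stripUnreadable(text) {
  return text.replace(/\[UNREADABLE\]\s?/gi, "");
}

// Walk every text node under `root` and clean those containing the placeholder.
// Assumption: the placeholder sits in ordinary text nodes of the page.
function cleanTextNodes(root) {
  const walker = document.createTreeWalker(root, NodeFilter.SHOW_TEXT);
  let node;
  while ((node = walker.nextNode())) {
    if (/\[UNREADABLE\]/i.test(node.nodeValue)) {
      node.nodeValue = stripUnreadable(node.nodeValue);
    }
  }
}

// Only run in a browser; the guard keeps the file loadable elsewhere.
if (typeof document !== "undefined") {
  document.addEventListener("DOMContentLoaded", () => {
    cleanTextNodes(document.body);
  });
}
```

This only changes what the browser displays; the underlying query result is untouched.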

Or just hack the code: change “[UNREADABLE]” to an empty string (it’s in concordance-lib.inc.php).

best

Andrew.

From: cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it] On Behalf Of mansur
Sent: 09 March 2018 08:19
To: Open source development of the Corpus WorkBench <cwb at sslmit.unibo.it>
Subject: [CWB] UNREADABLE

Hello!
It seems my previous letter wasn't delivered. Sorry for sending it again if it was delivered.

I have some complex words in my corpus like: ярдәм ит, кеше генә.
But in concordance I have:
Ул үзенең шәкертенә , янәшәдә [UNREADABLE] торуы [UNREADABLE] генә [UNREADABLE] да [UNREADABLE] итәргә <http://cwb.corpus.tatar/CQPweb/kfu/context.php?batch=6&qname=fp6mo7xhxh&uT=y> тиеш . – Михаил Владимирович


The first word is marked as UNREADABLE and the second word is displayed correctly. How can I fix it?
Thank you!
Mansur