[CWB] Problem with lemma

Hardie, Andrew a.hardie at lancaster.ac.uk
Thu Feb 25 12:00:03 CET 2016


Hi Teresa,

When you say "it doesn't work", what do you mean? Does it produce an error, or does it run but find zero results, or something else?

One way for you to find out what is going on:

- Do a search for the word
- Go to "download"
- Switch to tabulation mode
- Download the match word and the match lemma as the two columns
- See what appears in the "lemma" column -- is it what you expect it to be?

ALTERNATIVELY: one other thing that you could check is your original input files. When there are N p-attributes and it is always the Nth that returns zero results, one common cause is that your original files had Windows-style linebreaks (\r\n). In that case every value in the rightmost column has a trailing "\r" attached to it (because CWB doesn't remove the "\r" when it is running on Unix), which will cause regular expressions never to match.
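
A quick way to look for this in the input files is a small script along the following lines. This is only a sketch: it assumes the usual tab-separated, one-token-per-line input with XML tags on their own lines, and the function name and sample data are made up for illustration.

```python
def find_trailing_cr(data: bytes, max_hits: int = 5):
    """Return (line_number, last_column) pairs for lines whose last
    column ends in a carriage return. Stops after max_hits hits."""
    hits = []
    for lineno, line in enumerate(data.split(b"\n"), start=1):
        # Skip empty lines and XML tag lines (s-attributes)
        if not line or line.startswith(b"<"):
            continue
        last = line.split(b"\t")[-1]
        if last.endswith(b"\r"):
            hits.append((lineno, last))
            if len(hits) >= max_hits:
                break
    return hits


# Hypothetical three-column input (word, pos, lemma) with Windows linebreaks
sample = b"<s>\r\ngeht\tVVFIN\tgehen\r\nHaus\tNN\tHaus\r\n</s>\r\n"
print(find_trailing_cr(sample))
```

To run it on a real file, read the file in binary mode (open(path, "rb").read()) so that Python does not silently translate the linebreaks before the check.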

Other than looking at the original input files, one way to check this would be to search for

[lemma = "gehen."]

(where the dot can match the trailing "\r", if there is one.)

If this produces the "correct" results, then it is highly likely that the "\r" problem is the cause.
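
The reason this test works can be illustrated outside CQP. The sketch below uses Python's re module as a stand-in (it is not CQP's own regex engine, and the helper name is invented for the example); the relevant point is that the pattern has to match the whole attribute value.

```python
import re


def matches_whole_value(pattern: str, value: str) -> bool:
    """True if the pattern matches the entire value, not just a part of it."""
    return re.fullmatch(pattern, value) is not None


clean = "gehen"    # lemma as it should have been indexed
dirty = "gehen\r"  # lemma with the stray carriage return attached

print(matches_whole_value("gehen", clean))   # plain pattern, clean value
print(matches_whole_value("gehen", dirty))   # plain pattern, dirty value
print(matches_whole_value("gehen.", dirty))  # dot absorbs the "\r"
```

The plain pattern matches only the clean value, while the pattern with the extra dot matches only the value that carries the trailing "\r".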

Unfortunately, if this is the cause there is no fix other than scrubbing the corpus, fixing the input files, and starting from scratch.
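
For the "fixing the input files" step, something like the following would do. It is a minimal sketch, assuming the files are small enough to read into memory; the function names are hypothetical, and the corpus still has to be re-encoded and re-indexed afterwards.

```python
from pathlib import Path


def crlf_to_lf(data: bytes) -> bytes:
    """Convert Windows-style linebreaks (\\r\\n) to Unix-style (\\n)."""
    return data.replace(b"\r\n", b"\n")


def fix_file(src: str, dst: str) -> None:
    """Write a copy of src to dst with the linebreaks converted."""
    Path(dst).write_bytes(crlf_to_lf(Path(src).read_bytes()))


print(crlf_to_lf(b"geht\tVVFIN\tgehen\r\nHaus\tNN\tHaus\r\n"))
```

Working on bytes rather than decoded text means the conversion does not depend on the character encoding of the files.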

best

Andrew.


-----Original Message-----
From: cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it] On Behalf Of Teresa Molés Cases
Sent: 25 February 2016 09:50
To: CWB at sslmit.unibo.it
Subject: [CWB] Problem with lemma

Hi everyone,

We have a corpus indexed with CQPweb and it works perfectly except in the case of lemma. That is to say (here is an example for German):

[lemma = "gehen"] (it doesn’t work)
[word = "gehen"] (it works)
[pos = "NN"] (it works)

Although the whole process of compiling the corpus was carefully designed, searches do not work in the case of lemma. Any ideas why? I am afraid this has to do with the process of indexing, but if you have any suggestions, I would be very thankful.

Best,

Teresa

_______________________________________________
CWB mailing list
CWB at sslmit.unibo.it
http://devel.sslmit.unibo.it/mailman/listinfo/cwb

