[CWB] Problem with lemma
Teresa Molés Cases
teresamoles at gmail.com
Mon Feb 29 11:56:52 CET 2016
Dear Andrew,
Thanks a lot for your answer and your help.
When I write [lemma = "gehen."], it works. So the problem is the "\r".
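
For anyone who runs into the same symptom, here is a minimal Python sketch of why the exact query fails and how the input files might be cleaned before re-encoding. It is only an illustration under assumptions: the sample line, the column order (word, pos, lemma) and the names normalise_line_endings and convert_file are hypothetical, and re.fullmatch merely imitates the way CQP matches a pattern against the whole attribute value.

```python
import re

# Hypothetical line from a verticalised input file (word, pos, lemma),
# as it looks when a Windows-style file is split on "\n" only.
raw_line = "geht\tVVFIN\tgehen\r"

word, pos, lemma = raw_line.split("\t")

# The pattern has to cover the whole value; re.fullmatch imitates that here.
print(re.fullmatch("gehen", lemma) is not None)   # False: the value is "gehen\r"
print(re.fullmatch("gehen.", lemma) is not None)  # True: the dot absorbs the "\r"


def normalise_line_endings(data: bytes) -> bytes:
    # Turn Windows linebreaks (CR LF) into Unix linebreaks (LF).
    return data.replace(b"\r\n", b"\n")


def convert_file(src_path: str, dst_path: str) -> None:
    # Read the original file as bytes, so the encoding is left untouched,
    # and write a copy with Unix linebreaks.
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(normalise_line_endings(data))
```

After converting the input files in this way, the corpus still has to be indexed again from scratch, as Andrew explains below.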
Best,
Teresa
> On 25 Feb 2016, at 12:00, Hardie, Andrew <a.hardie at lancaster.ac.uk> wrote:
>
> Hi Teresa,
>
> When you say "it doesn't work", what do you mean? Do you mean it produces an error, or it runs but finds zero results, or....
>
> One way for you to find out what is going on:
>
> - Do a search for the word
> - Go to "download"
> - Switch to tabulation mode
> - download the match word and the match lemma as the two columns
> - see what appears in the "lemma" column -- is it what you expect it to be?
>
> ALTERNATIVELY: one other thing that you could check is your original input files. One common cause, when there are N p-attributes and it is the N'th that always has zero-result searches, is that your original files had Windows-style linebreaks (\r\n). In that case every value in the rightmost column has a trailing "\r" attached to it (because CWB doesn't remove "\r" when it is running on Unix), which will cause regular expressions never to match.
>
> Other than looking at the original input files, one way to check this would be to search for
>
> [lemma = "gehen."]
>
> (where the dot can match the trailing "\r", if there is one.)
>
> If this produces the "correct" results, then it is highly likely that the "\r" problem is the cause.
>
> Unfortunately, if this is the cause there is no fix other than scrubbing the corpus, fixing the input files, and starting from scratch.
>
> best
>
> Andrew.
>
>
> -----Original Message-----
> From: cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it] On Behalf Of Teresa Molés Cases
> Sent: 25 February 2016 09:50
> To: CWB at sslmit.unibo.it
> Subject: [CWB] Problem with lemma
>
> Hi everyone,
>
> We have a corpus indexed with CQPweb and it works perfectly except for the lemma attribute. That is to say (here is an example for German):
>
> [lemma = "gehen"] (it doesn't work)
> [word = "gehen"] (it works)
> [pos = "NN"] (it works)
>
> Although the whole process of compiling the corpus was carefully designed, searches on lemma do not work. Any ideas why? I am afraid this has to do with the indexing process, but if you have any suggestions, I would be very grateful.
>
> Best,
>
> Teresa
>
> _______________________________________________
> CWB mailing list
> CWB at sslmit.unibo.it
> http://devel.sslmit.unibo.it/mailman/listinfo/cwb