[CWB] Problem with lemma

Teresa Molés Cases teresamoles at gmail.com
Mon Feb 29 11:56:52 CET 2016


Dear Andrew,

Thanks a lot for your answer and your help.

When I write [lemma = "gehen."], it works. So the problem is the "\r".
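
For anyone hitting the same issue, here is a minimal sketch of the cleanup step in Python. It is not part of CWB or CQPweb; the function name and the file paths passed to it are hypothetical, and it assumes the input files are in a byte-oriented encoding such as UTF-8 or Latin-1. The two checks at the end use Python's `re` module only as an analogy for why the pattern with the extra dot matches the stored value and the plain pattern does not.

```python
import re
from pathlib import Path


def strip_carriage_returns(src: Path, dst: Path) -> int:
    """Copy src to dst, converting Windows line endings (CRLF) to LF.

    Returns the number of lines that were changed.
    """
    changed = 0
    # Work on bytes so the character encoding of the file is left untouched.
    with src.open("rb") as fin, dst.open("wb") as fout:
        for line in fin:  # binary iteration splits on b"\n" only
            if line.endswith(b"\r\n"):
                line = line[:-2] + b"\n"
                changed += 1
            fout.write(line)
    return changed


# Analogy for the query behaviour: a stored lemma value of "gehen\r"
# is not matched in full by "gehen", but it is by "gehen.".
stored = "gehen\r"
print(re.fullmatch("gehen", stored) is None)       # True
print(re.fullmatch("gehen.", stored) is not None)  # True
```

A non-zero return value from `strip_carriage_returns` confirms that the file had Windows-style linebreaks. The cleaned files would then need to be encoded again from scratch, as Andrew describes below.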

Best,

Teresa


> On 25 Feb 2016, at 12:00, Hardie, Andrew <a.hardie at lancaster.ac.uk> wrote:
> 
> Hi Teresa,
> 
> When you say "it doesn't work", what do you mean? Do you mean it produces an error, or it runs but finds zero results, or....
> 
> One way for you to find out what is going on:
> 
> - Do a search for the word
> - Go to "download"
> - Switch to tabulation mode
> - download the match word and the match lemma as the two columns
> - see what appears in the "lemma" column -- is it what you expect it to be?
> 
> ALTERNATIVELY: one other thing that you could check is your original input files. One common cause, when there are N p-attributes and it is the Nth that always returns zero results, is that your original files had Windows-style linebreaks (\r\n). Every value in the rightmost column then has a trailing "\r" attached to it (because CWB doesn't remove "\r" when it is running on Unix), which will cause regular expressions never to match.
> 
> Other than looking at the original input files, one way to check this would be to search for
> 
> [lemma = "gehen."]
> 
> (where the dot can match the trailing "\r", if there is one.)
> 
> If this produces the "correct" results, then it is highly likely that the "\r" problem is the cause.
> 
> Unfortunately, if this is the cause there is no fix other than scrubbing the corpus, fixing the input files, and starting from scratch.
> 
> best
> 
> Andrew.
> 
> 
> -----Original Message-----
> From: cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it] On Behalf Of Teresa Molés Cases
> Sent: 25 February 2016 09:50
> To: CWB at sslmit.unibo.it
> Subject: [CWB] Problem with lemma
> 
> Hi everyone,
> 
> We have a corpus indexed with CQPweb and it works perfectly except for queries on lemma. That is to say (here is an example for German):
> 
> [lemma = "gehen"] (it doesn't work)
> [word = "gehen"] (it works)
> [pos = "NN"] (it works)
> 
> Although the whole process of compiling the corpus was carefully designed, lemma queries do not work. Any ideas why? I am afraid this has to do with the indexing process, but if you have any suggestions, I would be very grateful.
> 
> Best,
> 
> Teresa
> 
> _______________________________________________
> CWB mailing list
> CWB at sslmit.unibo.it
> http://devel.sslmit.unibo.it/mailman/listinfo/cwb
