[CWB] Problem with cwb-make
oyvind.eide at iln.uio.no
Thu May 10 10:50:33 CEST 2012
On 10.05.2012 10:30, Stefan Evert wrote:
>
> On 10 May 2012, at 10:14, oyvind.eide at iln.uio.no wrote:
>
>> I have now encountered another problem, which looks more like a bug. As the transcript below shows, encoding the corpus works fine. Then cwb-make fails, apparently on line 423 of Encoder.pm. The corpus still works, but at least one file seems to be missing.
>>
>> Any tips? Do you need more information? The server is running Linux (RHEL 5.8).
>
> Can you send us a sample of the input file? It is rather strange to have p-attributes for "file" and "id". Is it possible that the value of "file" is always the same? The compression algorithm might fail on such a boundary case (I'm not sure whether it does or not).
These are values we may or may not need for the web system later. I
removed the "file" attribute (which did indeed have only one value in
this limited test) and the problem was solved.
So it seems that the algorithm fails when an attribute has the same
value for all tokens.
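
For anyone who wants to check their input for this case before encoding, here is a rough sketch in Python. It assumes the usual verticalized input (one token per line, tab-separated columns, XML-style tags on lines of their own); the function name constant_columns and the sample data are made up for illustration and are not part of CWB.

```python
import io
from collections import defaultdict


def constant_columns(lines, sep="\t"):
    """Return the 0-based indices of columns that hold a single distinct value.

    Assumption: one token per line, columns separated by `sep`, and
    structural tags (e.g. <s>, </text>) on lines of their own.
    """
    seen = defaultdict(set)
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and anything that looks like a structural tag.
        # Note: a token line that starts with "<" is skipped as well.
        if not line or line.startswith("<"):
            continue
        for idx, value in enumerate(line.split(sep)):
            seen[idx].add(value)
    return sorted(idx for idx, values in seen.items() if len(values) == 1)


# Hypothetical three-token sample: word, part of speech, file name.
sample = io.StringIO(
    "<text>\n"
    "The\tDET\ttest.xml\n"
    "corpus\tNOUN\ttest.xml\n"
    "works\tVERB\ttest.xml\n"
    "</text>\n"
)
print(constant_columns(sample))  # the third column (index 2) never varies
```

Any column index reported here corresponds to a p-attribute with only one value, which is the situation that triggered the failure in my test.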
>
> When you show a file listing, please always use "ls -l". File sizes and access permissions might give us valuable hints for identifying potential problems.
I will remember that for the next round.
>
> Best,
> Stefan
>
Thank you for a quick solution!
Regards,
Øyvind