[CWB] Huffman code error
Hardie, Andrew
a.hardie at lancaster.ac.uk
Wed Oct 10 13:55:08 CEST 2012
I have the feeling this bug has come up before. Can you check which version you are running? (cqp -v)
thanks
Andrew.
From: cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it] On Behalf Of BOFÍAS ALBERCH, EVA
Sent: 10 October 2012 12:19
To: cwb at sslmit.unibo.it
Subject: [CWB] Huffman code error
Hi,
I have an error that I am not able to solve. I'm trying to build a Latin corpus, but I get this error:
Error: Huffman codes too long (32 bits, current maximum is 31 bits).
Please contact the CWB development team for assistance.
I get this error when trying to build a 40-word corpus (I cut the text down to see if I could locate the problem; with 39 words I do not get the error):
-----------
<doc type="CHRISTIAN_LATIN" title="Abelard">
<s>
PETRUS Petrus N:nom
ABAELARDUS UNKNOUN ADJ
( ( PUN
1079-1142 card ADJ:NUM
) ) PUN
ABAELARDI UNKNOUN N:voc
AD UNKNOUN N:abl
AMICUM amicus ADJ
SUUM sus N:gen
CONSOLATORIA consolatorius ADJ
Sepe sepes N:dat
humanos humanus ADJ
affectus affectus N:nom
aut aut CC
provocant provoco V:IND
aut aut CC
mittigant mi V:IND
amplius ample ADV
exempla exemplum N:nom
quam qui REL
verba verbum N:nom
. . SENT
</s>
<s>
Unde unde ADV
post post PREP
nonnullam nonnullus ADJ
sermonis sermo N:gen
ad ad PREP
habiti habeo V:PTC
consolationem consolatio N:acc
, , PUN
de de PREP
ipsis ipse DET
calamitatum calamitas N:gen
mearum meus POSS
experimentis experimentum N:abl
</s>
</doc>
-----------------
These are the attributes I use to describe the corpus:
cat $SOURCEFILE | /usr/local/cwb-3.4.1/bin/cwb-encode -c utf8 -d $DATADIR -R $REGDIR/$CORPUSNAME -xsB -P lema -P pos -V s -S doc:0+type+title -S not:0+text
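As a sanity check on the error message, here is a minimal sketch that computes Huffman code lengths for one column of verticalized input like the sample above. It is not CWB's own implementation; it uses a textbook Huffman construction, and the function names (huffman_code_lengths, column_values, max_code_length) are hypothetical. It also assumes one token per line, whitespace-separated columns, and XML tags on lines of their own.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a non-empty frequency table."""
    if len(freqs) == 1:
        # A single symbol still needs one bit in a prefix code.
        return {sym: 1 for sym in freqs}
    tie = count()  # tie-breaker so the heap never has to compare dicts
    heap = [(f, next(tie), {sym: 0}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees; every symbol in them
        # moves one level deeper, i.e. gains one bit.
        f1, _, d1 = heapq.heappop(heap)
        f2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in d1.items()}
        merged.update({s: depth + 1 for s, depth in d2.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]


def column_values(lines, column):
    """Yield one column per token line, skipping blank lines and XML tags."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("<"):
            continue
        fields = line.split()
        if column < len(fields):
            yield fields[column]


def max_code_length(lines, column):
    """Longest Huffman code (in bits) needed for the given column."""
    freqs = Counter(column_values(lines, column))
    return max(huffman_code_lengths(freqs).values()) if freqs else 0


if __name__ == "__main__":
    import sys

    # Assumed usage: python huffcheck.py < corpus.vrt
    text = sys.stdin.read().splitlines()
    for col, name in enumerate(["word", "lema", "pos"]):
        print(name, max_code_length(text, col))
```

If the lengths printed for a small sample like this one are only a few bits, while the build still reports 32 bits, that would suggest the problem lies somewhere other than the frequency distribution of the data itself.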
Thanks
Eva Bofias