[CWB] Huffman code error
BOFÍAS ALBERCH, EVA
eva.bofias at upf.edu
Wed Oct 10 13:18:55 CEST 2012
Hi,
I have an error that I am not able to solve. I'm trying to build a Latin
corpus, but I get this error:
Error: Huffman codes too long (32 bits, current maximum is 31 bits).
Please contact the CWB development team for assistance.
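For context: in a textbook Huffman code, the length of a symbol's code is its depth in the merge tree, so the longest code over n distinct symbols is at most n - 1 bits. The sketch below computes those lengths from a frequency table. It is the standard algorithm and may differ from what CWB does internally; the function name `huffman_code_lengths` and the counts in the usage lines are made up for illustration.

```python
import heapq
from itertools import count


def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a textbook Huffman code."""
    if not freqs:
        return {}
    if len(freqs) == 1:
        # Convention chosen here: a lone symbol still gets a 1-bit code.
        return {next(iter(freqs)): 1}
    tiebreak = count()  # keeps heap entries comparable when weights tie
    heap = [(f, next(tiebreak), [sym]) for sym, f in freqs.items()]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freqs, 0)
    while len(heap) > 1:
        f1, _, syms1 = heapq.heappop(heap)
        f2, _, syms2 = heapq.heappop(heap)
        # Every symbol under the merged node moves one level deeper.
        for sym in syms1 + syms2:
            lengths[sym] += 1
        heapq.heappush(heap, (f1 + f2, next(tiebreak), syms1 + syms2))
    return lengths


# Hypothetical tag counts, only to show the call.
lengths = huffman_code_lengths({"N": 4, "ADJ": 2, "V": 1, "CC": 1})
print(max(lengths.values()))
```

Comparing the longest computed length with the 31-bit limit in the message gives a rough idea of whether the frequency distribution alone could explain the error.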
I get this error when trying to build a 40-word corpus (I cut it down to
see whether I could isolate the problem; with 39 words I do not get the error):
-----------
<doc type="CHRISTIAN_LATIN" title="Abelard">
<s>
PETRUS Petrus N:nom
ABAELARDUS UNKNOUN ADJ
( ( PUN
1079-1142 card ADJ:NUM
) ) PUN
ABAELARDI UNKNOUN N:voc
AD UNKNOUN N:abl
AMICUM amicus ADJ
SUUM sus N:gen
CONSOLATORIA consolatorius ADJ
Sepe sepes N:dat
humanos humanus ADJ
affectus affectus N:nom
aut aut CC
provocant provoco V:IND
aut aut CC
mittigant mi V:IND
amplius ample ADV
exempla exemplum N:nom
quam qui REL
verba verbum N:nom
. . SENT
</s>
<s>
Unde unde ADV
post post PREP
nonnullam nonnullus ADJ
sermonis sermo N:gen
ad ad PREP
habiti habeo V:PTC
consolationem consolatio N:acc
, , PUN
de de PREP
ipsis ipse DET
calamitatum calamitas N:gen
mearum meus POSS
experimentis experimentum N:abl
</s>
</doc>
-----------------
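When cutting a sample down like this, it can help to count the tokens and tabulate the values in each column. A minimal sketch follows, assuming whitespace-separated columns (word, lemma, POS) and treating any line that starts with `<` as a structural tag. Both are simplifying assumptions, and `column_frequencies` is a hypothetical helper, not part of CWB.

```python
from collections import Counter


def column_frequencies(lines, n_columns=3):
    """Count tokens and per-column value frequencies in vertical text."""
    counters = [Counter() for _ in range(n_columns)]
    n_tokens = 0
    for line in lines:
        line = line.strip()
        if not line or line.startswith("<"):
            continue  # blank line or XML-style structural tag (assumption)
        fields = line.split()
        # Pad short lines so a missing annotation is counted as "".
        fields += [""] * (n_columns - len(fields))
        for counter, value in zip(counters, fields):
            counter[value] += 1
        n_tokens += 1
    return n_tokens, counters


sample = ["<s>", "aut aut CC", "provocant provoco V:IND", "aut aut CC", "</s>"]
n_tokens, (words, lemmas, tags) = column_frequencies(sample)
print(n_tokens, tags.most_common())
```

Running this on the truncated file before and after removing the last word shows exactly which token and which attribute values change between the working and the failing version.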
These are the attributes I use to describe the corpus:
cat $SOURCEFILE | /usr/local/cwb-3.4.1/bin/cwb-encode -c utf8 -d $DATADIR \
    -R $REGDIR/$CORPUSNAME -xsB -P lema -P pos -V s -S doc:0+type+title \
    -S not:0+text
Thanks
Eva Bofias