[CWB] Huffman code error
BOFÍAS ALBERCH, EVA
eva.bofias at upf.edu
Thu Oct 11 10:59:24 CEST 2012
Sorry, I forgot to mention my version.
Version: 3.0.2 (I downloaded it when 3.0 was not available)
When I searched I saw something similar, but it happened with very
large corpora, whereas I get this error even with a really tiny one.
Thanks
Eva
Message: 2
Date: Wed, 10 Oct 2012 11:55:08 +0000
From: "Hardie, Andrew" <a.hardie at lancaster.ac.uk>
To: Open source development of the Corpus WorkBench
<cwb at sslmit.unibo.it>
Subject: Re: [CWB] Huffman code error
I have the feeling this bug has come up before - can you check your
version? (cqp -v)
thanks
Andrew.
2012/10/10 BOFÍAS ALBERCH, EVA <eva.bofias at upf.edu>
> Hi,
>
> I have an error that I am not able to solve. I'm trying to build a Latin
> corpus but I get this error:
>
> Error: Huffman codes too long (32 bits, current maximum is 31 bits).
> Please contact the CWB development team for assistance.
>
> I got this error when trying to build a 40-word corpus (I cut it down to see
> if I could detect the error; with 39 words I do not get the error)
>
> -----------
> <doc type="CHRISTIAN_LATIN" title="Abelard">
> <s>
> PETRUS Petrus N:nom
> ABAELARDUS UNKNOUN ADJ
> ( ( PUN
> 1079-1142 card ADJ:NUM
> ) ) PUN
> ABAELARDI UNKNOUN N:voc
> AD UNKNOUN N:abl
> AMICUM amicus ADJ
> SUUM sus N:gen
> CONSOLATORIA consolatorius ADJ
> Sepe sepes N:dat
> humanos humanus ADJ
> affectus affectus N:nom
> aut aut CC
> provocant provoco V:IND
> aut aut CC
> mittigant mi V:IND
> amplius ample ADV
> exempla exemplum N:nom
> quam qui REL
> verba verbum N:nom
> . . SENT
> </s>
> <s>
> Unde unde ADV
> post post PREP
> nonnullam nonnullus ADJ
> sermonis sermo N:gen
> ad ad PREP
> habiti habeo V:PTC
> consolationem consolatio N:acc
> , , PUN
> de de PREP
> ipsis ipse DET
> calamitatum calamitas N:gen
> mearum meus POSS
> experimentis experimentum N:abl
> </s>
> </doc>
>
> -----------------
> These are the attributes I use to describe the corpus:
>
> cat $SOURCEFILE | /usr/local/cwb-3.4.1/bin/cwb-encode -c utf8 -d $DATADIR
> -R $REGDIR/$CORPUSNAME -xsB -P lema -P pos -V s -S doc:0+type+title -S
> not:0+text
>
> Thanks
>
> Eva Bofias
>
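
As a rough sanity check on the error message quoted above, here is a minimal sketch of textbook Huffman coding applied to the pos column of the sample. It is not CWB's implementation, and the function name huffman_code_lengths is made up for this illustration. A Huffman code over n distinct symbols can never be longer than n-1 bits, and the sample has only 18 distinct tags, so this column alone should not need anything close to 32 bits. That would be consistent with the error being a bug rather than a real property of the data, but it is only an assumption.

```python
import heapq
from collections import Counter
from itertools import count

# Third column (pos) of the 35 tokens in the sample from the quoted message.
POS_TAGS = """N:nom ADJ PUN ADJ:NUM PUN N:voc N:abl ADJ N:gen ADJ N:dat ADJ
N:nom CC V:IND CC V:IND ADV N:nom REL N:nom SENT
ADV PREP ADJ N:gen PREP V:PTC N:acc PUN PREP DET N:gen POSS N:abl""".split()


def huffman_code_lengths(tokens):
    """Return {symbol: code length in bits} for a textbook Huffman code.

    Illustrative helper only; this is not how CWB computes its codes.
    """
    freqs = Counter(tokens)
    if len(freqs) == 1:
        # A single symbol still needs one bit in practice.
        return {next(iter(freqs)): 1}
    tiebreak = count()  # keeps heap entries comparable when weights tie
    # Each heap entry: (weight, tiebreak, {symbol: depth within this subtree})
    heap = [(w, next(tiebreak), {sym: 0}) for sym, w in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: d + 1 for s, d in d1.items()}
        merged.update({s: d + 1 for s, d in d2.items()})
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))
    return heap[0][2]


lengths = huffman_code_lengths(POS_TAGS)
print("distinct tags:", len(lengths))
print("longest code (bits):", max(lengths.values()))
```

Very long codes require a strongly skewed frequency distribution, which is why this kind of limit is normally associated with very large corpora.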