[CWB] Huffman code error

BOFÍAS ALBERCH, EVA eva.bofias at upf.edu
Thu Oct 11 10:59:24 CEST 2012


Sorry, I forgot to mention my version.

Version:   3.0.2 (I downloaded it when 3.0 was not available)

When I searched, I saw reports of something similar, but there it happens
with very large corpora, whereas I get this error even with a really tiny one.

Thanks
Eva

Message: 2
Date: Wed, 10 Oct 2012 11:55:08 +0000
From: "Hardie, Andrew" <a.hardie at lancaster.ac.uk>
To: Open source development of the Corpus WorkBench
        <cwb at sslmit.unibo.it>
Subject: Re: [CWB] Huffman code error
Message-ID:
        <28078EC3FBF1B940A3EF3D0D19BE351D0D38F6 at EX-0-MB1.lancs.local>
Content-Type: text/plain; charset="iso-8859-1"

I have the feeling this bug has come up before - can you check your
version? (cqp -v)

thanks

Andrew.

2012/10/10 BOFÍAS ALBERCH, EVA <eva.bofias at upf.edu>

> Hi,
>
> I have an error that I am not able to solve. I'm trying to build a Latin
> corpus, but I get this error:
>
> Error: Huffman codes too long (32 bits, current maximum is 31 bits).
>        Please contact the CWB development team for assistance.
>
> I got this error when trying to build a 40-word corpus (I cut it down to see
> if I could isolate the error; with 39 words I do not get the error).
>
> -----------
> <doc type="CHRISTIAN_LATIN" title="Abelard">
> <s>
> PETRUS    Petrus    N:nom
> ABAELARDUS    UNKNOUN    ADJ
> (    (    PUN
> 1079-1142    card    ADJ:NUM
> )    )    PUN
> ABAELARDI    UNKNOUN   N:voc
> AD    UNKNOUN    N:abl
> AMICUM    amicus    ADJ
> SUUM    sus    N:gen
> CONSOLATORIA    consolatorius    ADJ
> Sepe    sepes    N:dat
> humanos    humanus    ADJ
> affectus    affectus    N:nom
> aut    aut    CC
> provocant    provoco    V:IND
> aut    aut    CC
> mittigant   mi    V:IND
> amplius    ample    ADV
> exempla    exemplum    N:nom
> quam    qui    REL
> verba    verbum    N:nom
> .    .    SENT
> </s>
> <s>
> Unde    unde    ADV
> post    post    PREP
> nonnullam    nonnullus    ADJ
> sermonis    sermo    N:gen
> ad    ad    PREP
> habiti  habeo    V:PTC
> consolationem    consolatio    N:acc
> ,    ,    PUN
> de    de    PREP
> ipsis    ipse    DET
> calamitatum    calamitas    N:gen
> mearum    meus    POSS
> experimentis    experimentum    N:abl
> </s>
> </doc>
>
> -----------------
> These are the attributes I use to describe the corpus:
>
> cat $SOURCEFILE | /usr/local/cwb-3.4.1/bin/cwb-encode -c utf8 -d $DATADIR \
>     -R $REGDIR/$CORPUSNAME -xsB -P lema -P pos -V s -S doc:0+type+title \
>     -S not:0+text
>
> Thanks
>
> Eva Bofias
>
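
For context on why a 32-bit code looks wrong for a corpus this small: the
sketch below computes textbook Huffman code lengths from token frequencies,
using the pos column of the sample above. It is only an illustration under
stated assumptions. The function name huffman_code_lengths is hypothetical,
and this is not CWB's own compression code, which may differ in its details.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a textbook Huffman code.

    Hypothetical helper for illustration; not taken from CWB.
    """
    if len(freqs) == 1:
        # A single symbol still needs one bit in this simple scheme.
        return {sym: 1 for sym in freqs}
    tiebreak = count()  # keeps heap entries comparable when frequencies tie
    heap = [(f, next(tiebreak), [sym]) for sym, f in freqs.items()]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freqs, 0)
    while len(heap) > 1:
        # Merge the two least frequent subtrees; every symbol inside
        # them ends up one level deeper in the code tree.
        f1, _, syms1 = heapq.heappop(heap)
        f2, _, syms2 = heapq.heappop(heap)
        for sym in syms1 + syms2:
            lengths[sym] += 1
        heapq.heappush(heap, (f1 + f2, next(tiebreak), syms1 + syms2))
    return lengths


# pos column of the sample corpus quoted in this thread (35 tokens)
pos_tags = (
    "N:nom ADJ PUN ADJ:NUM PUN N:voc N:abl ADJ N:gen ADJ N:dat ADJ N:nom "
    "CC V:IND CC V:IND ADV N:nom REL N:nom SENT ADV PREP ADJ N:gen PREP "
    "V:PTC N:acc PUN PREP DET N:gen POSS N:abl"
).split()

lengths = huffman_code_lengths(Counter(pos_tags))
print("distinct tags:", len(lengths))
print("longest code (bits):", max(lengths.values()))
```

With n distinct symbols the code tree cannot be deeper than n - 1, so a few
dozen tokens stay far below 32 bits under this scheme. That suggests the
reported message does not come from the frequency distribution of the sample
itself, though the sketch cannot say what actually triggers it in cwb-encode.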