[CWB] cwb-align-encode runs out of memory

Hardie, Andrew a.hardie at lancaster.ac.uk
Fri Jan 8 22:39:07 CET 2016


If you have a reasonably recent build, there will also be a full account of -D, -C, etc. in man cwb-align-encode.

best

Andrew.

From: cwb-bounces at sslmit.unibo.it On Behalf Of Yannick Versley
Sent: 08 January 2016 21:02
To: Open source development of the Corpus WorkBench
Subject: Re: [CWB] cwb-align-encode runs out of memory

Hi Jörg,

this is just generic advice, but the last time I had a problem of that sort, running
the program under valgrind gave some very helpful information.

In the source, "-D" is explained as "use the directory of the source corpus"; considering
that it writes files for an alignment attribute, that sounds rather reasonable.
Also judging from the source, "-C" makes it write a .alg file, whereas otherwise it will
only write a .alx file. The two formats seem to have different requirements regarding
ordering and contiguity, and the comments seem a bit more positive about the "new" .alx
file format.
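
The valgrind suggestion above could look roughly like the sketch below. It is only an illustration under assumptions: the alignment file name is hypothetical, the run() helper is made up for this example, and -D is used as described in this thread (check man cwb-align-encode for your build). By default the script only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Hypothetical alignment file name (not from the thread); substitute your own.
ALIGN_FILE="${ALIGN_FILE:-source-target.align}"

# Illustrative helper: print each command by default,
# or execute it when DRY_RUN=0 is set in the environment.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# 1. Heap profile: massif records how much memory is allocated over time and where.
run valgrind --tool=massif cwb-align-encode -D "$ALIGN_FILE"

# 2. Memcheck run: reports invalid reads/writes and leaked blocks.
run valgrind --leak-check=full cwb-align-encode -D "$ALIGN_FILE"
```

A massif run writes a massif.out.<pid> file, which ms_print can summarise. Expect the program to run much more slowly under valgrind, so a smaller sample of the alignment file may be more practical for a first attempt.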

Best,
Yannick

On Fri, Jan 8, 2016 at 8:38 PM, Jörg Tiedemann <Jorg.Tiedemann at lingfil.uu.se> wrote:
Hi,

I have a large parallel corpus and would like to add alignment information, but cwb-align-encode seems to allocate a lot of memory and eventually crashes. Is there any option to reduce memory consumption? And what do the flags -D and -C do?

Thanks for helping!

Jörg

**********************************************************************************
Jörg Tiedemann
Department of Modern Languages             http://www.helsinki.fi/~tiedeman/
University of Helsinki

