[CWB] cwb-align-encode runs out of memory
Hardie, Andrew
a.hardie at lancaster.ac.uk
Fri Jan 8 22:39:07 CET 2016
If you have a reasonably recent build, then there will be a full account of -D, -C etc. in man cwb-align-encode as well…
best
Andrew.
From: cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it] On Behalf Of Yannick Versley
Sent: 08 January 2016 21:02
To: Open source development of the Corpus WorkBench
Subject: Re: [CWB] cwb-align-encode runs out of memory
Hi Jörg,
this is just generic advice, but the last time I had a problem of that sort, running
the program under valgrind gave some very helpful information.
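As a rough illustration of such a run, here is a minimal sketch that wraps cwb-align-encode in valgrind's massif heap profiler. Choosing massif (rather than the default memcheck tool) is an assumption, as are the helper name massif_command, the output file name massif.out, and the alignment file name corpus.align; only the -D flag comes from this thread.

```python
import shlex
import subprocess


def massif_command(align_file, encode_flags=("-D",), out_file="massif.out"):
    """Build a valgrind command line that profiles cwb-align-encode's heap use.

    align_file and out_file are hypothetical names chosen for illustration.
    """
    return [
        "valgrind",
        "--tool=massif",                  # valgrind's heap profiler
        f"--massif-out-file={out_file}",  # fixed name instead of massif.out.<pid>
        "cwb-align-encode",
        *encode_flags,                    # flags passed through to cwb-align-encode
        align_file,
    ]


def profile(align_file):
    """Run the profiled command; needs valgrind and CWB on the PATH."""
    return subprocess.run(massif_command(align_file), check=False)


if __name__ == "__main__":
    # Print the command so it can be inspected before anything is run.
    print(shlex.join(massif_command("corpus.align")))
```

After an actual run, ms_print massif.out prints the recorded heap profile, which should show where the allocations pile up.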
In the source, "-D" is explained as "use the directory of the source corpus"; considering
that it writes files for an alignment attribute that sounds rather reasonable.
Also judging from the source, "-C" makes it write a .alg file whereas otherwise it will
only write a .alx file (these have different requirements regarding ordering and contiguity,
it seems, and the comments seem a bit more positive about the "new" .alx file format).
Best,
Yannick
On Fri, Jan 8, 2016 at 8:38 PM, Jörg Tiedemann <Jorg.Tiedemann at lingfil.uu.se> wrote:
Hi,
I have a large parallel corpus and I would like to add alignment information but cwb-align-encode seems to allocate a lot of memory and at some point it crashes. Is there any option to reduce memory consumption? What do the flags -D and -C do?
Thanks for helping!
Jörg
**********************************************************************************
Jörg Tiedemann
Department of Modern Languages http://www.helsinki.fi/~tiedeman/
University of Helsinki