No subject
Fri Oct 19 03:52:45 CEST 2012
orpora etc., but I can't find anything about simply appending new data to an existing corpus.
Decoding the entire corpus, adding the new data to the generated file and re-encoding the new file is an option, but the server we're running on isn't exactly fast. Any way to save a few CPU cycles and directly insert the new data into the existing corpus? Perhaps there's some functionality to combine two corpora into one?
Thanks,
Nik
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body>
And now I see Stefan had already replied, in greater detail and more helpfully. Ooops!
<div><br>
</div>
<div>Andrew.</div>
<br>
<br>
<br>
Nik <cqplist at nikvdp.com> wrote:<br>
<br>
<br>
<div>Hi all,
<div>I have a pretty simple question: is there any way to append text to an existing corpus?</div>
<div><br>
</div>
<div>We're working on a corpus based on data collected from a webcrawler and would like to periodically update the corpus with new data from the crawler. From the documentation I found info on how to add annotations to existing corpora etc., but I can't find anything about simply appending new data to an existing corpus.</div>
<div><br>
</div>
<div>Decoding the entire corpus, adding the new data to the generated file and re-encoding the new file is an option, but the server we're running on isn't exactly fast. Any way to save a few CPU cycles and directly insert the new data into the existing corpus? Perhaps there's some functionality to combine two corpora into one?</div>
<div><br>
</div>
<div>Thanks,</div>
<div>Nik</div>
</div>
</body>
</html>
More information about the CWB mailing list