[CWB] Appending text to an existing corpus

Nik cqplist at nikvdp.com
Thu Nov 8 16:04:36 CET 2012


I was afraid it would be something like that. Modifying the interface to
query multiple corpora at once is worth a shot; I'll give that a try.
Thanks for the help, guys.


On Thu, Nov 8, 2012 at 5:56 PM, Hardie, Andrew <a.hardie at lancaster.ac.uk> wrote:

> And now I see Stefan had already replied, in greater detail and more
> helpfully. Ooops!
>
> Andrew.
>
>
>
> Nik <cqplist at nikvdp.com> wrote:
>
>
> Hi all,
> I have a pretty simple question: is there any way to append text to an
> existing corpus?
>
> We're working on a corpus based on data collected from a webcrawler and
> would like to periodically update the corpus with new data from the
> crawler. From the documentation I found info on how to add annotations to
> existing corpora etc., but I can't find anything about simply appending new
> data to an existing corpus.
>
> Decoding the entire corpus, adding the new data to the generated file
> and re-encoding the new file is an option, but the server we're running on
> isn't exactly fast. Any way to save a few CPU cycles and directly insert
> the new data into the existing corpus? Perhaps there's some functionality
> to combine two corpora into one?
>
> Thanks,
> Nik