R: Re: [CWB] other kind of annotations in cwb corpus
luigi.talamo at libero.it
Tue Feb 15 13:13:18 CET 2011
Hi there! :)
yannick wrote:
>The (conceptually) simpler way to do this would be to dump the whole corpus
>(using cwb-decode), run your favorite tools on it to get a version with the
>additional annotations, and then replace the old data directory with the
>cwb-encode'd version of your new, enriched version of the corpus.
Ok, I forgot to tell you that I'll probably start with a fresh corpus, i.e. a
corpus that is not yet encoded in CWB (and probably lacks any sort of XML
encoding).
So, if I begin with, say, a bare text file, I'll only need to process it
through cwb-encode, right?
Which documentation should I read to prepare my collected data? In the 'alpha'
version of my corpus I just want to have the additional annotation and the
lemmatization; I can leave the POS tagging to later releases of the corpus.
Is it possible to do that?
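To make the question concrete, here is a minimal sketch of the preparation step as I understand it, assuming cwb-encode reads verticalized text (one token per line, with tab-separated columns for word form and lemma). The regex tokenizer, the LEMMAS table, and the function name to_vertical are hypothetical placeholders of mine, not part of CWB; a real lemmatizer would replace the lookup.

```python
import re

# Hypothetical toy lemma table (an assumption for illustration only);
# a real lemmatizer would supply these values.
LEMMAS = {"corpora": "corpus", "are": "be", "encoded": "encode"}


def to_vertical(text, lemmas=LEMMAS):
    """Return text as one token per line in the form word<TAB>lemma."""
    # Naive tokenizer: runs of word characters, or single punctuation marks.
    tokens = re.findall(r"\w+|[^\w\s]", text)
    lines = []
    for tok in tokens:
        # Fall back to the lowercased word form when no lemma is known.
        lemma = lemmas.get(tok.lower(), tok.lower())
        lines.append(f"{tok}\t{lemma}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_vertical("Corpora are encoded."), end="")
```

If this is right, I suppose the second column would then be declared to cwb-encode as a positional attribute (something like -P lemma), with no POS column at all for now.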
Thank you,
Luigi