R: Re: [CWB] other kind of annotations in cwb corpus

luigi.talamo at libero.it luigi.talamo at libero.it
Tue Feb 15 13:13:18 CET 2011


 Hi there! :)

yannick wrote:
>The (conceptually) simpler way to do this would be to dump the whole corpus
>(using cwb-decode), run your favorite tools on it to get a version with the
>additional annotations, and then replace the old data directory with the
>cwb-encode'd version of your new, enriched version of the corpus.

OK, I forgot to tell you that I'll probably start with a fresh corpus, i.e. a
corpus which is not yet encoded in cwb (and probably lacks any sort of XML
encoding).
So if I begin with, say, a bare txt file, I'll only need to process it through
cwb-encode, right?
Which documentation should I read to prepare my collected data? In the 'alpha'
version of my corpus I just want to have the additional annotation and the
lemmatization; I can leave the POS tagging to later releases of the corpus.
Is it possible to do that?
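
To make the question concrete, here is a minimal sketch of the preparation step I have in mind. It assumes that cwb-encode reads "verticalized" text: one token per line, with further annotations in tab-separated columns and XML tags on lines of their own. The functions lemmatize() and annotate() are hypothetical placeholders for whatever external tools produce the lemma and the additional annotation, and the tokenizer is deliberately naive.

```python
import re

def lemmatize(token):
    # Hypothetical placeholder: a real lemmatizer would go here.
    return token.lower()

def annotate(token):
    # Hypothetical placeholder for the additional annotation;
    # "_" marks "no value" for this token.
    return "_"

def to_vertical(text, text_id="t1"):
    """Turn plain text into one-token-per-line, tab-separated columns:
    word <TAB> lemma <TAB> additional annotation."""
    lines = ['<text id="%s">' % text_id]
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        if not sentence:
            continue
        lines.append("<s>")
        # Naive tokenization: runs of word characters, or single punctuation marks.
        for token in re.findall(r"\w+|[^\w\s]", sentence):
            lines.append("\t".join([token, lemmatize(token), annotate(token)]))
        lines.append("</s>")
    lines.append("</text>")
    return "\n".join(lines) + "\n"

if __name__ == "__main__":
    print(to_vertical("Il gatto dorme. Bene!"), end="")
```

If that is right, the resulting file would then be passed to cwb-encode with one -P declaration per extra column (here a lemma column and an annotation column, with no POS column at all) and one -S declaration per XML element; the column names and their order are my own choice in this sketch.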

Thank you,

Luigi







