[CWB] other kind of annotations in cwb corpus

Hardie, Andrew a.hardie at lancaster.ac.uk
Tue Feb 15 13:22:33 CET 2011


You can have as many or as few word-level annotations as you like and
they can be whatever you like. CWB treats them all alike. 

It's common to have a POS but it's not necessary. 
It's common to have a lemma but it's not necessary. 

Other than that, there are all manner of things that you might
conceivably have which are less commonly used but are entirely
possible (including semantic tags, phonetic transcription,
morphosyntactic ambiguity fields, discourse annotation...). You're
limited only by your own ability to generate the annotations in the
first place. CWB will quite happily process whatever annotations you
give it and will treat them all exactly the same.
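
As a rough illustration of "whatever annotations you give it", the
sketch below writes tokens in the one-token-per-line, tab-separated
("vertical") layout that cwb-encode reads. It is only a sketch under
assumptions: the helper name to_vertical, the attribute name "sem", the
tag values and all paths are made up for the example, and Python is just
one convenient choice of scripting language.

```python
# Minimal sketch: build CWB-style vertical text with arbitrary
# word-level annotations. Attribute names and values are hypothetical.

def to_vertical(sentences, attributes):
    """Return vertical text: one token per line, one tab-separated
    column per attribute, each sentence wrapped in <s>...</s>."""
    lines = []
    for sentence in sentences:
        lines.append("<s>")
        for token in sentence:
            # Column order follows the order of `attributes`.
            lines.append("\t".join(token[name] for name in attributes))
        lines.append("</s>")
    return "\n".join(lines) + "\n"


# Lemma plus one extra annotation, no POS column at all.
sentences = [
    [
        {"word": "Dogs", "lemma": "dog", "sem": "ANIMAL"},
        {"word": "bark", "lemma": "bark", "sem": "SOUND"},
    ],
]

vrt = to_vertical(sentences, ["word", "lemma", "sem"])
print(vrt, end="")

# The result could then be saved and encoded along these lines
# (paths are placeholders; the first column is the default "word"
# attribute, so only the extra columns are declared with -P):
#   cwb-encode -d data/mycorpus -f mycorpus.vrt -R registry/mycorpus \
#              -P lemma -P sem -S s
```

Adding or dropping an annotation is then just a matter of changing the
list of columns and the matching -P declarations.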

Best

Andrew.

> -----Original Message-----
> From: cwb-bounces at sslmit.unibo.it 
> [mailto:cwb-bounces at sslmit.unibo.it] On Behalf Of 
> luigi.talamo at libero.it
> Sent: 15 February 2011 12:13
> To: yversley at gmail.com; Open source development of the Corpus 
> WorkBench
> Subject: R: Re: [CWB] other kind of annotations in cwb corpus
> 
> 
>  Hi there! :)
> 
> yannick wrote:
> >The (conceptually) simpler way to do this would be to dump the whole
> >corpus (using cwb-decode), run your favorite tools on it to get a
> >version with the additional annotations, and then replace the old data
> >directory with the cwb-encode'd version of your new, enriched version
> >of the corpus.
> 
> Ok, I forgot to tell you that I'll probably start with a fresh corpus,
> i.e. a corpus which is not yet encoded in cwb (and probably lacks any
> sort of xml encoding).
> So, if I begin with, say, a bare txt, I'll only need to process it
> through cwb-encode, right?
> Which documentation should I read to prepare my collected data? In the
> 'alpha' version of my corpus I just want to have the additional
> annotation and the lemmatization: I can leave the pos tagging to
> further releases of the corpus.
> Is it possible to do that?
> 
> Thank you,
> 
> Luigi