[CWB] Corpus Design

Hardie, Andrew a.hardie at lancaster.ac.uk
Sun May 11 16:29:44 CEST 2008


Hi Lau,
 
May I recommend starting here:
 
http://www.ims.uni-stuttgart.de/projekte/CorpusWorkbench/CQPTutorial/html/cqp-tutorial.html
 
with particular attention to this page
 
http://www.ims.uni-stuttgart.de/projekte/CorpusWorkbench/CQPTutorial/html/node4.html
 
and then have a look here
 
http://www.ims.uni-stuttgart.de/projekte/CorpusWorkbench/CWBTutorial/cwb-tutorial.pdf
 
and then here
 
http://www.ims.uni-stuttgart.de/projekte/CorpusWorkbench/CQPUserManual/HTML/cqpman.html
 
(these are the links that I find myself returning to repeatedly).
 
My suggestion would be to start by getting a handle on the data model; then try indexing a single, small text without annotation. 
 
Once that works, try indexing a larger corpus with some annotation, for example POS tags and a few XML tags. Then try running some queries using the command-line CQP.
 
Only then will you be at the point where you can start to worry about web interfaces!
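
To make the data model a little more concrete, here is a minimal sketch in Python of the kind of "vertical" input I have in mind for the annotated indexing step: one token per line, with the word form and its POS tag separated by a tab, and XML tags on lines of their own. The function name, the tag names (text, s) and the sample sentences are purely illustrative assumptions, not anything prescribed by CWB; the linked tutorials describe the real format in detail.

```python
def to_vertical(sentences, text_id):
    """Render tagged sentences as one-token-per-line text.

    sentences: list of sentences, each a list of (word, pos) pairs.
    text_id:   illustrative identifier placed on the enclosing <text> tag.
    """
    lines = [f'<text id="{text_id}">']
    for sentence in sentences:
        lines.append("<s>")                    # structural annotation: sentence start
        for word, pos in sentence:
            lines.append(f"{word}\t{pos}")     # positional annotation: word TAB pos
        lines.append("</s>")                   # sentence end
    lines.append("</text>")
    return "\n".join(lines) + "\n"


# Hypothetical sample data, tagged by hand for illustration only.
sample = [
    [("The", "DT"), ("cat", "NN"), ("sleeps", "VBZ"), (".", ".")],
    [("It", "PRP"), ("purrs", "VBZ"), (".", ".")],
]

vertical = to_vertical(sample, "demo")
print(vertical, end="")
```

A file in this shape is what you would then hand to cwb-encode, declaring pos as a positional attribute and s and text as structural attributes, followed by cwb-makeall; please check the CWB tutorial above for the exact options. Once the corpus is indexed, a CQP query along the lines of [pos="NN"]; should find the nouns.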
 
best
 
Andrew.
 
 
 
Andrew Hardie
Department of Linguistics
Bowland College
Lancaster University
Lancaster LA1 4YT
 
a.hardie at lancaster.ac.uk
 

________________________________

From: cwb-bounces at sslmit.unibo.it on behalf of Lautenai Jr.
Sent: Sun 11/05/2008 15:02
To: cwb at sslmit.unibo.it
Subject: [CWB] Corpus Design


Dear all,

I am new to IMS CWB and to computational corpus development, but I am trying to make progress with both.
First of all, I would like a description of the procedures for compiling a corpus for indexing in IMS CWB (tokenisation, POS tagging and so on), in the order in which they should be carried out. Secondly, what is the process for indexing a corpus in IMS CWB and working with it through a web interface?
I have read a lot of documentation; all of it describes corpus processing in detail, but none of it lays the steps out in order.
As for computer competence, no worries, I can manage. However, I would like to know about the process and the tools I need to use.

Thanks, 

Lau


