[CWB] Re: Corpus size and filtering

Hardie, Andrew a.hardie at lancaster.ac.uk
Thu Apr 10 16:50:09 CEST 2008


Hi Rui,

BNCweb is pretty much entirely bound to the BNC, as aspects of the
BNC's structure, markup etc. are hard-coded into its scripts.
However, recently I have been working on creating a clone of BNCweb that
can be used as a front-end for CQP with any corpus. It's not anywhere
near ready for release yet, but the bit that handles subcorpora is
reasonably well developed. So you're welcome to make use of it if you
wish, although the setup takes a bit of work. Email me off list for
details if you're interested.

best

Andrew.

Andrew Hardie
Department of Linguistics and English Language
Bowland College, Lancaster University
Lancaster LA1 4YT
United Kingdom
 
a.hardie at lancaster.ac.uk 



-----Original Message-----
From: cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it]
On Behalf Of Rui Chaves
Sent: 10 April 2008 15:03
To: cwb at sslmit.unibo.it
Subject: [CWB] Re: Corpus size and filtering

Hi,
thanks for all your precious help.

The BNCweb approach that you mention (constructing a SQL query from
the metadata restrictions, retrieving a list of matching texts from
its MySQL database, and then running the CQP query on the
corresponding subcorpus) is exactly what we intend to do. We would
love to try this using perhaps the BNCweb (CQP edition), but it is not
clear to us how BNCweb (CQP edition) can be obtained and what issues
are raised when using a corpus other than the BNC. Is there a manual
online that we can consult?
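
The three steps described above (metadata restrictions to SQL, SQL to a
list of text IDs, text IDs to a CQP subcorpus) can be sketched roughly as
follows. This is only an illustration under stated assumptions, not how
BNCweb itself is implemented: the table name text_metadata, its columns
(text_id, genre, year), the helper names, and the s-attribute text_id are
all hypothetical, and sqlite3 stands in for MySQL so that the sketch runs
without a database server. The CQP command strings follow our reading of
the CQP query syntax and should be checked against the documentation for
the installed CWB version.

```python
import re
import sqlite3

# Hypothetical metadata fields; column names cannot be bound as SQL
# parameters, so they are checked against this allowlist instead.
ALLOWED_COLUMNS = {"genre", "year"}


def demo_connection():
    """Build a tiny in-memory metadata database (sqlite3 stands in for MySQL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE text_metadata (text_id TEXT, genre TEXT, year INTEGER)")
    conn.executemany(
        "INSERT INTO text_metadata VALUES (?, ?, ?)",
        [("A01", "news", 1990), ("A02", "fiction", 1990), ("B01", "news", 1991)],
    )
    return conn


def matching_text_ids(conn, restrictions):
    """Return the sorted IDs of texts whose metadata satisfy every restriction."""
    for column in restrictions:
        if column not in ALLOWED_COLUMNS:
            raise ValueError(f"unknown metadata field: {column}")
    columns = sorted(restrictions)
    sql = "SELECT text_id FROM text_metadata"
    if columns:
        # Values are passed as bound parameters, never interpolated.
        sql += " WHERE " + " AND ".join(f"{c} = ?" for c in columns)
    sql += " ORDER BY text_id"
    rows = conn.execute(sql, [restrictions[c] for c in columns])
    return [row[0] for row in rows]


def cqp_commands(text_ids, query, name="Sub"):
    """Build CQP commands that define a subcorpus of the given texts,
    activate it, and run the query inside it (assumed syntax)."""
    if not text_ids:
        raise ValueError("no texts match the metadata restrictions")
    alternation = "|".join(re.escape(t) for t in text_ids)
    return (
        f'{name} = <text_id = "{alternation}"> [] expand to text;\n'
        f"{name};\n"
        f"Hits = {query};\n"
    )


if __name__ == "__main__":
    ids = matching_text_ids(demo_connection(), {"genre": "news"})
    print(cqp_commands(ids, '[word = "corpus"]'))
```

With a 400M corpus the list of matching texts may be long, so in
practice it may be preferable to hand the selected regions to CQP in
some other way than one large regular expression; the sketch ignores
that question.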

There is a Corpus Encoding Tutorial for CWB available online at
http://www.ims.uni-stuttgart.de/projekte/CorpusWorkbench/CWBTutorial/cwb-tutorial.pdf
but it seems to be a draft from 2002. Is there a more recent version
of this? It would be great if these were part of the sourceforge
package, along with the code.

Also, what sort of computer -- with regard to hardware -- would be
advisable for running the above web interface for a corpus with 400M?


Many thanks,
Rui
_______________________________________________
CWB mailing list
CWB at sslmit.unibo.it
http://devel.sslmit.unibo.it/mailman/listinfo/cwb

