[CWB] Re: [Corpora-List] corpus software

Stefan Evert stefanML at collocations.de
Fri Apr 23 21:30:31 CEST 2010


Dear corpora subscribers,

I'd like to use this opportunity to promote our public beta testing  
programme for the Open Corpus Workbench (CWB).

> In Menota (as in all corpora whose development I have been involved
> in), the Corpus Linguist Workbench (CLW/CQP) from Univ. of
> Stuttgart is the standard choice of corpus search system.  However,
> CLW/CQP is old and has only been maintained, not developed, over the
> last 10 years (I know about the open corpus workbench initiative).

That's not quite true, even though progress has admittedly been slow  
and sporadic, and the official release of version 3.0 is more than 10  
years late by now ... :-}

However, many bug fixes and new features have been added to the CWB
during this time, and since 2008 there have been 64-bit versions for
Linux and Mac OS X that can handle corpora of up to 2 billion tokens.

> For example, the Unicode support is meager.

We[1] are currently working on two new versions of the CWB, even  
though 3.0 has not _quite_ been released yet:

   v3.1 -- native Windows port based on work by the Textometrie project

   v3.2 -- full Unicode (UTF-8) support

Version 3.1 is ready for public beta testing, so we would like to ask  
any CWB users who are interested in the Windows platform (or have some  
time to spare and access to a Windows machine) to play around with it  
and discover all the bugs we haven't found yet.  Version 3.2 will  
follow soon (possibly in a less mature alpha release, so that we can  
test each new feature as it's added).

If you're interested in becoming a beta tester for the CWB, follow the  
instructions on this page:

	http://cwb.sourceforge.net/beta.php

Best regards, and thanks in advance for helping us!
Stefan Evert & Andrew Hardie



[1] That is, Andrew Hardie is doing all the hard work, while I'm  
playing supervisor and giving instructions. :-)



