[CWB] CWB
Stefan Evert
stefan.evert at uos.de
Fri Feb 2 16:11:25 CET 2007
Dear Cobus,
thanks for your interest in the CWB and your willingness to
contribute to further development! Uli Heid has just forwarded your
e-mail to me as well.
The answer to your first question is simple: mea culpa! I still
haven't found the time to clean up the CWB source code for release
(which includes writing a few minimal bits of documentation and
inserting copyright/license notices) and upload it to sourceforge
(I'm not very experienced with sf.net either, so it will probably
take me some time to get the SVN upload right).
The release is now rather firmly scheduled for Feb 20 or so, since I
will have some time to focus on this task from around Feb 12.
Concerning your second question, in principle it should be possible
to compile and use the CWB in a Windows environment, although some
adjustments may be necessary. Most CWB users run some flavour of
Unix or have a dedicated Linux server for their CWB installation, but
we have experimented a little with CWB on Windows using the Cygwin
emulation layer. Basically, it seems to compile and run, but
performance is very poor. Insofar as we can guess at the moment,
this may have to do with CWB's reliance on memory-mapping to access
data files efficiently and with the way memory-mapping is emulated by
Cygwin.
It would be great to have someone work on a native Windows port. As
far as I understand, Win32 _does_ support memory-mapping in an
efficient way, so I suspect it's just the Cygwin implementation that
causes our speed problems, and the CWB should run much better when
compiled in a standard Windows environment. The port should not be
too difficult, since CWB doesn't rely very heavily on Unix
technology. Basically, it needs a standard C compiler (GCC
recommended), POSIX-compatible system libraries (including, most
importantly, the mmap() function for memory-mapping files), and a few
standard Unix command-line utilities (such as "less" or "gzip"),
which are also available for Windows or could be replaced by built-in
implementations.
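To illustrate the memory-mapping point, here is a minimal sketch of how a read-only file mapping could be wrapped so that the same call works with POSIX mmap() and with the native Win32 API (CreateFileMapping / MapViewOfFile). It is not taken from the CWB source code; the names mapped_file, map_file_readonly and unmap_file are hypothetical and chosen only for this example, and error handling is deliberately simple.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>
#else
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#endif

/* Hypothetical handle for a read-only mapped file (not a CWB type). */
typedef struct {
    const void *data;   /* start of the mapped region, NULL if unmapped */
    size_t size;        /* length of the mapping in bytes */
#ifdef _WIN32
    HANDLE file;        /* Win32 keeps file and mapping handles separately */
    HANDLE mapping;
#endif
} mapped_file;

/* Map a whole file read-only. Returns 0 on success, -1 on failure.
   Empty files are treated as a failure, since they cannot be mapped. */
int map_file_readonly(const char *path, mapped_file *mf)
{
    assert(path != NULL && mf != NULL);
    mf->data = NULL;
    mf->size = 0;
#ifdef _WIN32
    LARGE_INTEGER len;
    mf->file = CreateFileA(path, GENERIC_READ, FILE_SHARE_READ, NULL,
                           OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (mf->file == INVALID_HANDLE_VALUE) {
        fprintf(stderr, "cannot open %s\n", path);
        return -1;
    }
    if (!GetFileSizeEx(mf->file, &len) || len.QuadPart == 0) {
        CloseHandle(mf->file);
        return -1;
    }
    mf->mapping = CreateFileMappingA(mf->file, NULL, PAGE_READONLY, 0, 0, NULL);
    if (mf->mapping == NULL) {
        CloseHandle(mf->file);
        return -1;
    }
    mf->data = MapViewOfFile(mf->mapping, FILE_MAP_READ, 0, 0, 0);
    if (mf->data == NULL) {
        CloseHandle(mf->mapping);
        CloseHandle(mf->file);
        return -1;
    }
    mf->size = (size_t)len.QuadPart;
    return 0;
#else
    struct stat st;
    void *p;
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        fprintf(stderr, "cannot open %s\n", path);
        return -1;
    }
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }
    p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);          /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return -1;
    mf->data = p;
    mf->size = (size_t)st.st_size;
    return 0;
#endif
}

/* Release a mapping created by map_file_readonly(); safe to call twice. */
void unmap_file(mapped_file *mf)
{
    if (mf == NULL || mf->data == NULL)
        return;
#ifdef _WIN32
    UnmapViewOfFile(mf->data);
    CloseHandle(mf->mapping);
    CloseHandle(mf->file);
#else
    munmap((void *)mf->data, mf->size);
#endif
    mf->data = NULL;
    mf->size = 0;
}
```

With a wrapper of this kind, code that reads the data files only sees a pointer and a length, so the platform-specific part of a port would stay confined to one small module.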
I'll announce availability of the source code on this mailing list!
Best regards,
Stefan Evert
On 1 Feb 2007, at 09:40, Cobus Conradie wrote:
> I'm writing on behalf of the Center for Text Technology at the
> North-West University, South Africa. We have quite a few projects
> lined up for the following three years mainly on African languages
> and need to get these Corpora in some kind of structure for
> reference and manipulation. The Corpus WorkBench seems to be just
> the thing. Unfortunately I’m from the windows world and know very
> little of Sourceforge.net.
>
> We would like to contribute in the form of development and
> participation in the project. First question is, I can’t seem to
> find the source for the project. I went to
> http://cwb.svn.sourceforge.net/viewvc/cwb/ but can’t find anything there.
> The second question is if there is a Windows version of this
> product or at least the possibility of porting it to a Windows
> environment? (without rewriting the whole thing)
> Hope to hear from you soon.
>
> Regards
>