[CWB] cwb development

Stefan Evert stefan.evert at uos.de
Fri Apr 14 23:10:24 CEST 2006


Lars Nygaard wrote:
> I have a suggestion for CWB documentation:
>
> * "CWB tutorial": ~8 pages on using CQP, ~2 pages on corpus encoding
>
Well, we've got the CQP query tutorial and the Corpus Encoding tutorial
(which needs to be filled in with a little more information). Taken
together, these two documents should be enough to get most people
started with the CWB. I think what you have in mind would be a kind of
"CWB quickstart guide" - definitely useful, perhaps something that you
or other users could contribute?

If anyone feels like doing this, you're welcome to copy bits & pieces
from the existing tutorials.  Once I get round to uploading the CWB to
SourceForge, I'll also put up the LaTeX sources of the tutorials.
> * A book on CWB, consisting of:
>   - documentation for CQP users (using your tutorial as a starting point)
>   - documentation for CWB administrators (using your tutorial as a
> starting point)
>   - a chapter on interfacing with other programs
>     - RDBMSs
>     - Perl
>     - CQi
>     - etc.
>   - a chapter on the technical background (data structures, algorithms
> etc).
>   - a chapter on the history of the project
>
> The chapters should be open sourced, just like the rest of the
> project, but I think we should at least try to earn some CV brownie
> points for our work by publishing it. If we can't get any publishers
> to accept it (though I find that unlikely), at least we have
> documentation!
>
Now that's a great idea! And it would perhaps be an excellent way to get
people involved and encourage them to write down their experience with
the CWB or ideas for future developments (in the form of chapters or
sections of the book). I'm sure Marco would advise us very strongly
against trying to edit a book, but as a collaborative online development
- without deadlines and with a totally anarchic structure that can have
as many chapters as people like to write - I can imagine it working out
and actually become useful very quickly (never mind if some of the
chapters are quirky or haven't been written at all, as long as one or
two helpful chapters are available).

How about putting the LaTeX source code up on SourceForge or another SVN
repository, so that contributors can edit the text and check in updates as
they like. We could somehow arrange that PDF and HTML versions of the
"book" are compiled automatically at regular intervals and made
available online. We'd just have to advise contributors very strongly to
check in modifications only when they have made sure that the entire
book still compiles without errors. How does that sound?
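
To make the idea a bit more concrete, here is a rough Python sketch of
what such a periodic build could look like. Everything in it is an
assumption for illustration: the file name cwb-book.tex, the function
names and the number of LaTeX passes are made up, and only the PDF step
is shown (no HTML converter has been chosen yet). The same check could
be run by contributors on their working copy before they check in.

```python
import subprocess


def build_commands(main_tex="cwb-book.tex", passes=2):
    """Return the command lines for one build run.

    main_tex is a hypothetical name for the book's top-level LaTeX file.
    """
    # Bring the working copy up to date without prompting for input.
    commands = [["svn", "update", "--non-interactive"]]
    # -halt-on-error makes pdflatex exit with a non-zero status at the
    # first error, so a broken check-in is detected right away.
    latex = ["pdflatex", "-interaction=nonstopmode", "-halt-on-error", main_tex]
    # Several passes are needed to resolve cross-references.
    commands.extend(list(latex) for _ in range(passes))
    return commands


def run_build(workdir, commands, runner=subprocess.run):
    """Run the commands in workdir; return True only if all succeed."""
    for cmd in commands:
        result = runner(cmd, cwd=workdir)
        if result.returncode != 0:
            # Stop at the first failure and keep the last good PDF online.
            return False
    return True
```

A cron job could call run_build("/path/to/checkout", build_commands())
at regular intervals and copy the resulting PDF to the web server only
when the function returns True.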

Another solution would be to set up a wiki, but I prefer to have
documentation that I can download and print, and in my experience that's
always rather tricky with current wiki implementations.
> I'd like to volunteer to have an initial stab at organizing the
> documentation and filling out some of the blank spots (though I would
> need your latex files ...).
That would be great! I'll try hard to release the CWB source code and
documentation sources before our developers meeting in May.
> Unfortunately I cannot attend CWBdev, but I will be happy to
> contribute by way of email and cvs ...
That's a pity, but it was to be expected that many potential
contributors wouldn't be able to pop down to Forli for a weekend. We'll
probably have a very small meeting, and it will be important that we
write down things - my presentations of CWB architecture, discussions
about future development, etc. - and circulate them on this mailing list
(and perhaps other lists as well).

Best wishes & a Happy Easter!
Stefan

