[CWB] Manatee

Manuel Kountz kountzml at ims.uni-stuttgart.de
Fri Feb 2 19:05:22 CET 2007


Stefan Evert wrote:
> Hi Serge & all!

> I haven't tried compiling it and encoding a corpus, since there isn't 
> even a short readme that would explain how to do this.  If anyone has 
> managed to get it to run and play around with it, I'd be very interested 
> to hear about their impressions.
>
Not quite "play around", but I managed to install it (even into a custom 
location, which means that their configure script gets --prefix right) 
and encode a corpus. At least, the included "encodevert" binary tells me 
that the encoding succeeded. So far I haven't got that bonito thingy to 
work, though.

"encodevert" has a pitfall of looking into its default corpus directory 
(or registry, I'm not quite sure which is which) first if it is given a 
corpus name on the command line, instead of looking into the directory 
supplied with -p. The file I encoded was tagged using Helmut Schmid's 
TreeTagger (thus vertical text with word/POS/lemma columns and 
<s>...</s> sentence marks interspersed).
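
For anyone who hasn't seen this kind of input: below is a minimal Python 
sketch of reading such a vertical file. It assumes tab-separated 
word/POS/lemma columns and <s> / </s> markers on lines of their own. The 
name read_vertical and the sample tokens are made up for illustration; 
none of this is part of manatee or TreeTagger.

```python
from typing import Iterable, List, NamedTuple


class Token(NamedTuple):
    word: str
    pos: str
    lemma: str


def read_vertical(lines: Iterable[str]) -> List[List[Token]]:
    """Group one-token-per-line input into sentences.

    Assumption: each token line has exactly three tab-separated fields
    (word, POS, lemma); <s> and </s> stand alone on their own lines.
    """
    sentences: List[List[Token]] = []
    current: List[Token] = []
    for raw in lines:
        line = raw.strip("\r\n")
        if not line.strip():
            continue  # skip empty lines
        if line.strip() == "<s>":
            current = []  # start a new sentence
        elif line.strip() == "</s>":
            sentences.append(current)  # close the current sentence
            current = []
        else:
            fields = line.split("\t")
            if len(fields) != 3:
                raise ValueError(f"expected 3 columns, got: {line!r}")
            current.append(Token(*fields))
    if current:
        # tokens after the last </s> (or a file without markers)
        sentences.append(current)
    return sentences


# Hypothetical sample, not taken from a real corpus.
SAMPLE = (
    "<s>\n"
    "Das\tART\tdie\n"
    "ist\tVAFIN\tsein\n"
    "</s>\n"
    "<s>\n"
    "Gut\tADJD\tgut\n"
    "</s>\n"
)

sentences = read_vertical(SAMPLE.splitlines())
```

Running a check like this over the file before encoding should catch 
lines with a missing column early.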

"manateesrv" takes a corpus directory (again I'm not sure what exactly 
it wants) as its argument and then reads commands from standard input. A 
"reference" for the *server* commands is src/manateesrv.cc itself, in 
the CorpusManager::run method. I didn't even manage to get it to run a 
simple query; perhaps I'll have a look into bonito's guts sometime.

> ...  It's definitely a much 
> better piece of software engineering than the CWB and has fewer built-in 
> limitations (unless you count having to install a suitable version of 
> ICU in order to compile it as a liability).
> 

There are some glitches in the configure scripts: one has to compile 
either ICU or manatee itself with a suitable PCRE library around, even 
though some README or "configure --help" suggests PCRE is optional. And 
then, well... there are many things hard-wired, and many things seem to 
be coded in a way that looks quite ad hoc to me.

So much for my (admittedly quite limited) manatee experiences; I hope 
that this is somewhat useful for you. Maybe I missed a lot of things 
available on the web or elsewhere, so this is not truly "definitive" ;*)

Best,

    Manuel

