[CWB] Manatee
Manuel Kountz
kountzml at ims.uni-stuttgart.de
Fri Feb 2 19:05:22 CET 2007
Stefan Evert wrote:
> Hi Serge & all!
> I haven't tried compiling it and encoding a corpus, since there isn't
> even a short readme that would explain how to do this. If anyone has
> managed to get it to run and play around with it, I'd be very interested
> to hear about their impressions.
>
Not quite "play around", but I managed to install it (even into a custom
location, which means that their configure script gets --prefix right)
and encode a corpus. At least, the included "encodevert" binary tells me
that encoding succeeded; so far I haven't got that bonito thingy to
work.
"encodevert" has a pitfall: if it is given a corpus name on the command
line, it looks in its default corpus directory (or registry, I'm not
quite sure which is which) first, instead of in the directory supplied
with -p. The file I encoded was tagged using Helmut Schmid's TreeTagger
(thus vertical text with word/POS/lemma columns; it had <s>...</s>
sentence marks interspersed).
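For anyone who hasn't seen that format: below is a minimal sketch of
reading such a vertical file, assuming tab-separated word/POS/lemma
columns and <s> / </s> each on a line of their own. The function name
read_vertical and the sample data are made up for illustration; neither
is part of manatee or TreeTagger, and this says nothing about how
encodevert itself parses its input.

```python
import io

# Made-up sample in the vertical layout described above:
# one token per line, columns word<TAB>POS<TAB>lemma,
# with <s> and </s> marking sentence boundaries.
SAMPLE = (
    "<s>\n"
    "The\tDT\tthe\n"
    "cat\tNN\tcat\n"
    "sleeps\tVBZ\tsleep\n"
    ".\tSENT\t.\n"
    "</s>\n"
)


def read_vertical(stream):
    """Yield each sentence as a list of (word, pos, lemma) tuples."""
    sentence = []
    for raw in stream:
        line = raw.rstrip("\n")
        if not line:
            continue  # skip blank lines
        if line == "<s>":
            sentence = []  # start of a new sentence
        elif line == "</s>":
            yield sentence  # end of sentence: hand it to the caller
            sentence = []
        else:
            word, pos, lemma = line.split("\t")
            sentence.append((word, pos, lemma))


sentences = list(read_vertical(io.StringIO(SAMPLE)))
```

A quick look at sentences[0] is a cheap sanity check that the columns
are in the expected order before feeding the file to an encoder.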
"manateesrv" takes a corpus directory (again I'm not sure what exactly
it wants) as argument and then reads commands from standard input. A
"reference" for the *server* commands is src/manateesrv.cc itself, in
the CorpusManager::run method. I didn't even manage to get it to run a
simple query; perhaps I'll have a look into bonito's guts sometime.
> ... It's definitely a much
> better piece of software engineering than the CWB and has fewer built-in
> limitations (unless you count having to install a suitable version of
> ICU in order to compile it as a liability).
>
There are some glitches in the configure scripts; one has to compile
either ICU or manatee itself with a suitable PCRE library around, even
though some README or "configure --help" suggests PCRE is optional. And
then, well... many things are hard-wired, and many things seem to be
coded in a way that looks quite ad hoc to me.
So much for my (admittedly quite limited) manatee experiences; I hope
this is somewhat useful for you. Maybe I missed a lot of things
available on the web or elsewhere, so this is not truly "definitive" ;*)
Best,
Manuel