[CWB] Cygwin

Eros Zanchetta eros at sslmit.unibo.it
Fri Dec 5 00:12:29 CET 2008


Stefan Evert wrote:
> Hi Eros!

Hi Stefan! ;-)

>> I just installed the latest version of CWB in my cygwin environment.
>
> That's great to hear! Can you tell us (put on the wiki etc.) how you
> got CQP to work in Cygwin?  Is there a new version of Cygwin or
> particular configuration tricks?

Yes, I was surprised too; I hadn't tried to run CQP in Cygwin in a long
time. I simply followed the instructions on the CQP wiki (it looks like
somebody updated the tutorial we wrote when I was in Osnabrück, but I
always assumed it was you), compiled it and tested it with ITWAC3-01.
I didn't even re-index or copy the corpus: I simply soft-linked it
from the ext3 partition on my home PC, and everything worked smoothly.

In all fairness, I didn't test it thoroughly: I simply threw a few
rather expensive queries at CQP expecting a disaster, and instead they
all ran without trouble.

> I remember when you were staying at Osnabrück, we got CQP to compile
> in Cygwin, but it was dead slow and would quickly run out of memory.
> That's still my current status when I tried within VirtualBox (Windows
> XP + Cygwin).  What system setup do you use?

I tested the setup on a three-year-old Windows XP SP3 box (Athlon 64 X2
3800, 2GB of RAM) with the latest Cygwin, so nothing special really.
Maybe they just improved Cygwin's memory management; I don't know.

> Yes, since it's used after drive letters (C: and all that), that's
> hardly surprising.  I would have expected Cygwin to be a little more
> intelligent about this, though ...

It's probably very low on their list of priorities, or it simply never
came up (who needs colons in filenames, after all?).

> The ":" separator is hard-coded into CQP ... in many different
> places.  Most of the relevant code is in cqp/corpmanag.c, and there's
> a temptingly named macro "COLON" near the top of the file.  However,
> changing this #define will only break things, as the ":" character is
> hard-coded (without macro abstraction) in various other places -- most
> notably in the code that generates filenames for saved corpora.
>
> If there's a chance to get CQP to work reasonably well on Cygwin, I
> think it's worth reviewing the code to make the separator character
> configurable (or perhaps set it during the compilation, so it defaults
> to something else than ":" on Windows).  I'll have to go through the
> source code carefully to find out exactly where filenames are
> generated and parsed, and we'd need thorough beta testing on Cygwin.

Yeah, it looks like we need guinea pigs for a change. I don't work on
Windows much; I was just testing the new CQP API I'm developing (did I
mention I'm developing an API for CQP?) to see if it worked under
Windows (which, incidentally, it would if it didn't rely so heavily on
the "save" function...)

Anyway, as usual Cygwin is not a priority, but if you do find the time
to fix this, by all means let me know.
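
In case it helps, here is a rough sketch of the compile-time option you
describe. To be clear, none of this is taken from the CWB sources: the
macro SUBCORPUS_SEP_CHAR, the '~' fallback and both helper functions are
made-up names for illustration only, and a real fix would still have to
cover every place where ":" is hard-coded.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical macro (not from the CWB sources): the character joining
 * the mother corpus name and the subcorpus name in a saved-corpus
 * filename. The '~' fallback is an arbitrary placeholder. */
#ifndef SUBCORPUS_SEP_CHAR
#  if defined(__CYGWIN__) || defined(_WIN32)
#    define SUBCORPUS_SEP_CHAR '~'  /* ':' is taken by drive letters */
#  else
#    define SUBCORPUS_SEP_CHAR ':'
#  endif
#endif

/* Write "<mother><sep><sub>" into buf.
 * Returns 0 on success, -1 if the result did not fit. */
static int
make_saved_name(char *buf, size_t size, const char *mother, const char *sub)
{
  int n;

  assert(buf != NULL && mother != NULL && sub != NULL);
  n = snprintf(buf, size, "%s%c%s", mother, SUBCORPUS_SEP_CHAR, sub);
  return (n < 0 || (size_t) n >= size) ? -1 : 0;
}

/* Return a pointer to the subcorpus part of a saved-corpus filename
 * (the text after the first separator), or NULL if there is none. */
static const char *
subcorpus_part(const char *filename)
{
  const char *sep;

  assert(filename != NULL);
  sep = strchr(filename, SUBCORPUS_SEP_CHAR);
  return sep ? sep + 1 : NULL;
}
```

The point is only that generating and parsing go through the same
macro, so building with something like -DSUBCORPUS_SEP_CHAR="'_'" would
switch the character everywhere at once.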

Cheers,
Eros

