[CWB] CWB 3.2 beta testing the windows version

Hardie, Andrew a.hardie at lancaster.ac.uk
Sat Oct 30 01:49:05 CEST 2010


Hi Gertrud,

Issue 1 seems to be a problem with the Windows "more" command itself when it runs under code page 65001 (UTF-8). Witness what happened when I ran more on its own, without cqp:

=================
H:\>chcp 65001
Active code page: 65001

H:\>more
Not enough memory.

H:\>
=================

So "more" simply doesn't seem to work in UTF-8 mode. Googling around suggests that a lot of Windows command-line programs have this problem:
http://blogs.msdn.com/b/michkap/archive/2006/03/06/544251.aspx
(see 2nd comment)

Suggested workaround: install less for Windows and use that as the pager instead.
http://gnuwin32.sourceforge.net/packages/less.htm
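
Once less is on the PATH, cqp still has to be told to use it. A rough sketch of a launcher, assuming cqp accepts "-P <pager>" to select the pager (please check this against "cqp -h" on your installation; the function name build_cqp_command is made up for illustration):

```python
import shutil


def build_cqp_command(candidates=("less",), which=shutil.which):
    """Return an argument list that starts cqp with the first pager found.

    ASSUMPTION: cqp takes "-P <pager>" to choose the pager program.
    If none of the candidates is on PATH, cqp is started without -P,
    so it falls back to whatever pager it would use by default.
    """
    for name in candidates:
        if which(name) is not None:
            return ["cqp", "-e", "-P", name]
    return ["cqp", "-e"]
```

The resulting list can be handed to subprocess.run() from a console that has already been switched to code page 65001.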

Or, alternatively, try running CQP from within Windows PowerShell rather than cmd.exe. I would expect PowerShell to have better Unicode support.
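
As for the garbled characters in your first try: that is what one would expect when the UTF-8 bytes from the corpus are interpreted in the console's legacy code page. A small illustration, assuming the console was in code page 850 (my guess at the default on a Western European Windows) and using "š" purely as a hypothetical example character, not one taken from your corpus:

```python
def misread(text, console_codepage="cp850"):
    """Encode text as UTF-8, then decode those bytes as a legacy code page.

    This mimics a console that receives UTF-8 output but renders it
    byte by byte using its own code page (cp850 is an assumption here).
    """
    return text.encode("utf-8").decode(console_codepage, errors="replace")


# "š" is the two bytes C5 A1 in UTF-8; cp850 renders them as "┼" and "í".
# misread("š") -> "┼í"
```

After "chcp 65001" the console decodes the same bytes as UTF-8, which would explain why the unpaged results display correctly.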


Issue 2 may be a more serious problem. Could you send me (off-list) a screenshot?

Thanks

Andrew.


-----Original Message-----
From: cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it] On Behalf Of Gertrud Faasz
Sent: 28 October 2010 16:28
To: cwb at sslmit.unibo.it
Subject: [CWB] CWB 3.2 beta testing the windows version

Dear developers,
This is not a big thing, but I would like to report the following,
which happens with a CWB installation on Windows 7.
If you know about a work-around, please let me know.

Issue 1:
1st try: start cqp -e with a small UTF-8 corpus and run a simple query

Problem: the query works fine; however, UTF-8 characters do not display
correctly (some other character, such as í, appears instead)

2nd try: change the terminal code page with "chcp 65001", start cqp again,
and run the same query:

The message

"Nicht genügend Arbeitsspeicher. [German for "Not enough memory."]
Warning: Could not start pager 'more'. Paging disabled."

appears, followed by the results (not paged); these, however, are now
displayed correctly.
---
Issue 2:
Next, I tried the "group Last match" command: special characters are not
displayed (just empty boxes appear)
---

The problem is 100% reproducible; I tried the settings several times
and in different variations. I can provide screenshots if that helps;
please let me know. And please note: there are no such problems when
working with Latin-1 corpora. If I enter "chcp 1252" before starting cqp,
paging and the display of special Latin-1 characters work as expected.

Sorry for the trouble & kind regards

Gertrud





_______________________________________________
CWB mailing list
CWB at sslmit.unibo.it
http://devel.sslmit.unibo.it/mailman/listinfo/cwb

