[CWB] [ cwb-Feature Requests-2806335 ] CQPweb: non-UTF-8 (ie Latin-1 and other 8-bit codepages)

SourceForge.net noreply at sourceforge.net
Mon Jun 15 01:48:49 CEST 2009


Feature Requests item #2806335, was opened at 2009-06-14 23:48
Message generated for change (Tracker Item Submitted) made by andrewhardie
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=722306&aid=2806335&group_id=131809

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: CQPweb
Group: None
Status: Open
Priority: 5
Private: No
Submitted By: Andrew Hardie (andrewhardie)
Assigned to: Andrew Hardie (andrewhardie)
Summary: CQPweb: non-UTF-8 (ie Latin-1 and other 8-bit codepages)

Initial Comment:
This is best accomplished by an optional filter at the CQP interface level. When enabled, it converts the UTF-8 input from the web scripts to ISO-8859 input for CQP, then converts the strings returned from CQP back to UTF-8. When disabled, text passes through unchanged.

This would need to be governed by a per-corpus setting that is passed to the CQP class's __construct method and stored as part of the object's setup. CQP::execute() (and any other method that passes text back and forth) would then check that setting.

Every script's call to CQP::__construct() would have to be modified for this.
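
A minimal sketch of the filter idea, written in Python purely for illustration (CQPweb itself is PHP, and none of these names exist in its code base). The class name CQPCharsetFilter, the send callable standing in for the pipe to the CQP process, and the corpus_charset parameter are all assumptions. None is taken to mean the corpus is already UTF-8, so text passes through untouched.

```python
from typing import Callable, Optional


class CQPCharsetFilter:
    """Hypothetical stand-in for the CQP interface class (not real CQPweb code)."""

    def __init__(self, send: Callable[[bytes], bytes],
                 corpus_charset: Optional[str] = None) -> None:
        # send: assumed transport to the CQP process (bytes in, bytes out).
        # corpus_charset: assumed per-corpus setting, e.g. "iso-8859-1".
        # None means the corpus is UTF-8 and no filtering is applied.
        self._send = send
        self._charset = corpus_charset

    def execute(self, command_utf8: bytes) -> bytes:
        """Send a UTF-8 command to CQP and return the reply as UTF-8."""
        if self._charset is None:
            # Filter switched off: pass bytes through unchanged.
            return self._send(command_utf8)
        # UTF-8 from the web scripts -> 8-bit codepage for CQP.
        # Raises UnicodeEncodeError if a character has no 8-bit equivalent.
        outgoing = command_utf8.decode("utf-8").encode(self._charset)
        reply = self._send(outgoing)
        # 8-bit codepage from CQP -> UTF-8 for the web scripts.
        return reply.decode(self._charset).encode("utf-8")


if __name__ == "__main__":
    def echo(data: bytes) -> bytes:
        # Fake CQP process that simply echoes its input.
        return data

    cqp = CQPCharsetFilter(echo, corpus_charset="iso-8859-1")
    print(cqp.execute("café".encode("utf-8")).decode("utf-8"))
```

The sketch leaves open what should happen when a query contains a character that the corpus codepage cannot represent; here the conversion simply raises an error, and a real implementation would need to pick a policy for that case.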

----------------------------------------------------------------------


