[CWB] cqpserver charset: Where can I set this variable?

Jörg Knappen j.knappen at mx.uni-saarland.de
Fri Apr 4 14:13:59 CEST 2014


Ah ... since I did not state it explicitly: I am using the corpora
exactly as they were indexed with cwb-3.0.0, without any changes.
Do the corpora need to be re-indexed?

--Jörg Knappen

Quoting Jörg Knappen <j.knappen at mx.uni-saarland.de>:

> I tried the most recent version (3.4.7 Revision 545). It compiled
> smoothly on my SLES 11 SP 2 (after installing a few *-devel packages,
> as usual in this setting).
>
> I started the cqpserver using the old initialisation file. This worked.
>
> However, my client (SRUCQIBridge from CLARIN) was unable to connect to
> cqpserver. Judging from the debug output, it looks like the login failed.
>
> I'll attach the debug output at the end of this mail.
>
> --Jörg Knappen
>
>
> This is the end of the output written to stderr with -d ALL. Note that
> username and password are both 'test' in my setup.
>
> [...]
> CQi: Connection established. Looking up client's name.
> CQi: ** new CQPserver created, initiating CQi session
> CQi: creating attribute hash (size = 16384)
> CQi: waiting for command
> CQi RECV BYTE 0x11
> CQi RECV BYTE 0x01
> CQi RECV BYTE 0x00
> CQi RECV BYTE 0x04
> CQi READ WORD   0004      [= 4]
> CQi RECV BYTE[4]
> CQi READ CHAR[] 'test'
> CQi RECV BYTE 0x00
> CQi RECV BYTE 0x04
> CQi READ WORD   0004      [= 4]
> CQi RECV BYTE[4]
> CQi READ CHAR[] 'test'
> CQi SEND WORD   0102      [= 258]
> CQi FLUSH
>
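
For anyone reading the trace above, here is a minimal Python sketch that
decodes the same bytes. It assumes what the trace itself suggests: a
16-bit big-endian command word followed by two length-prefixed strings.
The constant names in CQI_NAMES are given from memory of the CQi
specification and should be checked against the cqi.h header of the
installed CWB version; the helper functions are hypothetical and not
part of CWB.

```python
import struct

# Bytes received by the server, copied from the "CQi RECV BYTE" lines above.
# The two 4-byte payloads are the strings the trace prints as 'test'.
request = bytes([0x11, 0x01, 0x00, 0x04]) + b"test" + bytes([0x00, 0x04]) + b"test"

# Assumption: names recalled from the CQi specification, verify against cqi.h.
CQI_NAMES = {
    0x1101: "CQI_CTRL_CONNECT",
    0x0102: "CQI_STATUS_CONNECT_OK",
    0x0202: "CQI_ERROR_CONNECT_REFUSED",
}


def read_word(buf, pos):
    """Read one 16-bit big-endian word; return (value, next position)."""
    (value,) = struct.unpack_from(">H", buf, pos)
    return value, pos + 2


def read_string(buf, pos):
    """Read a length-prefixed string; return (text, next position)."""
    length, pos = read_word(buf, pos)
    return buf[pos:pos + length].decode("ascii"), pos + length


def decode_connect(buf):
    """Split a connect request into command word, user name and password."""
    command, pos = read_word(buf, 0)
    user, pos = read_string(buf, pos)
    password, pos = read_string(buf, pos)
    return command, user, password


command, user, password = decode_connect(request)
reply = 0x0102  # taken from "CQi SEND WORD 0102 [= 258]"

print(CQI_NAMES.get(command, hex(command)), user, password)
print(CQI_NAMES.get(reply, hex(reply)))
```

If the names in the table are right, the word 0x0102 sent by the server
would be a "connect OK" status rather than a refusal, so it may be worth
checking what the client does after the login before concluding that the
login itself failed.
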
>
> Quoting Stefan Evert <stefanML at collocations.de>:
>
>> On 2 Apr 2014, at 14:52, Jörg Knappen <j.knappen at mx.uni-saarland.de> wrote:
>>
>>> # corpus properties provide additional information about the corpus:
>>> ##:: charset  = "latin2" # character encoding of corpus data
>>> ##:: language = "pl"     # insert ISO code for language (de, en, fr, ...)
>>>
>>> However, the cqpserver still claims (verified using -d ALL) that the corpus
>>> is encoded in "latin1". It should announce "latin2" here ...
>>
>> Yes, the backend function is just a dummy.
>>
>>> Where does the cqpserver take the character set from, and how can  
>>> I modify this?
>>
>> This has been fixed in the current beta version (3.4 series).  We
>> didn't backport the change, since CWB 3.0 doesn't properly support
>> charsets other than latin1 anyway.
>>
>> You should consider upgrading to CWB 3.4; you'll probably have to  
>> compile it from the SVN repository in order to get a working  
>> CQI_CORPUS_CHARSET, since the patch was only added this February.
>>
>> Best,
>> Stefan
>> _______________________________________________
>> CWB mailing list
>> CWB at sslmit.unibo.it
>> http://devel.sslmit.unibo.it/mailman/listinfo/cwb
>
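
To double-check which character set a registry file declares,
independently of what cqpserver announces, the "##::" property lines
quoted above can be read with a few lines of Python. This is only a
sketch: the regular expression and the function name are hypothetical
and not CWB's own parser, and the fallback to "latin1" when no charset
property is present is an assumption about CWB's default.

```python
import re

# The property lines from the registry file, as quoted in this thread.
REGISTRY_SNIPPET = '''\
# corpus properties provide additional information about the corpus:
##:: charset  = "latin2" # character encoding of corpus data
##:: language = "pl"     # insert ISO code for language (de, en, fr, ...)
'''

# Hypothetical pattern: "##::", a property name, "=", a double-quoted value.
PROPERTY_RE = re.compile(r'^##::\s*(\w+)\s*=\s*"([^"]*)"')


def corpus_properties(text):
    """Collect the '##:: name = "value"' lines of a registry file in a dict."""
    props = {}
    for line in text.splitlines():
        match = PROPERTY_RE.match(line)
        if match:
            props[match.group(1)] = match.group(2)
    return props


props = corpus_properties(REGISTRY_SNIPPET)
# Assumption: a corpus without a charset property is treated as latin1.
print(props.get("charset", "latin1"))
```

For a real corpus, the text of the registry file would be passed in
place of REGISTRY_SNIPPET.
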
>
>




