[CWB] Character encoding revisited
Josep M. Fontana
josepm.fontana at upf.edu
Wed Jun 25 20:44:17 CEST 2014
I'm not sure I follow you. I'm not using PuTTY, since I don't have
Windows. From what you are saying, though, I infer that the problem you
had was with how the characters were displayed in the terminal, right?
That is not my problem, though: when I transfer the text file to my
local computer, I can see that the encoding of the file itself is wrong.
Thanks for your help, anyway. Gràcies!
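To separate a display problem from a problem in the file itself, one option is to inspect the raw bytes of the exported file. The following is only a minimal Python sketch, not anything provided by CWB: the function name classify_encoding is hypothetical, and the three labels are a rough approximation of what the 'file' command reports (ASCII when there are no non-ASCII bytes at all, and something like ISO-8859 when there are high bytes that do not form valid UTF-8).

```python
def classify_encoding(data: bytes) -> str:
    """Roughly classify raw bytes as 'ascii', 'utf-8', or 'not-utf-8'.

    Hypothetical helper for illustration; it only tests whether the
    bytes decode cleanly, it does not guess the actual legacy encoding.
    """
    try:
        # Pure ASCII is also valid UTF-8, so an "ASCII" report is harmless.
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        # Non-ASCII bytes that form valid UTF-8 sequences.
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # High bytes that are not valid UTF-8, e.g. Latin-1 output.
        return "not-utf-8"
```

Reading the exported file in binary mode (open(path, "rb").read()) and passing the result to this function would show whether the bytes on disk are already in a non-UTF-8 encoding. If the result is 'ascii', the file simply contains no accented characters, which would explain why 'file' sometimes reports ASCII.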
> Hi, Josep M.,
> I'm not sure if this will help you, but I had a similar problem. When
> opening PuTTY, before logging in, I manually changed the character
> encoding in the 'Translation' panel (inside 'Window'), and
> everything worked perfectly after that. Good luck!
> On 25/06/2014, at 18:41, Josep M. Fontana <josepm.fontana at upf.edu>
> wrote:
>> Our corpus is encoded in UTF-8, but when I create a text file with the
>> output of a search I get the typical odd characters one gets when
>> the conversion has gone wrong. I used the 'file' command and saw
>> that the text files are sometimes reported as ISO-8859 and other
>> times as ASCII. Is there any way to configure things so that the UTF-8
>> character set is preserved? Thanks.
>> Josep M.
>> CWB mailing list
>> CWB at sslmit.unibo.it <mailto:CWB at sslmit.unibo.it>