[CWB] encoding a corpus of Persian texts
Hardie, Andrew
a.hardie at lancaster.ac.uk
Tue Aug 28 11:49:40 CEST 2012
RE
>>> I tried to encode a Persian text with the instructions in CWB_Encoding_Tutorial.pdf.
>>> so I changed the file name extension to .vrt and .xml (extensions of the examples in tutorials).
The file extension is immaterial; .vrt is just a convention, nothing more.
What matters is the internal format of the file.
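To illustrate what "internal format" means here, the following is a minimal sketch, assuming the usual verticalised layout described in the encoding tutorial: one token per line, attributes separated by a TAB, and XML tags on lines of their own. The function name to_vrt, the file name demo.vrt, the sample sentence and the POS tags are all made up for illustration and do not come from the original poster's data.

```python
from xml.sax.saxutils import escape, quoteattr


def to_vrt(sentences, text_id):
    """Render tokenised sentences as verticalised text.

    sentences: list of sentences, each a list of (word, pos) pairs.
    text_id:   hypothetical identifier stored on the <text> tag.
    """
    lines = ["<text id=%s>" % quoteattr(text_id)]
    for sentence in sentences:
        lines.append("<s>")
        for word, pos in sentence:
            # One token per line, word and POS tag separated by a TAB.
            # Escaping &, < and > assumes the encoder is run in an
            # XML-aware mode; drop escape() if that does not apply.
            lines.append("%s\t%s" % (escape(word), escape(pos)))
        lines.append("</s>")
    lines.append("</text>")
    return "\n".join(lines) + "\n"


# Hypothetical sample: one Persian sentence with invented POS tags.
sample = [[("این", "DET"), ("کتاب", "N"), ("است", "V"), (".", "PUNC")]]
vrt = to_vrt(sample, "demo")

# Write as UTF-8 with Unix line endings; the extension is arbitrary.
with open("demo.vrt", "w", encoding="utf-8", newline="\n") as handle:
    handle.write(vrt)
```

A file produced this way could be named demo.vrt, demo.txt or anything else; only the line-by-line layout and the character encoding should matter to the encoder.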
RE
>>> Now the problem is this: Persian characters are not correctly displayed,
>>> and you can't type Persian and/or Arabic characters in the cmd command prompt.
>>> Is there a solution for this?
This is a limitation of the cmd terminal, I would imagine. Some users have had success
fixing this using the steps laid out here:
http://cwb.sourceforge.net/faq.php?hoist=windows_terminal#windows_terminal
Good luck!
best
Andrew.
-----Original Message-----
From: cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it] On Behalf Of S Mollaei
Sent: 28 August 2012 10:41
To: cwb
Subject: [CWB] encoding a corpus of Persian texts
I encoded the text, but it wasn't possible to search it (not even the POS tags). Then I used the .txt file, and after encoding I was able to search the corpus in the cmd command prompt using CQP.
Now the problem is this: Persian characters are not correctly displayed, and you can't type Persian and/or Arabic characters in the cmd command prompt. Is there a solution for this?