[CWB] encoding a corpus of Persian texts
S Mollaei
sf.mollaei at gmail.com
Tue Aug 28 11:41:07 CEST 2012
I tried to encode a Persian text following the instructions in
CWB_Encoding_Tutorial.pdf, so I changed the file name extension to
.vrt and .xml (the extensions used by the examples in the tutorial).
I encoded the text, but it wasn't possible to search it (not even for
the POS tags). Then I used the .txt file, and after encoding I was able
to search the corpus in the cmd command prompt using CQP.
Now the problem is this: Persian characters are not displayed
correctly, and you can't type Persian and/or Arabic characters in the
cmd command prompt. Is there a solution for this?
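
In case it helps to narrow the problem down, here is a minimal sketch of how the input file's character encoding could be checked before encoding the corpus. It assumes the text is meant to be UTF-8 and that Persian letters fall in the Unicode Arabic block (U+0600 to U+06FF). The function name describe_encoding and the file name in the usage note are hypothetical, not part of CWB.

```python
def describe_encoding(raw: bytes) -> dict:
    """Report whether raw bytes are valid UTF-8 and how many
    Arabic-script characters (U+0600..U+06FF) they contain.

    Hypothetical helper for illustration; not part of CWB.
    """
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # The bytes are in some other encoding (e.g. a legacy code page).
        return {"valid_utf8": False, "has_bom": False, "arabic_script_chars": 0}

    # A leading byte-order mark decodes to U+FEFF; some tools treat it
    # as part of the first token, so it is worth reporting.
    has_bom = text.startswith("\ufeff")

    # Count characters in the basic Arabic block, which covers Persian letters.
    arabic_count = sum(1 for ch in text if "\u0600" <= ch <= "\u06ff")

    return {
        "valid_utf8": True,
        "has_bom": has_bom,
        "arabic_script_chars": arabic_count,
    }
```

To run it on a file, read the file in binary mode, for example describe_encoding(open("corpus.txt", "rb").read()), where corpus.txt stands in for the actual input file. If valid_utf8 comes back False, the file is stored in some other encoding, which would be one possible source of garbled output; it does not by itself address the display or keyboard input limits of the command prompt.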