[CWB] encoding a corpus of Persian texts

S Mollaei sf.mollaei at gmail.com
Tue Aug 28 11:41:07 CEST 2012


I tried to encode a Persian text following the instructions in
CWB_Encoding_Tutorial.pdf, so I changed the file name extension to
.vrt and .xml (the extensions used by the examples in the tutorial).
The text was encoded, but I could not search it (not even the POS
tags). Then I used the .txt file instead, and after encoding I was able
to search the corpus in the cmd command prompt using CQP.

Now the problem is this: Persian characters are not displayed
correctly, and I can't type Persian and/or Arabic characters in the cmd
command prompt. Is there a solution for this?

