[CWB] encoding a corpus of Persian texts

mdecorde matthieu.decorde at ens-lyon.fr
Wed Aug 29 08:33:25 CEST 2012


Hi,

The command-line terminal is not the only way to access CQP.
TXM is another: https://sourceforge.net/projects/txm

It includes a GUI for the full Unicode CQP version 3.4.1 for
Windows, Mac OS X and Linux.

It is beginning to provide some support for right-to-left (RTL) writing
systems; see the "arabic language" entry of the TXM FAQ for details (in French):
https://groupes.renater.fr/wiki/txm-users/public/faq#txm_peut_il_traiter_des_corpus_de_textes_arabes

Best,
the Textometry team

On Tuesday, 28 August 2012 at 14:11 +0430, S Mollaei wrote:
> I tried to encode a Persian text following the instructions in
> CWB_Encoding_Tutorial.pdf. So I changed the file name extension to
> .vrt and .xml (the extensions of the examples in the tutorials).
> I encoded the text, but it wasn't possible to search it (not even the POS
> tags). Then I used the .txt extension, and after encoding I was able to
> search the corpus in the cmd command prompt using CQP.
> 
> Now the problem is this: Persian characters are not correctly
> displayed, and you can't type Persian and/or Arabic characters in the cmd
> command prompt. Is there a solution for this?



