[CWB] how to escape special characters in CQPweb

Hardie, Andrew a.hardie at lancaster.ac.uk
Fri Jun 14 21:36:36 CEST 2013


This is not a CQPweb thing but a general XML thing. To escape quote marks within an XML attribute value, you need to use the XML entity &quot; in place of each quote mark.

C escapes won't work at all in XML.
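
If the corpus files are generated by a script, the escaping can be done at that stage. The following is only a minimal sketch in Python, using the standard library's xml.sax.saxutils.escape; the helper name s_tag is hypothetical, and the sketch assumes the <s> tag carries just the one cn attribute shown in the example below.

```python
from xml.sax.saxutils import escape


def s_tag(translation: str) -> str:
    """Build an opening <s> tag whose cn attribute holds the translation.

    s_tag is a hypothetical helper, not part of CWB or CQPweb.
    """
    # escape() replaces &, < and > by default; the extra mapping also
    # replaces the double quote, which cannot appear raw inside an
    # attribute value delimited by double quotes.
    value = escape(translation, {'"': "&quot;"})
    return f'<s cn="{value}">'


# The attribute value from the example in the quoted message.
print(s_tag('"亚洲"的未来'))
# prints: <s cn="&quot;亚洲&quot;的未来">
```

The ampersand is replaced before the quote mark, so a translation that already contains a literal & will come out as &amp; rather than producing a broken entity.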

Best

Andrew.



Ray Wu <liangpingwu at 126.com> wrote:


hi all,

I'm preparing a parallel corpus for CQPweb. All things went well until I hit upon the double quotes.

OK, I have a corpus like this, using \ (as in C) to escape the quotation marks in the translation:
<text id="test">
<s cn="\"亚洲\"的未来">
The    AT
Future    NN1
of    IO
Asia    NP1
</s>
</text>

When concordancing the corpus, I got the following:
The<http://124.193.83.252/cqp/paratest/context.php?batch=0&qname=e5o3dvdevx&uT=y> Future of Asia
 \

It seems that everything after the second quotation mark was silently ignored. However, if I change the input to use single quotes, like this: <s cn="'亚洲'的未来">, I get
The<http://124.193.83.252/cqp/paratest/context.php?batch=0&qname=e5o3dvdevx&uT=y> Future of Asia
'亚洲'的未来

This is better, but at the cost of altering the original text. Does anyone know how to properly escape special characters such as quotation marks in CQPweb? Thanks.

Ray

