[CWB] A question about the aligning using cwb-encoding
Munich LEE
leemh at yonsei.ac.kr
Mon Jan 27 00:15:25 CET 2014
Hi,
I am building an English-Korean bilingual corpus using cwb-align-encode.
I have encoded both corpora and aligned them.
At first it seemed to work.
However, when I checked the search results, I found a problem.
The first few sentences were paired correctly,
but the rest were not.
It seems to be related to the statistical alignment process.
I actually prepared the two corpora so that both sentences of each pair have the same sentence ID, in order to avoid failures of the statistical alignment.
I am working with 60000 sentences. I aligned all of them manually and stored the information in the XML tag "s_id".
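To illustrate the kind of ID-based pairing described above, here is a minimal sketch in Python. It is only an illustration under assumptions: the `<s id="...">` layout (one token per line, one sentence per `<s>` region), the sample sentences, and the helper names `sentences_by_id` and `pair_by_id` are all hypothetical and are not taken from the actual corpus or from CWB itself.

```python
import re

# Assumed input: vertical text, one token per line, each sentence wrapped in
# <s id="..."> ... </s>. This tag layout is an assumption for illustration.
S_OPEN = re.compile(r'<s\s+id="([^"]+)">')


def sentences_by_id(vrt_text):
    """Return {sentence id: list of tokens} for one corpus."""
    sentences = {}
    current = None
    for line in vrt_text.splitlines():
        line = line.strip()
        match = S_OPEN.fullmatch(line)
        if match:
            # A new sentence region starts; remember its id.
            current = match.group(1)
            sentences[current] = []
        elif line == "</s>":
            current = None
        elif line and current is not None:
            sentences[current].append(line)
    return sentences


def pair_by_id(source, target):
    """Pair sentences sharing an id; also list ids found on one side only."""
    pairs = [(sid, source[sid], target[sid]) for sid in source if sid in target]
    unmatched = sorted(set(source) ^ set(target))
    return pairs, unmatched


# Made-up sample data, not from the real corpus.
english = '<s id="1">\nHello\nworld\n</s>\n<s id="2">\nThank\nyou\n</s>\n'
korean = '<s id="1">\n안녕\n세상\n</s>\n<s id="3">\n감사합니다\n</s>\n'

pairs, unmatched = pair_by_id(sentences_by_id(english), sentences_by_id(korean))
```

A check like this only confirms that the IDs on both sides match up one to one; it does not by itself make the aligner use them.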
My question is: how can I make use of the manually created XML tag "s_id" for the alignment?
Could anyone help me?
I would appreciate your support.
Thanks.
Munich