[CWB] Giza ++ and CQPWeb

Stefan Evert stefanML at collocations.de
Mon Dec 12 11:17:54 CET 2016


> On 24 Nov 2016, at 16:55, Annarita Felici <Annarita.Felici at unige.ch> wrote:
> 
> I am planning to build a bidirectional parallel corpus of legal texts (German-Italian and Italian-German).  For the alignment I was thinking of using GIZA++, but before embarking on this, I would like to know whether I can later import a GIZA++ alignment into CQPweb.

CWB alignment attributes are designed for sentence-level alignment, not for word alignment.  Depending on how complex your word alignment is, the CWB mechanism could in theory be abused to store the alignment information, but its use in CWB would be severely limited and CQPweb would not be able to display it properly.

Are you sure that the automatic word alignment will be good enough for use in corpus searches?  If you're satisfied with a simple 1-to-1 alignment, you could possibly encode just the aligned words in a p-attribute and display it as a "gloss" in CQPweb.
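
To make the "gloss" idea a bit more concrete, here is a rough sketch, not something tested against real GIZA++ output.  It assumes the word alignment has already been converted to one line of 0-based "i-j" index pairs per sentence (source index, then target index); that input format and all function names are my own assumptions for illustration.  The script keeps only the strictly 1-to-1 links and writes a vertical file with one token per line and the aligned word in a second, tab-separated column.

```python
from collections import Counter


def parse_links(link_line):
    """Parse a string such as '0-0 1-2' into a list of (source, target) index pairs."""
    return [tuple(int(x) for x in pair.split("-")) for pair in link_line.split()]


def one_to_one(links):
    """Keep only links whose source index and target index each occur exactly once."""
    src_count = Counter(i for i, _ in links)
    tgt_count = Counter(j for _, j in links)
    return {i: j for i, j in links if src_count[i] == 1 and tgt_count[j] == 1}


def to_vrt(src_tokens, tgt_tokens, link_line, empty="_"):
    """Return one sentence in vertical format: word <TAB> gloss, wrapped in <s> tags.

    Tokens without a 1-to-1 link get the placeholder `empty` as their gloss.
    """
    mapping = one_to_one(parse_links(link_line))
    lines = ["<s>"]
    for i, word in enumerate(src_tokens):
        j = mapping.get(i)
        gloss = tgt_tokens[j] if j is not None else empty
        lines.append(f"{word}\t{gloss}")
    lines.append("</s>")
    return "\n".join(lines)


if __name__ == "__main__":
    german = ["Das", "Gericht", "entscheidet"]
    italian = ["Il", "tribunale", "decide"]
    print(to_vrt(german, italian, "0-0 1-1 2-2"))
```

A file like this could then be encoded with cwb-encode, declaring the second column as an additional p-attribute (e.g. -P gloss; the attribute name is arbitrary).  Anything that is not a clean 1-to-1 link is simply dropped here, which is exactly the limitation mentioned above.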

Best,
Stefan

