[CWB] copying alignments

Stefan Evert stefanML at collocations.de
Wed Dec 16 09:31:43 CET 2015


> I have the same corpus two times, with different annotations, including different alignments.

To be clear, these are two versions of the same corpus with exactly the same tokenization, but different annotations?  Then it should be ok simply to copy the alignment files.

> Now I wanted to copy an alignment from one to the other, i.e., add it to the other corpus. What I did is I simply copied the rng and alx files, and added a line to the registry (ALIGNED corpus).

The alignment attribute consists only of the .alx file, so you don't need to copy any .rng files unless you're also using them for some other purpose.

> However, that did not work. Would it be expected to? How could one do such a thing? It would save me quite a bit of time.

What does "cwb-describe-corpus -s" show on the second corpus?  And what exactly do you mean by "did not work"?

As a wild guess, I'd suppose that the second corpus is aligned to a different version of the target corpus under a different CWB name.  Then you have to adjust the name of the alignment attribute and the corresponding index file accordingly, of course.
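
If it does come down to renaming, the copy step could be scripted roughly as follows. This is only a sketch under assumptions: that the index file is named after the alignment attribute with an .alx extension, and that the registry declaration is a plain "ALIGNED <name>" line as you describe. The function name, directory arguments and corpus names are made up for illustration.

```python
import shutil
from pathlib import Path


def copy_alignment(src_data, dst_data, old_target, new_target, registry_file):
    """Copy <old_target>.alx into dst_data as <new_target>.alx and declare it.

    Assumes (not verified against any particular CWB version) that the
    index file is called <attribute>.alx and that the registry declares
    the attribute with a line of the form "ALIGNED <attribute>".
    """
    src = Path(src_data) / f"{old_target}.alx"
    dst = Path(dst_data) / f"{new_target}.alx"
    shutil.copyfile(src, dst)  # raises FileNotFoundError if the source index is missing

    registry = Path(registry_file)
    declaration = f"ALIGNED {new_target}"
    lines = registry.read_text().splitlines()
    # Append the declaration only if the registry does not already contain it.
    if declaration not in (line.strip() for line in lines):
        lines.append(declaration)
        registry.write_text("\n".join(lines) + "\n")
    return dst
```

Running "cwb-describe-corpus -s" afterwards is a quick way to check whether the new attribute shows up.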

Generally, a safer strategy is to cwb-align-decode the alignment attribute, change the corpus names in the header line, and cwb-align-encode it for the new corpus.
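
The header edit in the middle of that procedure can be done by hand, but a small script helps if you have many files. The sketch below assumes the decoded text file starts with a single header line of four whitespace-separated fields (source corpus, source region attribute, target corpus, target region attribute), followed by the alignment beads; please check this against your own file first. The function name and the corpus names in the example are hypothetical.

```python
def retarget_alignment(text, new_source=None, new_target=None):
    """Return the decoded alignment text with the corpus names in its header replaced.

    Assumes the first line has four whitespace-separated fields:
    source corpus, source attribute, target corpus, target attribute.
    All following lines are passed through unchanged.
    """
    header, newline, body = text.partition("\n")
    fields = header.split()
    if len(fields) != 4:
        raise ValueError(f"unexpected header line: {header!r}")
    source, source_attr, target, target_attr = fields
    new_header = "\t".join(
        [new_source or source, source_attr, new_target or target, target_attr]
    )
    return new_header + newline + body


# Example with made-up corpus names and a single made-up alignment line.
sample = "CORPUS_A\ts\tCORPUS_B\ts\n0\t10\t0\t12\t1:1\n"
print(retarget_alignment(sample, new_source="CORPUS_A2"), end="")
```

The rewritten text is what you would then pass to cwb-align-encode for the new corpus; I have left out the command lines because the options may differ between CWB versions.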

Best,
Stefan




