[CWB] cwb-align-import help.

Alberto Simões ambs at di.uminho.pt
Mon Feb 8 17:19:25 CET 2010


Hi

cwb-align-import can supposedly be used to import pre-aligned corpora.
Unfortunately, the documentation is sparse and I can't figure out how to
use it.

As far as I can tell, I need to import the source and target languages
as two distinct corpora.

I used the <tu> tag to separate translation units on each side, so
both sides have the same number of translation units.

What I do not understand is how to write the alignment_beads.txt file.
I presume it is a simple file, something like
  1:1
  2:2
  3:3  (or whatever syntax).
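
To make that guess concrete, here is a minimal Python sketch that would
produce such lines for a given number of translation units. The "i:i"
syntax is purely my guess from above, not the documented
cwb-align-import input format, and the function name is made up for
illustration.

```python
def guessed_bead_lines(n_units):
    """Return one 'i:i' line per translation unit, numbered from 1.

    ASSUMPTION: the 'i:i' syntax is only the guess made in this message;
    it is not the documented cwb-align-import input format.
    """
    return [f"{i}:{i}" for i in range(1, n_units + 1)]


if __name__ == "__main__":
    # Print the guessed file contents for three translation units.
    print("\n".join(guessed_bead_lines(3)))
```

If the real format turns out to need identifiers or a header line, only
the string built inside the list comprehension would have to change.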

Also, I am not sure whether I need to add attributes to my <tu> tags so
that each one has a number associated with it.

By the way, does the -inverse option import both alignments
(source-target and target-source), or just the second one?

I think I need to run something like:

 cwb-align-import -l1 sourceCorpus -l2 targetCorpus -s tu

but I have no idea what to pass to -k (is it needed at all?) or what the
contents of the alignment_beads.txt file should be.

Thanks
Alberto

-- 
Alberto Simões

