[CWB] cwb-align-import help.
Alberto Simões
ambs at di.uminho.pt
Mon Feb 8 17:19:25 CET 2010
Hi
Supposedly, cwb-align-import can be used to import pre-aligned corpora.
Unfortunately, the documentation is sparse and I can't figure out how to
use it.
As far as I can tell, I need to import the source and target languages
as separate corpora.
I used the <tu> tag to delimit translation units on each side.
Therefore, I have the same number of translation units on each side.
I just do not understand how to write the alignment_beads.txt file.
I assume I need a simple file, something like
1:1
2:2
3:3 (or whatever syntax).
Also, I am not sure whether I need to add attributes to my <tu> tags so
that each one has a number associated with it.
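To make the question concrete, here is a minimal Python sketch of what I have in mind: number each <tu> region and write out the 1:1 pairs. The id attribute name, the helper names number_tu_tags and one_to_one_beads, and the "N:N" line syntax are all assumptions for illustration. None of them is confirmed as what cwb-align-import actually expects.

```python
def number_tu_tags(lines):
    """Rewrite each bare <tu> line as <tu id="N">, counting from 1.

    The attribute name "id" is a hypothetical choice; other lines,
    including </tu>, are passed through unchanged.
    Returns the rewritten lines and the number of translation units.
    """
    out = []
    count = 0
    for line in lines:
        if line.strip() == "<tu>":
            count += 1
            out.append(f'<tu id="{count}">')
        else:
            out.append(line)
    return out, count


def one_to_one_beads(n):
    """List n one-to-one pairs in the guessed "N:N" syntax.

    This is a placeholder format, not the real alignment file format.
    """
    return [f"{i}:{i}" for i in range(1, n + 1)]


if __name__ == "__main__":
    source = ["<tu>", "Hello", "</tu>", "<tu>", "World", "</tu>"]
    numbered, total = number_tu_tags(source)
    print("\n".join(numbered))
    print("\n".join(one_to_one_beads(total)))
```

Since both sides have the same number of translation units, the same count would be used for the source and the target corpus.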
By the way, does the -inverse option import both alignments
(source-target and target-source), or just the second?
I think I need to use:
cwb-align-import -l1 sourceCorpus -l2 targetCorpus -s tu
but I have no idea what to use for -k (is it needed?) or what the
contents of the alignment_beads.txt file should be.
Thanks
Alberto
--
Alberto Simões