<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p><font size="+1">Thanks, Stefan!</font><br>
</p>
<pre class="moz-signature" cols="72">José Manuel Martínez Martínez
<a class="moz-txt-link-freetext" href="https://chozelinek.github.io">https://chozelinek.github.io</a></pre>
<div class="moz-cite-prefix">On 02.08.19 10:34, Stefan Evert wrote:<br>
</div>
<blockquote type="cite"
cite="mid:D8A788C1-D77A-4CF9-B2C4-E8E7D599BF82@collocations.de">
<pre class="moz-quote-pre" wrap="">
</pre>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">I have a question for those with experience with parallel corpora. Say I've spotted a mistake in the alignment of one text in a parallel corpus. Is it possible to import the correct alignment for just that text using cwb-align-import, or do I have to import the alignment for all the texts in the corpus?
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">
CWB annotation (including alignments) cannot be updated once it has been encoded. You will have to fix the error in the alignment source and then re-encode the complete alignment attribute.
(An exception to this rule is that cwb-s-encode allows you to update s-attributes, but that simply means it automatically merges the new data and re-encodes the attributes.)
</pre>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">Is it possible to dump the alignments already encoded somehow?
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">
At a low level, you can use cwb-align-decode to dump the alignment attribute as a sequence of region pairs. Then edit the file manually to adjust the corpus positions of the incorrect alignment and re-encode with cwb-align-encode, overwriting the previous data in the corpus.
Alternatively, use cwb-align-export from the CWB/Perl package to export the alignment in terms of sets of sentence IDs. Read the manpage (perldoc cwb-align-export) and work out how to construct appropriate sentence IDs that correspond to your original input file. After manually correcting the errors, you should be able to re-encode the alignment with cwb-align-import.
</pre>
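<p>The manual editing step in the low-level route can be error-prone, so here is a minimal Python sketch of how a corrected span of region pairs might be swapped in and sanity-checked before re-encoding. It assumes each dumped line holds four tab-separated corpus positions (source start, source end, target start, target end), possibly followed by extra columns that are kept verbatim; the real dump may have a header line or a different layout, so check the tool's documentation first. It also assumes that regions must be in ascending order and must not overlap or cross on either side. The names Bead, parse_bead, format_bead, check_monotonic and replace_beads are hypothetical helpers, and the corpus positions are made up.</p>
<pre class="moz-quote-pre" wrap="">
```python
from typing import List, NamedTuple, Tuple


class Bead(NamedTuple):
    """One aligned region pair as inclusive corpus positions (hypothetical type)."""
    src_start: int
    src_end: int
    tgt_start: int
    tgt_end: int
    extra: Tuple[str, ...] = ()  # any further columns, kept verbatim


def parse_bead(line: str) -> Bead:
    """Parse one tab-separated line; the four-integer layout is an assumption."""
    fields = line.rstrip("\n").split("\t")
    start1, end1, start2, end2 = (int(f) for f in fields[:4])
    return Bead(start1, end1, start2, end2, tuple(fields[4:]))


def format_bead(bead: Bead) -> str:
    """Turn a bead back into a tab-separated line."""
    return "\t".join([str(n) for n in bead[:4]] + list(bead.extra))


def check_monotonic(beads: List[Bead]) -> None:
    """Raise ValueError if a region is inverted or two neighbours overlap or cross."""
    for bead in beads:
        if bead.src_start > bead.src_end or bead.tgt_start > bead.tgt_end:
            raise ValueError(f"inverted region: {bead}")
    for prev, cur in zip(beads, beads[1:]):
        if prev.src_end >= cur.src_start or prev.tgt_end >= cur.tgt_start:
            raise ValueError(f"overlapping or crossing beads: {prev} / {cur}")


def replace_beads(beads, first_src, last_src, corrected):
    """Replace every bead whose source region lies inside [first_src, last_src]."""
    kept = [b for b in beads if first_src > b.src_start or b.src_end > last_src]
    merged = sorted(kept + list(corrected))
    check_monotonic(merged)
    return merged


# Made-up dump of four beads; the middle two are assumed to be misaligned.
dump = ["0\t4\t0\t5\t1:1", "5\t9\t6\t8\t1:1", "10\t14\t9\t15\t1:1", "15\t20\t16\t22\t1:1"]
beads = [parse_bead(line) for line in dump]
fixed = replace_beads(beads, 5, 14, [Bead(5, 9, 6, 11, ("1:1",)), Bead(10, 14, 12, 15, ("1:1",))])
for bead in fixed:
    print(format_bead(bead))
```
</pre>
<p>Only the beads inside the given source range are touched; everything else is passed through unchanged, and the check fails loudly if a correction would collide with a neighbouring bead. The resulting lines would still have to be written back to a file and re-encoded, after making a backup of the corpus.</p>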
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">The thing is that the alignments used to create the file imported by cwb-align-import do not exist anymore. I'd like to avoid realigning the whole corpus, just to fix a few errors.
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">
In that case, I hope that you're going to make a backup copy of the corpus before fiddling with the alignment …
Best,
Stefan
_______________________________________________
CWB mailing list
<a class="moz-txt-link-abbreviated" href="mailto:CWB@sslmit.unibo.it">CWB@sslmit.unibo.it</a>
<a class="moz-txt-link-freetext" href="http://liste.sslmit.unibo.it/mailman/listinfo/cwb">http://liste.sslmit.unibo.it/mailman/listinfo/cwb</a>
</pre>
</blockquote>
</body>
</html>