<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:Verdana;
        panose-1:2 11 6 4 3 5 4 4 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0cm;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;
        color:black;}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
        {mso-style-priority:34;
        margin-top:0cm;
        margin-right:0cm;
        margin-bottom:0cm;
        margin-left:36.0pt;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;
        color:black;}
p.msonormal0, li.msonormal0, div.msonormal0
        {mso-style-name:msonormal;
        mso-margin-top-alt:auto;
        margin-right:0cm;
        mso-margin-bottom-alt:auto;
        margin-left:0cm;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;
        color:black;}
span.EmailStyle18
        {mso-style-type:personal-reply;
        font-family:"Verdana",sans-serif;
        color:#1F497D;
        font-weight:normal;
        font-style:normal;
        text-decoration:none none;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}
@page WordSection1
        {size:612.0pt 792.0pt;
        margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
        {page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body bgcolor="white" lang="EN-GB" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US">Hi Graham,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US">&gt;&gt;</span> I have not been able to find the English and German Holmes files used in Stefan Evert's tutorial<span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US"><o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US">They aren’t currently in the release package (which I believe is probably in need of an update!) but they are in the SVN tree:
</span><a href="https://sourceforge.net/p/cwb/code/HEAD/tree/doc/corpora/encoding_tutorial_data/">https://sourceforge.net/p/cwb/code/HEAD/tree/doc/corpora/encoding_tutorial_data/</a><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US"><o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US">&gt;&gt;</span> what exactly is the required input format for the cwb-align command?<span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US"><o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US">cwb-align is an aligner, you don’t need to use it if you are dealing with pre-aligned data. Its
<b>output</b> format is what you need to generate (to then input into cwb-align-encode, the program that actually creates the a-attribute). This means in the tutorial, you can skip to sec 8.4.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US">The format necessary (called “.align file”) is described in the section “OUTPUT FORMAT” of
<b>man cwb-align</b>. Pasted below. &nbsp;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US">You would need to output the begin/end points of the s attributes in the first corpus (using cwb-s-decode), and then combine that with
 the begin/end points of the s attribute in the second corpus, to get the 4-tuples needed by cwb-align-encode. And then you’d need to add
<b>1:1</b> at the end of each line. That 5 column file defines the alignment.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US">&gt;&gt;</span> If I have .vrt files created in two languages with treetagger, and if I have prealigned these, in such a way that the first
 sentence of one file corresponds to the first sentence of the other, the second sentence to the second, etc. then is that enough?<span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US"><o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US">Yes, that is enough if you use the method above to create the input file for cwb-align-encode.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US">&gt;&gt;</span> Or should my files also including numerical information with all sentences numbered?<span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US"><o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US">If your ranges do actually have unique ID attributes &lt;s id=”..”&gt; or maybe &lt;s n=”..”&gt;, you can optionally use
<b>cwb-align-import</b> instead of cwb-align-encode: see section 8.5 . This has a different input format than cwb-align-encode: it identifies matching segments by some ID attribute, rather than by spelling out the corpus token position ranges, as in cwb-align-encode.
 This is called a <b>bead file</b> whereas the other is called an <b>.align file</b> (not the clearest terminology I know).<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US">Your sentences would have to be numbered through your whole corpus. In which case, your bead file could look like this:<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US">CORP1&nbsp;&nbsp;&nbsp; CORP2&nbsp;&nbsp;&nbsp; s&nbsp;&nbsp;&nbsp;&nbsp; {n}<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US">1&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US">2&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 2<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US">3&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 3<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US">[…]
<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US">&nbsp;which might be easier to auto-generate than the cwb-align-encode format based on raw corpus position numbers.
<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US">Finally, whatever method you use, don’t forget the need to update the registry file with the new a-attribute.
<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US">best<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US">Andrew.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US">=====================<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US">From man cwb-align:<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US">OUTPUT FORMAT<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; cwb-align's output file uses CWB's &quot;.align&quot; file format. &quot;.align&quot; files are ASCII text files<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (although they may contain characters from another encoding if the corpus IDs include non-ASCII<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; characters), formatted as follows.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; The first line is a header line, which contains the following four elements, separated by tabs:<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;·&nbsp;&nbsp; The ID of the source corpus<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ·&nbsp;&nbsp; The ID of the aligned s-attribute (the grid attribute - see above)<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ·&nbsp;&nbsp; The ID of the target corpus<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ·&nbsp;&nbsp; The ID of the aligned s-attribute (repeated)<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Following the header, each individual line represents a single pair of aligned regions in the<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; corpus.&nbsp; This is specified by six fields of information, separated by tabs. The six fields are<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; as follows:<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ·&nbsp;&nbsp; The beginning of the range in the source corpus (expressed as a cpos, i.e. a token number)<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ·&nbsp;&nbsp; The end of the range in the source corpus (expressed as a cpos)<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ·&nbsp;&nbsp; The beginning of the range in the target corpus (expressed as a cpos)<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ·&nbsp;&nbsp; The end of the range in the target corpus (expressed as a cpos)<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ·&nbsp;&nbsp; The type of alignment: 1:1, 2:1, 1:2 or 2:2<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ·&nbsp;&nbsp; The quality of the alignment: a score calculated by the alignment engine<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; For example,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 140&nbsp;&nbsp;&nbsp; 169&nbsp;&nbsp;&nbsp; 137&nbsp;&nbsp;&nbsp; 180&nbsp;&nbsp;&nbsp; 1:2<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; means that corpus position ranges [140,169] and [137,180] form a 1:2 alignment pair.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (The final field, the quality, is optional in this file format, and is absent in the example<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; above; however, cwb-align will always provide it.)<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Courier New&quot;;color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:&quot;Verdana&quot;,sans-serif;color:#1F497D;mso-fareast-language:EN-US"><o:p>&nbsp;</o:p></span></p>
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span lang="EN-US" style="color:windowtext">From:</span></b><span lang="EN-US" style="color:windowtext"> cwb-bounces@sslmit.unibo.it &lt;cwb-bounces@sslmit.unibo.it&gt;
<b>On Behalf Of </b>Graham Ranger -- UAPV<br>
<b>Sent:</b> 07 May 2019 13:57<br>
<b>To:</b> Open source development of the Corpus WorkBench &lt;cwb@sslmit.unibo.it&gt;<br>
<b>Subject:</b> [CWB] Aligning parallel corpora<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p>&nbsp;</o:p></p>
<p class="MsoNormal">Hello to all,<o:p></o:p></p>
<p class="MsoNormal"><o:p>&nbsp;</o:p></p>
<div>
<p class="MsoNormal">I have set up a parallel corpus on cqpweb using s-attributes for the visualisation of translations but I would like to be able to do the same thing more cleanly, using alignment attributes. However, try as I might, I cannot seem to follow
 the instructions in the encoding tutorial. I have not been able to find the English and German Holmes files used in Stefan Evert's tutorial for illustration. Now, what I would like to know is: what exactly is the required input format for the cwb-align command?
 If I have .vrt files created in two languages with treetagger, and if I have prealigned these, in such a way that the first sentence of one file corresponds to the first sentence of the other, the second sentence to the second, etc. then is that enough? Or
 should my files also including numerical information with all sentences numbered? I suspect this is a very naive question, but it's one that I do not seem to be able to find my way around without help!<br>
<br>
<o:p></o:p></p>
<p class="MsoNormal" align="right" style="text-align:right">Best,<br>
<br>
<o:p></o:p></p>
<p class="MsoNormal">Graham.<o:p></o:p></p>
</div>
</div>
</body>
</html>