<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">Dear Vlado,<br>
interesting, that explains why the spaces are still there in my
corpus. Where do I turn them off? <br>
<br>
I think it's a really necessary feature for those users who need
to copy text.<br>
<br>
Best!<br>
Ruprecht<br>
<br>
On 30.03.2018 at 18:25, Vladimír Benko wrote:<br>
</div>
<blockquote type="cite"
cite="mid:84c7efcb-6b67-029e-17ce-7054be218779@uniba.sk">
<div class="moz-cite-prefix">Dear All,<br>
<br>
</div>
<blockquote type="cite"
cite="mid:0bf0eef9-57f9-16ad-b09b-f0bb6d644cb4@ff.cuni.cz">Just
a small rectification re: <code style="font-size: 0.85em; font-family: Consolas,Inconsolata,Courier,monospace;margin: 0px 0.15em; padding: 0px 0.3em; white-space: pre-wrap; border: 1px solid rgb(234, 234, 234); background-color: rgb(248, 248, 248); border-radius: 3px; display: inline;"><g/></code>
in Manatee/Bonito: turns out it <em>is</em> an opt-in
configuration after all, cf. <a
href="https://groups.google.com/a/sketchengine.co.uk/d/msg/noske/lYHa3WSb4L8/6ycvtxCYAwAJ"
moz-do-not-send="true">https://groups.google.com/a/sketchengine.co.uk/d/msg/noske/lYHa3WSb4L8/6ycvtxCYAwAJ</a>.
Sorry if I misled anyone earlier, I don’t use the feature
myself, so I only had a vague recollection it was somehow there.
And apologies to any (No)SkE devs who might be subscribed to the
CWB list — this is actually a nice and clean way to do it :)</blockquote>
<br>
The <g/> feature in (No)SkE is opted into at two levels.
Firstly, by including the <g/> structure in the source
vertical (this must be done during tokenization) and
defining it in the respective corpus configuration file, the
corpus designer decides that the original spacing is
preserved. Secondly, any corpus user can decide whether the
<g/> structures are to be interpreted (which is, a bit
misleadingly, called "displayed").<br>
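<br>
As a rough illustration of what "interpreting" the glue structure means, here is a minimal Python sketch. It is not (No)SkE code. It assumes a vertical with one token per line, optional tab-separated attributes after the word form, and a <g/> line meaning "no space before the next token". The names GLUE, detokenize and interpret_glue are hypothetical.<br>
<pre wrap="">
```python
# Hypothetical illustration only; not part of Manatee/Bonito or (No)SkE.
GLUE = "<g/>"  # assumed to mean: no space between the neighbouring tokens


def detokenize(vertical_lines, interpret_glue=True):
    """Join the word forms of a vertical (one token per line) into text.

    With interpret_glue=False every token boundary becomes a space,
    which mimics a display where the glue structure is ignored.
    """
    pieces = []
    glue_pending = False
    for raw in vertical_lines:
        line = raw.strip()
        if not line:
            continue
        if line == GLUE:
            # Remember that the next token attaches to the previous one.
            glue_pending = True
            continue
        if line.startswith("<") and line.endswith(">"):
            # Any other structural markup is skipped in this sketch.
            continue
        token = line.split("\t")[0]  # first column is the word form
        if pieces and not (interpret_glue and glue_pending):
            pieces.append(" ")
        pieces.append(token)
        glue_pending = False
    return "".join(pieces)


sample = ["Hello", GLUE, ",", "world\tworld\tNN", GLUE, "!"]
glued = detokenize(sample)
spaced = detokenize(sample, interpret_glue=False)
print(glued)
print(spaced)
```
</pre>
With glue interpreted, the copied text needs no manual editing; without it, every token boundary stays visible.<br>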
<br>
In our (No)SkE installations, we prefer to preserve the information
about spaces for the text displayed on the screen, as two main
groups of our corpus users (lexicographers and students of foreign
languages) typically need to copy longer text fragments, which
would otherwise require manual editing.<br>
<br>
I admit, however, that the use of <g/> may also confuse corpus
users, as some token boundaries become "hidden" and the tokenization
policy is less apparent :-)<br>
<br>
Best regards,<br>
<br>
Vlado B, 18:20<br>
<br>
<br>
<div class="moz-signature">-- <br>
<font color="navy">Vladimír Benko</font>
<p> Université Comenius de Bratislava<br>
Chaire UNESCO de communication<br>
plurilingue et multiculturelle</p>
<p> Šafárikovo námestie 6, SK-81499 Bratislava</p>
<p> <a class="moz-txt-link-freetext"
href="http://unesco.uniba.sk/guest/" moz-do-not-send="true">http://unesco.uniba.sk/guest/</a><br>
<a class="moz-txt-link-freetext"
href="https://www.facebook.com/araneawebcorpora/"
moz-do-not-send="true">https://www.facebook.com/araneawebcorpora/</a><br>
<a class="moz-txt-link-freetext"
href="https://vk.com/araneawebcorpora"
moz-do-not-send="true">https://vk.com/araneawebcorpora</a> </p>
</div>
<br>
<br>
<pre wrap="">_______________________________________________
CWB mailing list
<a class="moz-txt-link-abbreviated" href="mailto:CWB@sslmit.unibo.it">CWB@sslmit.unibo.it</a>
<a class="moz-txt-link-freetext" href="http://liste.sslmit.unibo.it/mailman/listinfo/cwb">http://liste.sslmit.unibo.it/mailman/listinfo/cwb</a>
</pre>
</blockquote>
<p><br>
</p>
</body>
</html>