[CWB] Suggestion: user intervention in constructing an index
Ruprecht von Waldenfels
ruprecht.waldenfels at gmx.net
Sat Mar 31 08:22:38 CEST 2018
Dear Vlado,
Interesting, that explains why the spaces are still there in my corpus.
Where do I turn them off?
I think that for users who need to copy text it really is a necessary
feature.
Best!
Ruprecht
On 30.03.2018 at 18:25, Vladimír Benko wrote:
> Dear All,
>
>> Just a small rectification re: |<g/>| in Manatee/Bonito: turns out it
>> /is/ an opt-in configuration after all, cf.
>> https://groups.google.com/a/sketchengine.co.uk/d/msg/noske/lYHa3WSb4L8/6ycvtxCYAwAJ.
>> Sorry if I misled anyone earlier, I don’t use the feature myself, so
>> I only had a vague recollection it was somehow there. And apologies
>> to any (No)SkE devs who might be subscribed to the CWB list — this is
>> actually a nice and clean way to do it :)
>
> The <g/> feature in (No)SkE can be opted in at two levels. Firstly, by
> including the <g/> structure in the source vertical (this must be
> performed during tokenization) and defining it in the respective
> corpus configuration file, the corpus designer decides that the
> original appearance of spaces is preserved. Secondly, any corpus
> user can decide whether the <g/> structures are to be interpreted
> (which is a bit misleadingly called "displayed").
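
To illustrate the two levels described above, here is a minimal Python sketch of how a <g/> ("glue") line in a vertical file could be interpreted when text is rendered. It is not Manatee/Bonito code: the function name `detokenize`, the `interpret_glue` switch, and the sample tokens are all assumptions made up for this example, and the handling of other structural tags is deliberately simplified.

```python
GLUE = "<g/>"  # glue marker: no space between the neighbouring tokens


def detokenize(vertical_lines, interpret_glue=True):
    """Join the word forms of a vertical file into running text.

    Hypothetical helper for illustration only. Each token line is assumed
    to be tab-separated with the word form in the first column.
    """
    pieces = []
    glue_pending = False
    for raw in vertical_lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        if line.strip() == GLUE:
            # Level 2: the user decides whether glue is interpreted.
            glue_pending = interpret_glue
            continue
        if line.startswith("<") and line.endswith(">"):
            # Simplification: any other tag-like line is treated as structure.
            continue
        word = line.split("\t", 1)[0]
        if pieces and not glue_pending:
            pieces.append(" ")
        pieces.append(word)
        glue_pending = False
    return "".join(pieces)


# Level 1: the tokenizer has written <g/> lines into the vertical.
sample = [
    "<s>",
    "Hello\tUH",
    "<g/>",
    ",\t,",
    "world\tNN",
    "<g/>",
    "!\t.",
    "</s>",
]

glued = detokenize(sample, interpret_glue=True)
spaced = detokenize(sample, interpret_glue=False)
print(glued)
print(spaced)
```

With glue interpreted, the text can be copied without manual editing; with it ignored, every token boundary stays visible as a space.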
>
> In our (No)SkE installations, we prefer preserving information about
> spaces for the text displayed on the screen, as the two main groups of
> our corpus users (lexicographers and students of foreign languages)
> typically need to copy longer text fragments, which would otherwise
> require manual editing.
>
> I admit, however, that the use of <g/> may also confuse corpus users, as
> some token boundaries become "hidden" and the tokenization policy is
> less apparent :-)
>
> Best regards,
>
> Vlado B, 18:20
>
>
> --
> Vladimír Benko
>
> Université Comenius de Bratislava
> Chaire UNESCO de communication
> plurilingue et multiculturelle
>
> Šafárikovo námestie 6, SK-81499 Bratislava
>
> http://unesco.uniba.sk/guest/
> https://www.facebook.com/araneawebcorpora/
> https://vk.com/araneawebcorpora
>
>
>
> _______________________________________________
> CWB mailing list
> CWB at sslmit.unibo.it
> http://liste.sslmit.unibo.it/mailman/listinfo/cwb