[CWB] Suggestion: user intervention in constructing an index

Hardie, Andrew a.hardie at lancaster.ac.uk
Thu Mar 29 17:22:31 CEST 2018


Well, we’ll have to agree to disagree then. Speaking only for myself of course, I am against adding the behaviour you request. As I noted, you can always wrap CQP: part of the advantage of a nice simple model at the engine level is that it remains easy to embed in something more particularised. If you really insist on having this behaviour at the CQP level, you do have the option of patching the code for your own use…
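To illustrate what wrapping might look like, here is a minimal sketch of a front-end post-processing step. It assumes the wrapper has already obtained a concordance line as a list of token strings, and that a glue marker appears in that list as the literal token "<g/>" (modelled on the Manatee/Bonito glue tag). Both the marker string and the function name detokenize are assumptions made for the illustration; nothing here is part of CQP itself.

```python
GLUE = "<g/>"  # assumed glue marker; not something CQP interprets


def detokenize(tokens):
    """Join tokens with single spaces, except across a glue marker."""
    pieces = []
    glue_pending = False
    for tok in tokens:
        if tok == GLUE:
            # Remember that the next real token attaches without a space.
            glue_pending = True
            continue
        if pieces and not glue_pending:
            pieces.append(" ")
        pieces.append(tok)
        glue_pending = False
    return "".join(pieces)


if __name__ == "__main__":
    line = ["He", "said", GLUE, ":", '"', GLUE, "Yes", GLUE, ".", GLUE, '"']
    print(detokenize(line))
```

The engine stays neutral about what the marker means; only the wrapper gives it the "no space here" interpretation.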

best

Andrew.

From: cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it] On Behalf Of Ciarán Ó Duibhín
Sent: 29 March 2018 14:31
To: Open source development of the Corpus WorkBench <cwb at sslmit.unibo.it>
Subject: Re: [CWB] Suggestion: user intervention in constructing an index

Andrew said

the underlying engine is appropriately neutral about the semantics of any attribute name… so specifying a specific s-attribute as meaning “glue” would not be something ever to build in at the system level of CQP. Front ends can of course impose whatever requirements about attribute semantics that they like.

Thank you again, Andrew, but I remain of the view that a vertical file should provide some means of marking where a space does not belong between tokens in context. It could be a "+", as I used in an earlier post; an XML-like glue tag (s-attribute), as used in Manatee/Bonito; or an extra binary p-attribute (or better, two such attributes). Whatever it is, the concordance output should recognize it. I don't see this as semantic interpretation, but just as preserving the integrity of the original text. An XML-like tag is arguably the worst choice, but I am inclined to go with whatever already works.

Regards,
Ciarán
