[CWB] Regex/backreferencing

Hardie, Andrew a.hardie at lancaster.ac.uk
Mon Mar 13 10:50:58 CET 2017


OK, then upgrading will probably fix it. If it doesn’t, try a double backslash (i.e. \\1 instead of \1). – Andrew.

From: cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it] On Behalf Of Susanne Flach
Sent: 13 March 2017 09:49
To: Open source development of the Corpus WorkBench
Subject: Re: [CWB] Regex/backreferencing

Hi Andrew,

Ah, ok, that might be it: we’re using CWB 3.0 at the moment. I’ll test this.

Thanks!
Susanne


On 13 Mar 2017, at 10:45, Hardie, Andrew <a.hardie at lancaster.ac.uk> wrote:

The query

[word="(.+)-?\1"]

works as expected for me, both via the CQPweb interface and on the command line: i.e. it finds words consisting of the same element twice, with or without an intervening hyphen.
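For anyone who wants to check the pattern outside CQP, here is a minimal sketch in Python. It assumes that Python's re module treats this particular pattern the same way PCRE does, and it uses re.fullmatch to mimic CQP matching the regex against the whole token. The token list and the LITERAL_ONE pattern are made up for illustration; the latter is only a guess at how an engine without backreference support might read \1, which would be consistent with hits that all end in 1.

```python
import re

# Same pattern as in the CQP query above. fullmatch() anchors the regex to
# the whole token, which is how CQP applies a regex to an attribute value.
REDUP = re.compile(r"(.+)-?\1")

# Hypothetical: what the pattern amounts to if "\1" is read as a literal "1"
# instead of a backreference (an assumption, not confirmed CWB behaviour).
LITERAL_ONE = re.compile(r"(.+)-?1")


def is_reduplicated(token: str) -> bool:
    """True if the token is the same element twice, optionally hyphenated."""
    return REDUP.fullmatch(token) is not None


# Made-up sample tokens, not taken from any corpus.
tokens = ["bye-bye", "couscous", "hello", "page1", "Bye-bye"]

hits = [t for t in tokens if is_reduplicated(t)]
literal_hits = [t for t in tokens if LITERAL_ONE.fullmatch(t)]

print(hits)          # tokens matched with a real backreference
print(literal_hits)  # tokens matched if "\1" is taken as a literal "1"
```

Note that the backreference is case-sensitive, so "Bye-bye" is not matched in this sketch.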

What version of CWB are you using? If it’s a version that predates the use of PCRE as the regex engine, that could explain this…

best

Andrew

From: cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it] On Behalf Of Susanne Flach
Sent: 13 March 2017 09:10
To: Open source development of the Corpus WorkBench
Subject: [CWB] Regex/backreferencing

Dear CWBists,

Can you backreference on the token level in CQP?

This question has been nagging me from time to time, and now a student wants to investigate reduplication. For querying across token boundaries, labels sort of do the trick (i.e. a:[] b:[] :: a.word = b.word), except that they don’t seem to ignore case. For reduplication within a token, however, the pattern [word="(.+)-?\1"] only returns strings ending in 1.

Is this possible in CQP? Plus, is there a way to ignore case in labels?
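To make the target behaviour concrete, here is a minimal sketch in Python of what a case-insensitive version of the label comparison would return. This is not CQP syntax and says nothing about how CQP handles it; the function name adjacent_repeats and the sample sentence are made up for illustration.

```python
def adjacent_repeats(tokens, ignore_case=True):
    """Return (index, first, second) for each pair of equal neighbouring tokens."""
    # casefold() gives a case-insensitive comparison key; the identity
    # function keeps the comparison case-sensitive.
    key = str.casefold if ignore_case else (lambda s: s)
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(tokens, tokens[1:]))
        if key(a) == key(b)
    ]


# Made-up sample sentence, already tokenised.
sentence = ["Very", "very", "nice", "bye", "bye", "now"]

print(adjacent_repeats(sentence))                     # ignoring case
print(adjacent_repeats(sentence, ignore_case=False))  # exact comparison only
```

With exact comparison only the "bye bye" pair is found; ignoring case also finds "Very very".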

Any ideas or pointers on (advanced) functions would be much appreciated.

Best & thanks
Susanne


_______________________________________________
CWB mailing list
CWB at sslmit.unibo.it
http://liste.sslmit.unibo.it/mailman/listinfo/cwb
