[CWB] Assigning cqp queries

Trklja, Alex A.Trklja at exeter.ac.uk
Tue Dec 15 13:29:27 CET 2015


Hi Stefan, 

My bad, sorry. I did mean concatenation of the query results. My example was just for illustrative purposes. True, for such examples the individual queries can easily be run separately. I was thinking about a more general implementation: for example, saving a range of grammar patterns, renaming them in terms of functional categories and then possibly combining them.

It seems I should be able to do what I want with macros and annotation of query results. Thanks for your suggestion! I saw Richard's macro for passive verbs on the CWB wiki and I'll try to write something similar for my queries. 

Btw, do you have any idea why I am getting an error when I try to load macro definitions?

CQP Error:
        MACRO syntax error (file 'macros.txt', line 2)
		
This is for the example from the CQP Tutorial:

# this is a comment and will be ignored
MACRO np(0)
  [pos = "DT"]    # another comment
  ([pos = "RB.*"]? [pos = "JJ.*"])*
  [pos = "NNS?"]
;

I've tried other examples and the error always refers to the line with the name of the macro.

Thanks.

Cheers
Alex

-----Original Message-----
From: cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it] On Behalf Of cwb-request at sslmit.unibo.it
Sent: 15 December 2015 12:00
To: cwb at sslmit.unibo.it
Subject: CWB Digest, Vol 107, Issue 10

Send CWB mailing list submissions to
	cwb at sslmit.unibo.it

To subscribe or unsubscribe via the World Wide Web, visit
	http://devel.sslmit.unibo.it/mailman/listinfo/cwb
or, via email, send a message with subject or body 'help' to
	cwb-request at sslmit.unibo.it

You can reach the person managing the list at
	cwb-owner at sslmit.unibo.it

When replying, please edit your Subject line so it is more specific than "Re: Contents of CWB digest..."


Today's Topics:

   1. Re: Assigning cqp queries (Trklja, Alex)
   2. Re: Assigning cqp queries (Stefan Evert)


----------------------------------------------------------------------

Message: 1
Date: Tue, 15 Dec 2015 07:06:33 +0000
From: "Trklja, Alex" <A.Trklja at exeter.ac.uk>
To: "'cwb at sslmit.unibo.it'" <cwb at sslmit.unibo.it>
Subject: Re: [CWB] Assigning cqp queries
Message-ID:
	<AM2PR03MB07558631811C78E67960409EB8EE0 at AM2PR03MB0755.eurprd03.prod.outlook.com>
	
Content-Type: text/plain; charset="us-ascii"

Thanks for your suggestions and clarification. 

Yannick's suggestion works well for me. 

Hannah, set operations do not really combine queries the way I'd like it - at least not in Cygwin. For Union the result is an empty set, for Intersection 'count by pos' shows the queries separately: 
22045   JJ NN  [#0-#22044]
1557    MD VV  [#22045-#23601]

and for Difference either JJ NN or MD VV.


Best
Alex


-----Original Message-----
From: cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it] On Behalf Of cwb-request at sslmit.unibo.it
Sent: 14 December 2015 16:37
To: cwb at sslmit.unibo.it
Subject: CWB Digest, Vol 107, Issue 9


Today's Topics:

   1. Assigning cqp queries (Trklja, Alex)
   2. Re: Assigning cqp queries (Yannick Versley)
   3. Re: Assigning cqp queries (Hannah Kermes)
   4. Re: Assigning cqp queries (Hannah Kermes)
   5. Re: Assigning cqp queries (Hardie, Andrew)


----------------------------------------------------------------------

Message: 1
Date: Mon, 14 Dec 2015 11:30:02 +0000
From: "Trklja, Alex" <A.Trklja at exeter.ac.uk>
To: "'cwb at sslmit.unibo.it'" <cwb at sslmit.unibo.it>
Subject: [CWB] Assigning cqp queries
Message-ID:
	<AM2PR03MB0755BF8936FE07795A7EE705B8ED0 at AM2PR03MB0755.eurprd03.prod.outlook.com>
	
Content-Type: text/plain; charset="us-ascii"

Dear all, 

Is it possible in CQP to assign cqp queries to new variables and then re-use them or have you thought about including something along this line to Ziggurat/CWB4? Say, I have the following queries:

A= [pos='JJ'] [pos='NN']
B= [pos='MD'] [pos='VV']
C=[pos='RB'] [pos='VV']

And I would like to combine them in the following two ways:
D=A+B
E=A+C

Thanks.
Alex



------------------------------

Message: 2
Date: Mon, 14 Dec 2015 13:35:02 +0100
From: Yannick Versley <yversley at gmail.com>
To: Open source development of the Corpus WorkBench
	<cwb at sslmit.unibo.it>
Subject: Re: [CWB] Assigning cqp queries
Message-ID:
	<CAHXjEOYCutaPWQFdVeCSS9TW2sveau8xhvHCk48dR1HH_iVx_A at mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi Alex,

If you want to do that with the current CQP, the method I'd use (short of using the API) would be to dump A, B and C to files (CQP writes a list of starting/ending offsets) and then concatenate/sort these files and read them back in, as in
A = [pos='JJ'] [pos='NN'];
dump A > 'a.txt';
B = [pos='MD'] [pos='VV'];
dump B > 'b.txt';
# in the shell, outside CQP (sort numerically, not as text):
cat a.txt b.txt | sort -n > ab.txt
# back in CQP:
undump D < 'ab.txt';
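
The merge step can also be done outside the shell. The following is a minimal Python sketch of it, not part of CQP. It assumes that a dump table has one match per line with tab-separated integer columns, of which the first two are the start and end corpus positions. It also assumes that undump expects a first line giving the number of rows, which is how I recall the CQP tutorial describing it; check this against your CQP version. The function names are made up for illustration.

```python
def parse_dump(text):
    """Parse the text of a CQP dump table into (start, end) pairs.

    Only the first two columns are kept; lines with fewer than two
    fields (blank lines, a row-count header) are skipped.
    """
    ranges = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        ranges.append((int(fields[0]), int(fields[1])))
    return ranges


def merge_tables(tables):
    """Union of several tables, without duplicates, in numeric order."""
    merged = set()
    for table in tables:
        merged.update(table)
    return sorted(merged)


def format_undump(ranges):
    """Format a table with the row count on the first line (assumed header)."""
    rows = [f"{start}\t{end}" for start, end in ranges]
    return "\n".join([str(len(rows))] + rows) + "\n"
```

To use it, read a.txt and b.txt as text, pass each through parse_dump, merge the two tables and write the output of format_undump to ab.txt. Sorting the integer pairs avoids the problem that a plain text sort puts 100 before 99.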

Best,
Yannick


------------------------------

Message: 3
Date: Mon, 14 Dec 2015 13:58:47 +0100
From: Hannah Kermes <h.kermes at mx.uni-saarland.de>
To: Open source development of the Corpus WorkBench
	<cwb at sslmit.unibo.it>
Subject: Re: [CWB] Assigning cqp queries
Message-ID: <566EBD07.7010000 at mx.uni-saarland.de>
Content-Type: text/plain; charset="utf-8"; Format="flowed"

Hi Alex,

If this hasn't changed in the newest version, you can combine named queries in the following way:
A = intersection B C;   A = B ∩ C
A = union B C;          A = B ∪ C
A = difference B C;     A = B \ C

Best
Hannah


------------------------------

Message: 4
Date: Mon, 14 Dec 2015 15:16:22 +0100
From: Hannah Kermes <h.kermes at mx.uni-saarland.de>
To: Open source development of the Corpus WorkBench
	<cwb at sslmit.unibo.it>
Subject: Re: [CWB] Assigning cqp queries
Message-ID: <566ECF36.5040001 at mx.uni-saarland.de>
Content-Type: text/plain; charset="utf-8"; Format="flowed"

Dear all,

A follow-up question to this: is such a union or intersection possible in CQPweb (e.g. with saved queries), or is this planned?

Best
Hannah


------------------------------

Message: 5
Date: Mon, 14 Dec 2015 15:36:50 +0000
From: "Hardie, Andrew" <a.hardie at lancaster.ac.uk>
To: Open source development of the Corpus WorkBench
	<cwb at sslmit.unibo.it>
Subject: Re: [CWB] Assigning cqp queries
Message-ID:
	<28078EC3FBF1B940A3EF3D0D19BE351D7FABB019 at EX-1-MB2.lancs.local>
Content-Type: text/plain; charset="utf-8"

It's not currently possible but it is on the "features to implement" list.

Unfortunately that's a very long list...

Andrew


------------------------------

_______________________________________________
CWB mailing list
CWB at sslmit.unibo.it
http://devel.sslmit.unibo.it/mailman/listinfo/cwb


End of CWB Digest, Vol 107, Issue 9
***********************************


------------------------------

Message: 2
Date: Tue, 15 Dec 2015 08:55:01 +0100
From: Stefan Evert <stefanML at collocations.de>
To: CWBdev Mailing List <cwb at sslmit.unibo.it>
Subject: Re: [CWB] Assigning cqp queries
Message-ID: <B8065FDD-6B86-4552-A296-14E35025906A at collocations.de>
Content-Type: text/plain; charset=us-ascii


> On 15 Dec 2015, at 08:06, Trklja, Alex <A.Trklja at exeter.ac.uk> wrote:
> 
> Yannick's suggestion works well for me. 

But it does exactly the same thing as the "union" command in CQP!  So if Yannick's solution works for you, "union" will also work.

> Hannah, set operations do not really combine queries the way I'd like 
> it - at least not in Cygwin. For Union the result is an empty set,

I guess you mean "intersection" here?  Of course the intersection is empty, because the two query results are necessarily disjoint sets.
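
To make the set semantics concrete, here is a small Python sketch that treats each query result as a set of (match, matchend) corpus-position pairs. The positions are invented toy data, not output from a real corpus, and the code only mimics what the set operations mean; it does not call CQP.

```python
# Hypothetical hits for [pos="JJ"] [pos="NN"] and [pos="MD"] [pos="VV"].
jj_nn = {(0, 1), (10, 11), (42, 43)}
md_vv = {(5, 6), (20, 21)}

# A two-token span cannot be tagged both JJ NN and MD VV,
# so the two sets can never share a pair.
union = sorted(jj_nn | md_vv)
intersection = sorted(jj_nn & md_vv)
difference = sorted(jj_nn - md_vv)

print(union)         # all five ranges, in corpus order
print(intersection)  # empty list
print(difference)    # same ranges as jj_nn
```

The union has as many rows as the two results together, the intersection is empty, and the difference is just the first result again.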

> for Intersection 'count by pos' shows the queries separately: 
> 22045   JJ NN  [#0-#22044]
> 1557    MD VV  [#22045-#23601]

Well, since the two individual queries produce different POS patterns, when you count by pos they will be listed separately.

Perhaps you're actually looking for a _concatenation_ of the query results and just put us on the wrong track with misleading notation?  So that what you call "A + B" would return patterns of the form JJ NN MD VV?
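
For illustration only, this is what such a concatenation would mean on dumped results, sketched in Python outside CQP. It assumes each result is a list of (match, matchend) pairs and joins an A match with a B match only when B starts at the token directly after A ends. The function name is hypothetical and nothing like it exists in CQP.

```python
def concatenate(a_ranges, b_ranges):
    """Join each A match with every B match that starts at the next token."""
    # Index B matches by start position for direct lookup.
    b_ends_by_start = {}
    for start, end in b_ranges:
        b_ends_by_start.setdefault(start, []).append(end)

    joined = []
    for a_start, a_end in a_ranges:
        # A B match is adjacent if it begins right after the A match ends.
        for b_end in b_ends_by_start.get(a_end + 1, []):
            joined.append((a_start, b_end))
    return sorted(joined)
```

With A = [(0, 1), (10, 11)] and B = [(2, 3), (20, 21)], only the first A match has a B match starting right after it, so the result is the single range (0, 3).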

Then the answer is: no, that can't be done at the moment.  It would be relatively easy to implement in your special case, but I don't think people need exactly this operation often enough to make it worth adding to the query language.  A more general implementation quickly becomes fairly tricky.

Are you sure that you can't afford to run each combination query? Does it really take that long?  In this case, you'd save the individual queries as macros, which allows you to combine them in any way you want.

If you really need to make query _results_ reusable, the solution Hannah and I found was to annotate them as a new s-attribute in the corpus, which you can then use in new queries in the standard way.  This will actually be made a little easier in a Ziggurat-based implementation.

Best,
Stefan



------------------------------

_______________________________________________
CWB mailing list
CWB at sslmit.unibo.it
http://devel.sslmit.unibo.it/mailman/listinfo/cwb


End of CWB Digest, Vol 107, Issue 10
************************************

