[CWB] Re: Search list

Trklja, Alex A.Trklja at exeter.ac.uk
Fri Mar 27 17:03:36 CET 2015


Hi Stefan, 

Thank you very much for your suggestions. I tried combining grep with 'cwb-scan-corpus', but that didn't work either. I'll try your suggestion of generating the queries with Perl.

Alex
________________________________________
From: cwb-bounces at sslmit.unibo.it [cwb-bounces at sslmit.unibo.it] on behalf of cwb-request at sslmit.unibo.it [cwb-request at sslmit.unibo.it]
Sent: 27 March 2015 11:00
To: cwb at sslmit.unibo.it
Subject: CWB Digest, Vol 98, Issue 24

Send CWB mailing list submissions to
        cwb at sslmit.unibo.it

To subscribe or unsubscribe via the World Wide Web, visit
        http://devel.sslmit.unibo.it/mailman/listinfo/cwb
or, via email, send a message with subject or body 'help' to
        cwb-request at sslmit.unibo.it

You can reach the person managing the list at
        cwb-owner at sslmit.unibo.it

When replying, please edit your Subject line so it is more specific
than "Re: Contents of CWB digest..."


Today's Topics:

   1. Search list (Trklja, Alex)
   2. Re: Search list (Stefan Evert)


----------------------------------------------------------------------

Message: 1
Date: Fri, 27 Mar 2015 07:44:28 +0000
From: "Trklja, Alex" <A.Trklja at exeter.ac.uk>
To: "cwb at sslmit.unibo.it" <cwb at sslmit.unibo.it>
Subject: [CWB] Search list
Message-ID:
        <84E19F33877B2840BBF4AFF4265B21BB3E4B01A5 at VMEXCHANGEMBS6A.isad.isadroot.ex.ac.uk>

Content-Type: text/plain; charset="us-ascii"

Dear all,

I would like to search for several multi-word expressions in my corpus, and I was wondering whether it is possible to create a search list directly in CWB or to run a grep command within CWB. Thanks!

This is an example of my search list:
in that regard
in any event
in those circumstances
the fact that
as regards the
on the contrary

Best
Alex

------------------------------

Message: 2
Date: Fri, 27 Mar 2015 09:18:36 +0100
From: Stefan Evert <stefanML at collocations.de>
To: CWBdev Mailing List <cwb at sslmit.unibo.it>
Subject: Re: [CWB] Search list
Message-ID: <132E2EAA-A7B6-45A4-A6D6-770A2680D21B at collocations.de>
Content-Type: text/plain; charset=utf-8

> I would like to search for several multi-word expressions in my corpus, and I was wondering whether it is possible to create a search list directly in CWB or to run a grep command within CWB. Thanks!

You can't read a list of multiword expressions into CQP and search for them; that's only possible with lists of single words.

For a given list, you can design a CQP query that finds all expressions in the list, e.g.

        "in"%c "that"%c "regard"%c | "in"%c "any"%c "event"%c | "in"%c "those"%c "circumstances"%c | "the"%c "fact"%c "that"%c | "as"%c "regards"%c "the"%c | "on"%c "the"%c "contrary"%c;

This probably won't work if you have a list of several hundred multiword expressions, though I don't know the precise limits off the top of my head. There might be tighter limits if you enter the query in an interactive session (with the -e flag), because the entire input line is read into an editing buffer first.
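
For longer lists, a flat alternation of this shape can be generated rather than typed by hand. The following is only a minimal sketch in Python, not part of CWB: the function names cqp_token and flat_query are made up for illustration, and it assumes that the expressions are whitespace-separated in the same way the corpus is tokenised and that no word contains a double quote. Regular-expression metacharacters are backslash-escaped so that each word is matched literally.

```python
import re

# Characters that have a special meaning inside a CQP regular expression.
_META = re.compile(r'([\\.?*+|()\[\]{}^$])')


def cqp_token(word, flags="%c"):
    """Return one quoted CQP token pattern for a literal word (hypothetical helper)."""
    if '"' in word:
        # Assumption: quote characters are simply rejected in this sketch.
        raise ValueError("double quotes are not handled: %r" % word)
    return '"' + _META.sub(r'\\\1', word) + '"' + flags


def flat_query(expressions, flags="%c"):
    """Join all expressions into one alternation, one branch per expression."""
    branches = []
    for expr in expressions:
        tokens = expr.split()  # assumption: whitespace matches corpus tokenisation
        if tokens:
            branches.append(" ".join(cqp_token(t, flags) for t in tokens))
    return " | ".join(branches) + ";"


if __name__ == "__main__":
    search_list = ["in that regard", "in any event", "on the contrary"]
    print(flat_query(search_list))
```

The printed query can then be pasted into CQP or written to a file. The %c flag is kept as the default only because the example query above uses it.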

You can speed this up a little and push the limits if you combine expressions with the same prefix, in this case

        "in"%c ("that"%c "regard"%c | "any"%c "event"%c | "those"%c "circumstances"%c) | "the"%c "fact"%c "that"%c | "as"%c "regards"%c "the"%c | "on"%c "the"%c "contrary"%c;
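
Merging shared prefixes can also be done mechanically by first collecting the expressions in a prefix tree. Again, this is only a sketch under assumptions: the names build_trie, render and grouped_query are hypothetical, words are assumed to be plain (no quotes or regular-expression metacharacters), and expressions are split on whitespace. If one expression is a complete prefix of another, the sketch emits an optional group; which variant CQP then reports depends on its matching strategy, which is not handled here.

```python
END = None  # sentinel key marking "an expression ends at this node"


def build_trie(expressions):
    """Collect whitespace-tokenised expressions in a nested-dict prefix tree."""
    root = {}
    for expr in expressions:
        node = root
        for tok in expr.split():
            node = node.setdefault(tok, {})
        node[END] = {}
    return root


def branches(node, flags):
    """Return one CQP fragment per distinct next word below this node."""
    out = []
    for tok, child in node.items():
        if tok is END:
            continue
        rest = render(child, flags)
        piece = '"%s"%s' % (tok, flags)  # assumption: tok needs no escaping
        out.append(piece + (" " + rest if rest else ""))
    return out


def render(node, flags):
    """Render all continuations below a node, grouping alternatives in parentheses."""
    alts = branches(node, flags)
    if not alts:
        return ""
    body = " | ".join(alts)
    if END in node:
        return "(" + body + ")?"  # a shorter expression ends here
    return "(" + body + ")" if len(alts) > 1 else body


def grouped_query(expressions, flags="%c"):
    """Build a single query in which shared prefixes appear only once."""
    return " | ".join(branches(build_trie(expressions), flags)) + ";"


if __name__ == "__main__":
    search_list = [
        "in that regard", "in any event", "in those circumstances",
        "the fact that", "as regards the", "on the contrary",
    ]
    print(grouped_query(search_list))
```

For the six expressions from the original search list, this should reproduce the combined query shown above, with the three "in" expressions sharing a single first token.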

I'd recommend that you generate such queries in Perl or some other high-level language.

Hope this helps,
Stefan



------------------------------


