[CWB] CQP: shared collocates
Sylvain Loiseau
sylvain.loiseau at wanadoo.fr
Sun Oct 7 21:38:01 CEST 2012
Hi,
You can use the rcqp R package to do this kind of filtering on lists of forms. For instance:
> library(rcqp)
> dickens <- corpus("DICKENS")
>
> # nearly: the target marker '@' flags the token preceding "nearly"
> nearly <- subcorpus(dickens, '@[] "nearly"')
> # extract the frequency list of the collocates:
> fl.nearly <- cqp_flist(nearly, "target", "word")
> # fl.nearly is a named vector: each name is a word form,
> # each value is its frequency in the subcorpus
>
> # almost:
> almost <- subcorpus(dickens, '@[] "almost"')
> fl.almost <- cqp_flist(almost, "target", "word")
>
> # forms found immediately before both "nearly" and "almost":
> intersect(names(fl.nearly), names(fl.almost))
[1] "Their" "n't" "that" "," "Servant"
[6] "with" "There" "as" "the" "both"
[11] "shall" "doubt" "Marley" "on" "no"
[16] "simile" "of" "in" "long" ";"
[21] "have" "each" "to" "my" "sole"
[26] "I" "." "it" "It" "But"
[31] "Scrooge" "readers" "him" "lips" "Sometimes"
[36] "a" "relate" "these" "'s" "years"
[41] "he" "?" "door-nail" "coffin-nail" "been"
[46] "would" "'" "December" "lay" "You"
[51] "his" "said" "felt" "merry" "me"
[56] "be"
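If you also want the frequencies of the shared forms, and not only the forms themselves, the two named vectors can be lined up on their common names. Here is a minimal base-R sketch of that idea. The two vectors are toy stand-ins for fl.nearly and fl.almost, the counts are invented for illustration (they are not taken from DICKENS), and sorting by the smaller of the two counts is only one possible choice:

```r
# Toy stand-ins for the named vectors returned by cqp_flist();
# the counts are invented for illustration only.
fl.nearly <- c(the = 12, of = 7, was = 3, it = 5)
fl.almost <- c(it = 9, the = 20, had = 4, of = 2)

# Forms attested with both items
shared <- intersect(names(fl.nearly), names(fl.almost))

# One row per shared form, with its frequency next to each item
shared.freq <- data.frame(
  form   = shared,
  nearly = unname(fl.nearly[shared]),
  almost = unname(fl.almost[shared]),
  stringsAsFactors = FALSE
)

# Sort by the smaller of the two counts, so that forms that are
# frequent with both items come first
shared.freq <- shared.freq[order(-pmin(shared.freq$nearly, shared.freq$almost)), ]
print(shared.freq)
```

With the real frequency lists, the same lines should work unchanged once the two toy vectors are dropped.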
You can also select as collocates all the tokens found in a given span around "nearly" and "almost", using the left.context and right.context options of the cqp_flist function:
> fl.nearly <- cqp_flist(nearly, "target", "word", left.context=3, right.context=0)
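To make the idea of a span concrete, here is a toy base-R illustration of a three-token left window. It is only a sketch and not how rcqp works internally: the token sequence is invented, left_window is a hypothetical helper that is not part of rcqp, and the window is counted from the node word itself, whereas the cqp_flist call above counts from the "target" anchor:

```r
# Invented token sequence; "nearly" occurs at positions 3 and 9.
tokens <- c("it", "was", "nearly", "dark", "and",
            "it", "was", "very", "nearly", "over")

# Hypothetical helper (not part of rcqp): collect the tokens found
# up to `left` positions before each occurrence of `node`.
left_window <- function(tokens, node, left = 3) {
  hits <- which(tokens == node)
  unlist(lapply(hits, function(i) {
    start <- max(1, i - left)
    if (start > i - 1) return(character(0))  # node is the first token
    tokens[start:(i - 1)]
  }))
}

# Frequency list of the forms found in the window
fl <- table(left_window(tokens, "nearly", left = 3))
print(fl)
```

The first occurrence has only two tokens to its left, so its window is cut short at the beginning of the sequence.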
Best,
Sylvain
On 7 Oct 2012, at 15:00, Aleksandar Trklja wrote:
> Hi Martí,
>
> thank you for your reply. I'm sorry my question was unclear.
>
> What I mean by 'shared collocates' are the collocates that occur both with a word x and with a word y. Say I want to find the collocates that 'almost' and 'nearly' share. The '|' operator returns the collocates that occur with either of the two words, not only those that occur with both (e.g. 'certainly' occurs with 'almost' but not with 'nearly'). So I guess I'd need something like an 'AND' operator here instead of 'OR'.
>
> Cheers
> Alex
>
>
>
> ________________________________________
> From: cwb-bounces at sslmit.unibo.it [cwb-bounces at sslmit.unibo.it] on behalf of Martí Quixal [marti.quixal at gmail.com]
> Sent: 07 October 2012 13:01
> To: cwb at sslmit.unibo.it
> Subject: Re: [CWB] CQP: shared collocates (Aleksandar Trklja)
>
> Hi Aleksandar,
>
> I don't know if I understand your question, but do you mean this?
>
> DICKENS> ".*" "year|people";
> 2077 matches. Use 'cat' to show.
> DICKENS> group Last match lemma;
> #---------------------------------------------------------------------
> (none) the 355
> of 168
> a 150
> other 100
> some 65
> (...)
>
> For more on group check this:
> http://cwb.sourceforge.net/files/CQP_Tutorial/node20.html
>
> Best
> mq
> On Sun, Oct 7, 2012 at 5:00 AM, <cwb-request at sslmit.unibo.it> wrote:
> Send CWB mailing list submissions to
> cwb at sslmit.unibo.it
>
> To subscribe or unsubscribe via the World Wide Web, visit
> http://devel.sslmit.unibo.it/mailman/listinfo/cwb
> or, via email, send a message with subject or body 'help' to
> cwb-request at sslmit.unibo.it
>
> You can reach the person managing the list at
> cwb-owner at sslmit.unibo.it
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of CWB digest..."
>
>
> Today's Topics:
>
> 1. CQP: shared collocates (Aleksandar Trklja)
> 2. Installing CQPWeb (Martí Quixal)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Sat, 6 Oct 2012 09:12:46 +0000
> From: Aleksandar Trklja <AXT899 at bham.ac.uk>
> To: Open source development of the Corpus WorkBench
> <cwb at sslmit.unibo.it>
> Subject: [CWB] CQP: shared collocates
> Message-ID:
> <A584B1A99417C443AD247357ADAEEE050A863A80 at mbx01.adf.bham.ac.uk>
> Content-Type: text/plain; charset="us-ascii"
>
> Dear all,
>
> is it possible to produce with CQP a list that contains only shared collocates of two or more lexical items?
>
> Many thanks for your help.
>
> Best
> Alex
>
>
>
> ------------------------------
>
> Message: 2
> Date: Sat, 6 Oct 2012 21:48:22 -0500
> From: Martí Quixal <marti.quixal at gmail.com>
> To: cwb at sslmit.unibo.it
> Subject: [CWB] Installing CQPWeb
> Message-ID:
> <CAMtTwm8Nb5H4eHvtRPfvf+6qc+QSZSqMzRM+5++RgvsSToGg9w at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Dear list members,
>
> I am installing CQPWeb, and everything seemed to be ok until I typed the
> url in my browser:
>
> http://localhost/spintx-web/adm
>
> Then I got this message:
>
> CQPweb encountered an error and could not continue.
> You do not have permission to use this program.
>
> And then I looked into the apache error log and saw this other info:
>
> [Sat Oct 06 21:34:32 2012] [error] [client ::1] File does not exist:
> /Library/WebServer/Documents/spintx-web/css/CQPweb.css, referer:
> http://localhost/spintx-web/adm/
>
> My questions are:
>
> 1) Could this missing css file be causing the problem? (sounds weird...)
> so...
>
> 2) I used the automatic php configuration file and when the script asked
> for a user I gave a user that did not exist as a system user (I mistyped
> it, spintex-web instead of spintx-web). I added manually the system user I
> wanted to use, just in case, but this does not seem to improve anything.
> Then I had the impression that the user created with the automatic php
> config file is only a CQPWeb admin not a system user, am I wrong?
>
> So, do you have any recommendation? What else should I be looking to?
>
> I am running the whole thing (apache2, mysql, php, cwb tools, etc.) in the
> latest version (also CQPWeb from svn, not download link) on a Mac OSX
> 10.7.5.
>
> - PHP 5.3.15 with Suhosin-Patch (cli) (built: Jul 31 2012 14:49:18)
> - Server version: 5.5.28 MySQL Community Server (GPL)
> - Server version: Apache/2.2.22 (Unix)
> -- Server built: Jul 12 2012 15:11:26
> - CQP Version: 3.0.0
> (and latest versions of all perl modules as on the sourceforge page, except
> for CQPWeb, which is the svn version as recommended)
>
> Thanks in advance!
>
> --
> Martí Quixal
> Computational Linguist & Educational Technologist
> http://www.iqubo.org/quixal
>
> ------------------------------
>
>
>
> End of CWB Digest, Vol 70, Issue 6
> **********************************
>
>
>
> --
> Martí Quixal
> Computational Linguist & Educational Technologist
> http://www.iqubo.org/quixal