[CWB] Impossible query
Hardie, Andrew
a.hardie at lancaster.ac.uk
Sat Feb 27 06:18:27 CET 2016
From what I know of Stack Overflow they might quite possibly reject this as off topic. Moreover, while I can't speak for Stefan or anyone else, of course, I personally have no intention of engaging with CWB-related questions anywhere but on this list.
ANYWAY: I suspect the best way to accomplish what you want is by tabulating (or dumping) the query and running a script across the tabulation/dump that implements the sub-selection that you describe. Then undump, then cat.
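To make the sub-selection step concrete, here is a minimal sketch in Python, not a tested recipe. It assumes the query result was written out with something like `tabulate A match, matchend, match lemma > "hits.tsv";`, so each line holds the match position, the matchend position, and the attribute value, separated by tabs. The file name and all function names are hypothetical. For case (b) it assumes the third column is a feature set such as `|habe|gesagt|` and drops a hit as soon as any one of its values has reached the limit. The output starts with a row-count line, which I believe is what undump expects; please check this against the CQP documentation for your version.

```python
import random
from collections import Counter


def split_feature_set(value):
    """Split a feature-set value such as '|habe|gesagt|' into its members."""
    return [v for v in value.split("|") if v]


def read_tabulation(lines):
    """Parse tab-separated lines of the form: match, matchend, key."""
    rows = []
    for line in lines:
        match, matchend, key = line.rstrip("\n").split("\t")
        rows.append((int(match), int(matchend), key))
    return rows


def cap_occurrences(rows, limit=4, multi_valued=False, seed=None):
    """Return a random subset of hits in which no key occurs more than `limit` times.

    With multi_valued=True the key is treated as a feature set, and a hit is
    skipped if any of its values has already been kept `limit` times.
    Hits with an empty feature set ('||') are never skipped.
    """
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)  # random order, so the hits kept per key are a random choice
    counts = Counter()
    kept = []
    for match, matchend, key in shuffled:
        keys = split_feature_set(key) if multi_valued else [key]
        if any(counts[k] >= limit for k in keys):
            continue
        counts.update(keys)
        kept.append((match, matchend))
    kept.sort()  # restore corpus order
    return kept


def format_undump(kept):
    """Render the kept hits as text for undump (assumed format: row count first)."""
    lines = [str(len(kept))]
    lines += [f"{match}\t{matchend}" for match, matchend in kept]
    return "\n".join(lines) + "\n"
```

One would read hits.tsv with read_tabulation, pass the rows through cap_occurrences, write the result of format_undump to a second file, and load that file back into CQP with undump before running cat. If a fixed overall sample size is still wanted, reduce can be applied to the undumped query afterwards.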
best
Andrew.
-----Original Message-----
From: cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it] On Behalf Of Ruprecht von Waldenfels
Sent: 26 February 2016 23:58
To: Open source development of the Corpus WorkBench
Subject: [CWB] Impossible query
Dear all,
I just posted the following question on StackOverflow, see
http://stackoverflow.com/questions/35663861/restrict-results-from-corpusworkbench-cwb-to-up-to-n-occurrences-of-an-attribu
Say, I have a corpus encoded in CWB with word, lemma, and aligned word
information, such as in
I      I      |Ich|
told   tell   |habe|gesagt|
them   they   |sie|
to     to     ||
leave  leave  |gehen|
Note that in the third column, alternate values are possible.
Now presuming I want a random sample of the occurrences of words with a
lemma starting with "l", I would go:
A=[lemma="l.*"]; reduce A to 1000; cat A;
This will give me a random sample with very different frequencies for
each lemma; e.g., the lemma "leave" might be contained 20 times.
Here comes my problem: (a) what can I do if I want the random sample to
contain a maximum number of 4 occurrences for each lemma? (b) what if I
want the random sample to contain a maximum of 4 occurrences of any
translation in column 3?
I suspect this is not possible in CWB, but I may be wrong; also, it may
be possible using a combination of R and CWB.
I would greatly appreciate any help. I posted it on StackOverflow
because I thought that would be a better venue for this kind
of question, but the community I am addressing is presumably
on this list rather than on StackOverflow!
Best,
Ruprecht
_______________________________________________
CWB mailing list
CWB at sslmit.unibo.it
http://devel.sslmit.unibo.it/mailman/listinfo/cwb