[CWB] Re: Querying more than one corpus at a time

Gertrud Faasz faaszgd at ims.uni-stuttgart.de
Wed Nov 24 13:31:13 CET 2010


Dear Alberto Simões,

Maybe my approach is too simple; I want to stress that I would like to
learn from other users, too.

For my own work, I use the Perl interface in the following way:

1. I write the respective query into a macro.
2. I run a Perl program that applies the macro to all the corpora, one
after the other (these are defined in an array, i.e. an ordered list).
Query results are written into an appropriate structure; for frequency
data, e.g., the program can create a hash (an unordered mapping with
keys = word/lemma and values = frequency).
3. As the Perl program knows which corpus is being queried at any given
time, this information can be stored with the data, too (e.g. hash
key1 = corpus, key2 = word/lemma, value = frequency).
4. After everything has been processed, the stored data is written to a
file, printed to the screen, or processed further by other tools.
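
To make steps 2 to 4 more concrete, here is a minimal sketch of the
bookkeeping only. It is not the program offered in this message. The
names run_query, collect_frequencies and format_report, the macro name
my_macro, the corpus names and the canned counts are all hypothetical.
run_query is a stand-in that returns fixed (word, frequency) pairs so
that the sketch runs without CWB; in real use it would be replaced by a
call through the CWB Perl interface.

```perl
use strict;
use warnings;

# Hypothetical canned results, standing in for what a real query
# against each corpus would return: pairs of (word, frequency).
my %FAKE_RESULTS = (
    CORPUS_A => [ [ 'house', 3 ], [ 'tree',  1 ] ],
    CORPUS_B => [ [ 'house', 2 ], [ 'river', 5 ] ],
);

# Stand-in for the query step (assumption: the real version would send
# the macro to CQP for the given corpus and parse the returned rows).
sub run_query {
    my ($corpus, $macro) = @_;
    return @{ $FAKE_RESULTS{$corpus} || [] };
}

# Steps 2 and 3: loop over the corpora in array order and store the
# counts in a nested hash: key1 = corpus, key2 = word, value = frequency.
sub collect_frequencies {
    my ($macro, $runner, @corpora) = @_;
    my %freq;
    for my $corpus (@corpora) {
        for my $row ($runner->($corpus, $macro)) {
            my ($word, $count) = @$row;
            $freq{$corpus}{$word} += $count;
        }
    }
    return \%freq;
}

# Step 4: turn the stored data into tab-separated lines. Hash keys are
# unordered, so they are sorted here to get a stable output.
sub format_report {
    my ($freq) = @_;
    my @lines;
    for my $corpus (sort keys %$freq) {
        for my $word (sort keys %{ $freq->{$corpus} }) {
            push @lines, join("\t", $corpus, $word, $freq->{$corpus}{$word});
        }
    }
    return @lines;
}

my @corpora = ('CORPUS_A', 'CORPUS_B');
my $freq = collect_frequencies('my_macro', \&run_query, @corpora);
print "$_\n" for format_report($freq);
```

Passing the query step in as a code reference keeps the storage logic
independent of how the corpora are actually queried, so only run_query
would need to change for a different kind of query result.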

If you would like to do it the same way, just let me know; I can provide
such a basic Perl program and an example macro. It is simple to use,
even for non-programmers, but it certainly cannot handle all kinds of
queries: the way the data is to be stored depends on the kinds of
queries you want to process.

N.B. A Perl interpreter and the CWB Perl interface must be installed on
your computer first.

Kind regards
Gertrud


