[CWB] Tabulate and <s>
Stefan Evert
stefanML at collocations.de
Thu May 7 20:15:09 CEST 2015
> On 7 May 2015, at 18:47, Maarten Janssen <maartenpt at gmail.com> wrote:
>
> Dear all,
>
> in many circumstances, tabulate is much more manageable than cat. However, I have seen no way to replicate this very useful feature in tabulate:
>
> set Context s;
>
> Does anyone know of any way to give a tabulate command that gives the data for an entire sentence containing match?
I recently posted the following recipe, which seems to be more or less what you're looking for (especially if you manage to solve the exercise for the reader):
> This sounds like you're searching for a single word. In order to obtain sentence contexts, you can use the following trick:
>
> set A target match;
> B = A expand to s;
> tabulate B match .. matchend word, target word, target lemma;
>
> The result is a TAB-delimited file with three columns where the first column contains the complete sentence context, with individual tokens separated by blanks. If you want separate left and right context:
>
> tabulate B match .. target[-1] word, target[1] .. matchend word, target word, target lemma;
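For anyone post-processing the result of the command above, here is a minimal Python sketch that splits one output line into its four TAB-separated columns (left context, right context, target word, target lemma). The names `ContextRow` and `parse_tabulate_line` and the sample line are made up for illustration; the sketch assumes each match is printed on one line with tokens separated by blanks, as described above.

```python
from typing import List, NamedTuple


class ContextRow(NamedTuple):
    """One line of the four-column tabulate output (hypothetical helper type)."""
    left: List[str]   # tokens before the target
    right: List[str]  # tokens after the target
    word: str         # target word form
    lemma: str        # target lemma


def parse_tabulate_line(line: str) -> ContextRow:
    """Split a TAB-delimited line into context token lists and target fields."""
    left, right, word, lemma = line.rstrip("\n").split("\t")
    # str.split() with no argument splits on blanks and yields [] for an
    # empty column.
    return ContextRow(left.split(), right.split(), word, lemma)


# Invented sample line, not real corpus output.
sample = "the quick brown\tjumps over the dog\tfox\tfox\n"
row = parse_tabulate_line(sample)
print(row.left, row.word, row.right)
```

Keeping the contexts as token lists rather than strings makes it easy to trim them to a fixed window later.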
>
> As Andrew has already pointed out, if you want a fixed number of context words, simply specify the desired offsets. Again, you can put the entire context in a single column by specifying a range with .. (mnemonic for tabulate: every "," in the command gives you a TAB in the output).
>
> tabulate A match[-5] .. match[-1] word, match[1] .. match[5] word, match word, match lemma;
>
> Exercise for the reader: modify this command so it works with matches of variable length.
>
> If you really want formatted KWIC output like that of "cat", simply "cat" the KWIC lines into a text file, "tabulate" the other information into a second TAB-delimited file, then combine them with the "paste" program.
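Where the "paste" program is not at hand, the same line-by-line combination can be sketched in a few lines of Python. This is only an illustration: the function name `paste` and the sample lines are hypothetical, and it assumes both files contain exactly one line per match, in the same order.

```python
from itertools import zip_longest
from typing import Iterable, Iterator


def paste(left_lines: Iterable[str], right_lines: Iterable[str],
          sep: str = "\t") -> Iterator[str]:
    """Join corresponding lines of two inputs with a separator, like paste(1)."""
    # zip_longest pads the shorter input with empty strings, so no line
    # of the longer input is dropped.
    for a, b in zip_longest(left_lines, right_lines, fillvalue=""):
        yield a.rstrip("\n") + sep + b.rstrip("\n")


# Invented stand-ins for the "cat" output and the "tabulate" output.
kwic_lines = ["... the quick brown <fox> jumps over ...\n"]
tab_lines = ["fox\tfox\n"]
for combined in paste(kwic_lines, tab_lines):
    print(combined)
```

With real data, the two arguments can simply be open file objects, since iterating over a file yields its lines.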
Cheers,
Stefan