[CWB] Tabulate and <s>

Stefan Evert stefanML at collocations.de
Thu May 7 20:15:09 CEST 2015


> On 7 May 2015, at 18:47, Maarten Janssen <maartenpt at gmail.com> wrote:
> 
> Dear all,
> 
> in many circumstances, tabulate is much more manageable than cat. However, I have seen no way to replicate this very useful feature in tabulate:
> 
> set Context s;
> 
> Does anyone know of any way to give a tabulate command that gives the data for an entire sentence containing match?


I recently posted the following recipe, which seems to be more or less what you're looking for (especially if you manage to solve the exercise for the reader):

> This sounds like you're searching for a single word.  In order to obtain sentence contexts, you can use the following trick:
> 
> 	set A target match;
> 	B = A expand to s;
> 	tabulate B match .. matchend word, target word, target lemma;
> 
> The result is a TAB-delimited file with three columns where the first column contains the complete sentence context, with individual tokens separated by blanks.  If you want separate left and right context:
> 
> 	tabulate B match .. target[-1] word, target[1] .. matchend word, target word, target lemma;
> 
> As Andrew has already pointed out, if you want a fixed number of context words, simply specify the desired offsets.  Again, you can put the entire context in a single column by specifying a range with .. (mnemonic for tabulate: every "," in the command gives you a TAB in the output).
> 
> 	tabulate A match[-5] .. match[-1] word, match[1] .. match[5] word, match word, match lemma;
> 
> Exercise for the reader: modify this command so it works with matches of variable length.
> 
> If you really wanted formatted kwic output as from "cat", simply "cat" the kwic lines into a text file, "tabulate" the other information into a second TAB-delimited file, then combine them with the "paste" program.
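
A minimal sketch of that last "paste" step, in case it helps. The file names and sample lines below are made up for illustration; in practice the first file would hold the kwic lines saved from "cat" and the second the "tabulate" output for the same named query, so that both files have one line per match in the same order.

```shell
# Hypothetical stand-ins for the two files written from CQP
# (kwic.txt from "cat", info.tsv from "tabulate"); the contents are invented.
tmp=$(mktemp -d)
printf 'the cat <sat> on the mat\na dog <ran> away\n' > "$tmp/kwic.txt"
printf 'sat\tsit\nran\trun\n' > "$tmp/info.tsv"

# paste joins line N of each input file with a TAB, so every kwic line
# ends up next to the tabulated columns for the same match.
paste "$tmp/kwic.txt" "$tmp/info.tsv" > "$tmp/combined.tsv"
cat "$tmp/combined.tsv"
```

This only lines up correctly if both files come from the same query result without re-sorting in between, since paste matches lines purely by position.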

Cheers,
Stefan


