[CWB] CQP command to query by corpus position

Stephen Barrett Stephen.Barrett at glasgow.ac.uk
Thu Nov 23 13:09:45 CET 2017


Hi Stefan.

That's enormously helpful! I experimented with each of the solutions you suggested and found that the third option - writing to a temp file - is ideal for our purposes.

Many thanks indeed.

Stevie


-----Original Message-----
From: cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it] On Behalf Of Stefan Evert
Sent: 21 November 2017 16:13
To: CWBdev Mailing List <cwb at sslmit.unibo.it>
Subject: Re: [CWB] CQP command to query by corpus position


> Can anyone tell me whether there is a CQP command I can use to get a token at a specific corpus position? If I know from a table of CQP results, for example, that a given token has a cpos of 2103, is there a command to retrieve that token using that integer?

By far the best solution is to access the indexed data directly through the C-level API (e.g. with the CWB::CL Perl module) or the CQi network interface (if you have a lot of time to write your own client library ;-).

If you really need to do it via CQP, there are three possibilities:

a) You don't care about efficiency and wasting huge amounts of memory:

	[word = ".*" & _ = 2103];

b) You do care and are working interactively (cqp -e) in the terminal:

	undump A;
	1
	2103	2103
	cat A;

(where you have to type a TAB character between the two 2103's).

c) You want to do the same programmatically:

	1. write a temp file, e.g. /tmp/A.txt, containing the line "2103\t2103" (where \t stands for a TAB character);

	2. in CQP:
		undump A < "/tmp/A.txt";
		cat A;
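
As a rough illustration of option (c), here is a minimal Python sketch that writes the temp file and assembles the CQP commands to send. The function names, the corpus name MYCORPUS and the default query name A are placeholders chosen for this example, not part of CWB. It also assumes the temp file path contains no characters that would need escaping inside a double-quoted CQP string.

```python
import os
import tempfile


def write_undump_file(cpos_list):
    """Write one "match<TAB>matchend" line per corpus position; return the file path."""
    fd, path = tempfile.mkstemp(suffix=".txt", text=True)
    with os.fdopen(fd, "w") as handle:
        # Sorting is only for predictable output in this sketch.
        for cpos in sorted(int(c) for c in cpos_list):
            # Match start and match end are identical: a one-token range.
            handle.write(f"{cpos}\t{cpos}\n")
    return path


def build_cqp_commands(corpus, path, name="A"):
    """Return the CQP commands that load the file as a named query and print it."""
    return [
        f"{corpus};",                  # activate the corpus (hypothetical name)
        f'undump {name} < "{path}";',  # load the ranges as named query `name`
        f"cat {name};",                # print the resulting matches
    ]


if __name__ == "__main__":
    path = write_undump_file([2103])
    for command in build_cqp_commands("MYCORPUS", path):
        print(command)
```

The printed commands can then be passed to whatever CQP session or wrapper you already use. The temp file should be deleted once CQP has read it.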

Hope this helps
Stefan
_______________________________________________
CWB mailing list
CWB at sslmit.unibo.it
http://liste.sslmit.unibo.it/mailman/listinfo/cwb

