[CWB] Assigning cqp queries

Hardie, Andrew a.hardie at lancaster.ac.uk
Wed Dec 16 12:43:34 CET 2015


File reads across the board use fgets() on files opened in text mode (or on text-mode pipes), so that the C library translates CRLF to LF where necessary; i.e. we assume CQP will always be accessing LF files if it's running on Unix and CRLF files if it's running on Windows.

So, if *any* file input from a CRLF-file on Unix works, it does so purely because some accidental factor results in the unwanted CR being discarded.
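To make that concrete: on Unix, text mode does no translation, so a line read from a CRLF file with fgets() ends in "\r\n" and the CR stays in the buffer. A minimal sketch of one way a reader could normalise this is below. strip_line_ending() is a hypothetical helper name, not an existing CWB function, and I am not claiming this is how CQP should be patched.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of CWB): remove a trailing LF, CR LF,
 * or lone CR from a buffer that was filled by fgets().
 * Modifies the buffer in place and returns the new string length. */
size_t strip_line_ending(char *line)
{
    size_t len = strlen(line);

    /* fgets() keeps the newline if the whole line fitted in the buffer */
    if (len > 0 && line[len - 1] == '\n')
        line[--len] = '\0';

    /* on Unix a CRLF file leaves the CR behind, just before the LF */
    if (len > 0 && line[len - 1] == '\r')
        line[--len] = '\0';

    return len;
}
```

Calling this on each buffer straight after fgets() would make LF and CRLF input look the same to whatever parses the line next.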

Undump probably works because it reads by line and then uses this sscanf pattern:

 sscanf(line, "%d %d %d %d %s", &match, &matchend, &target, &keyword, junk)

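If there were a stray CR in there, sscanf would skip it as trailing whitespace (CR counts as whitespace for both %d and %s, so it never actually reaches the final junk variable), and it would make no difference to anything.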
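A quick way to check this is to run the same pattern over a line with and without the CR and compare how many fields get converted. The sketch below is illustrative only: count_fields() is a made-up name, the junk buffer size is arbitrary, and I have added a width limit to %s (which the quoted pattern does not have) so the example cannot overflow.

```c
#include <stdio.h>

/* Illustrative only (not CWB code): apply the undump-style pattern to
 * one line and return the number of fields sscanf() converted. */
int count_fields(const char *line)
{
    int match, matchend, target, keyword;
    char junk[128];

    /* Spaces in the format and the %d / %s conversions all skip any
     * whitespace, and '\r' is whitespace, so a trailing CR is ignored. */
    return sscanf(line, "%d %d %d %d %127s",
                  &match, &matchend, &target, &keyword, junk);
}
```

With this, "10 20 -1 -1\n" and "10 20 -1 -1\r\n" give the same count, which fits undump being unaffected by Windows line breaks.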

best

Andrew.

-----Original Message-----
From: cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it] On Behalf Of Stefan Evert
Sent: 16 December 2015 08:35
To: CWBdev Mailing List
Subject: Re: [CWB] Assigning cqp queries


> On 16 Dec 2015, at 08:54, Trklja, Alex <A.Trklja at exeter.ac.uk> wrote:
> 
> Yes, the problem was with line endings - dos2unix did the trick. Thank you so much for your help.

Hm, have you had problems with any other file inputs? E.g. "undump" or when reading a CQP script with cqp -f ?

If this is the only place where windows line breaks lead to problems, we should probably try to fix it.

Best,
Stefan
_______________________________________________
CWB mailing list
CWB at sslmit.unibo.it
http://devel.sslmit.unibo.it/mailman/listinfo/cwb

