[CWB] [cwb:bugs] #61 CQP kwic formatting cuts off long lines
andrewhardie at users.sf.net
Mon Jun 16 01:36:52 CEST 2014
- **status**: open --> closed-fixed
** [bugs:#61] CQP kwic formatting cuts off long lines**
**Created:** Mon Jun 09, 2014 02:46 PM UTC by Stefan Evert
**Last Updated:** Mon Jun 09, 2014 02:46 PM UTC
**Owner:** Stefan Evert
Kwic formatting in CQP ("cat" command) cuts off long lines after MAXKWICLINELEN bytes on each side of the match. This typically happens when context is set to a large text region (which may also have been created unintentionally through a markup/annotation error). CQP used to segfault in such cases; this has recently been changed to the more benign truncation behaviour. The current value of MAXKWICLINELEN is 64 KiB.
This issue will eventually be fixed by the long-awaited overhaul of the kwic formatting code. In the meantime, problems that are still encountered by some users could be alleviated with the following patch:
- don't allocate char line[MAXKWICLINELEN + 1]; and char token[MAXKWICLINELEN + 1]; on the stack; make them dynamically allocated global buffers instead (compose_kwic_line() isn't reentrant anyway)
- replace the #define MAXKWICLINELEN by a CQP option that can be controlled with "set kwiclinelen ..."; when the option is set, the global buffers are automatically reallocated
- the option is initialised with MAXKWICLINELEN, and initial buffers of this size are allocated during startup
Power users will then be able to increase the kwic line length to an almost unlimited size.
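The three steps of the proposed patch could look roughly like the following C sketch. It is only an illustration of the idea, not the actual CWB code: the names `kwic_line_buf`, `kwic_token_buf`, `set_kwic_line_len()`, `get_kwic_line_len()`, `init_kwic_buffers()` and `DEFAULT_KWIC_LINE_LEN` are hypothetical, and how the setter would be hooked into CQP's option handling for "set kwiclinelen ..." is left open.

```c
#include <stdlib.h>

/* Hypothetical names throughout; none of these are taken from the CWB sources. */

/* Default limit: 64 KiB, the current value of MAXKWICLINELEN. */
#define DEFAULT_KWIC_LINE_LEN ((size_t)64 * 1024)

/* Global buffers replacing the stack arrays line[] and token[]. */
static size_t kwic_line_len = 0;
static char *kwic_line_buf = NULL;
static char *kwic_token_buf = NULL;

/* Would be called when the (assumed) "kwiclinelen" option is set.
 * Allocates new buffers of new_len + 1 bytes each and swaps them in.
 * Returns 1 on success; on failure returns 0 and keeps the old buffers. */
int set_kwic_line_len(size_t new_len)
{
  char *line, *token;

  /* Reject a zero length and the value that would overflow new_len + 1. */
  if (new_len == 0 || new_len == (size_t)-1)
    return 0;

  /* Buffer contents need not survive a resize, so plain malloc is enough. */
  line = malloc(new_len + 1);
  token = malloc(new_len + 1);
  if (line == NULL || token == NULL) {
    free(line);
    free(token);
    return 0;
  }

  free(kwic_line_buf);
  free(kwic_token_buf);
  kwic_line_buf = line;
  kwic_token_buf = token;
  kwic_line_buf[0] = '\0';
  kwic_token_buf[0] = '\0';
  kwic_line_len = new_len;
  return 1;
}

/* Current limit in bytes (per side of the match). */
size_t get_kwic_line_len(void)
{
  return kwic_line_len;
}

/* Would be called once during startup: allocate buffers of the default size. */
int init_kwic_buffers(void)
{
  return set_kwic_line_len(DEFAULT_KWIC_LINE_LEN);
}
```

In this sketch a failed allocation leaves the previous buffers and limit untouched, so a user who asks for an unreasonably large value would simply keep the old setting rather than crash the session.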