[CWB] [ cwb-Bugs-3058717 ] cl_string_canonical: risk of buffer overflow
SourceForge.net
noreply at sourceforge.net
Fri Sep 3 13:01:12 CEST 2010
Bugs item #3058717, was opened at 2010-09-03 11:00
Message generated for change (Settings changed) made by andrewhardie
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=722303&aid=3058717&group_id=131809
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
>Priority: 7
Private: No
Submitted By: Andrew Hardie (andrewhardie)
>Assigned to: Andrew Hardie (andrewhardie)
Summary: cl_string_canonical: risk of buffer overflow
Initial Comment:
cl_string_canonical currently modifies strings in situ. It would be more convenient for it to always return a newly allocated string unless specifically instructed to work in place.
char *
cl_string_canonical(char *s, CorpusCharset charset, int flags, size_t inplace_bufsize)
If inplace_bufsize == 0, a newly allocated string is returned. (Since size_t is unsigned, the value cannot be negative.)
If inplace_bufsize > 0, s is modified in place up to a maximum size of inplace_bufsize-1 characters (plus NUL terminator). If the normalised string doesn't fit into the buffer, the extra characters are dropped silently. For UTF-8 strings, the result allocated by GLib is copied to s (dropping characters that don't fit) and then freed, as in the current implementation.
This will break backwards compatibility of the CL.
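The two modes proposed above could be sketched roughly as follows. This is only an illustration, not the CL implementation: canonical_sketch and normalise_copy are hypothetical names, the charset and flags parameters are left out, and ASCII lower-casing stands in for the real normalisation step.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the real normalisation: ASCII lower-casing only.
 * Returns a freshly allocated copy, or NULL if allocation fails. */
static char *normalise_copy(const char *s)
{
    size_t len = strlen(s);
    char *out = malloc(len + 1);
    if (out == NULL)
        return NULL;
    for (size_t i = 0; i < len; i++)
        out[i] = (char)tolower((unsigned char)s[i]);
    out[len] = '\0';
    return out;
}

/* Sketch of the proposed contract (assumed, simplified signature). */
char *canonical_sketch(char *s, size_t inplace_bufsize)
{
    char *norm = normalise_copy(s);
    if (norm == NULL)
        return NULL;

    /* inplace_bufsize == 0: hand back the new string; the caller must free() it. */
    if (inplace_bufsize == 0)
        return norm;

    /* inplace_bufsize > 0: copy at most inplace_bufsize-1 bytes back into s,
     * silently dropping the rest, and always write the NUL terminator. */
    size_t n = strlen(norm);
    if (n > inplace_bufsize - 1)
        n = inplace_bufsize - 1;
    memcpy(s, norm, n);
    s[n] = '\0';
    free(norm);
    return s;
}
```

Note that the sketch truncates by bytes. For UTF-8 input a cut at a byte boundary can split a multi-byte character, so a real implementation would presumably need to back up to the start of the last complete character before writing the terminator.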
----------------------------------------------------------------------