[CWB] reserved words in CWB

Stefan Evert stefanML at collocations.de
Fri Sep 11 15:07:40 CEST 2015


> 
> On 10 Sep 2015, at 20:26, Ruprecht von Waldenfels <ruprecht.waldenfels at gmx.net> wrote:
> 
> in my setup, I unfortunately use the word "by" as the name of an attribute (similar to "word", "lemma", etc.). Unfortunately, if I make a query
> [lemma="xwz" & by contains "ysd"]
> I get an error message, because by is a reserved word.
> 
> The use of "by" (it stands for "Belarusian")  follows automatically from the architecture of my corpus, and I would have to change it in a lot of places. Is there any way I can continue using it by escaping it somehow?

Unfortunately, there's no workaround.  As in most programming languages, identifiers (such as attribute names) cannot be identical to reserved words.

The solution, of course, is to rename your attributes.  If you automatically generate attribute names, use a special prefix or suffix to make sure there are no collisions (hint: reserved words never contain underscores).
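
A minimal sketch of the prefix idea in Python, in case it helps.  The function name, the "lang" prefix and the list of language codes are made up for illustration and are not part of CWB.  The only thing it relies on is the hint above that reserved words never contain underscores, so any generated name containing one cannot collide.

```python
def safe_attribute_name(code: str, prefix: str = "lang") -> str:
    """Return an attribute name that always contains an underscore.

    Assumption (taken from the hint above): reserved words never
    contain underscores, so a name with one cannot collide with them.
    The prefix "lang" is a hypothetical choice.
    """
    if not code:
        raise ValueError("empty attribute code")
    return f"{prefix}_{code.lower()}"


# Hypothetical language codes; "by" is the one from this thread.
codes = ["by", "ru", "pl"]
attributes = {code: safe_attribute_name(code) for code in codes}

# The query from the original post, using the renamed attribute.
query = f'[lemma="xwz" & {attributes["by"]} contains "ysd"]'
print(query)
```

Applying the prefix to every generated name, rather than only to the ones that happen to collide, keeps the naming scheme uniform and means you don't need to maintain a list of reserved words yourself.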

Have other people had similar issues?  If there's popular demand, we might be able to implement "escapes" for attribute names in the final 3.5 release.  We'd probably use R-style notation with backticks, i.e. in your example

	[lemma="xwz" & `by` contains "ysd"]

(The ` operator is defined in the CQP grammar, but currently unused.)

Best,
Stefan


