[CWB] problems encoding xml header data

Stefan Evert stefanML at collocations.de
Mon May 5 14:23:44 CEST 2014


On 5 May 2014, at 11:45, Gertrud Faaß <faassg at uni-hildesheim.de> wrote:

> A typical entry looks as follows: <beitrag jahr="9999" land="USA C1USA" datum="99999999" ressort="Reise" ausgabe="99" seite="99" stichwort="Lufthansa/Streik" titel="Flug LH 99: gestrichen" unternehmen="" kateg1="Urlaub" quelle="zeitung 99/9999 vom 99.99.9999" untertitel="" kateg2="" autor="nachname, vorname"/>

Note that this is not a valid _open_ tag, but an empty XML element (< ... />).

CWB doesn't support empty elements, only regular open/close tags that are transformed into structural annotations.  I'm not sure how strict the XML tag parser is and whether it would throw a syntax error in this case.
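
One way to check the input for such tags before encoding is sketched below. This is only an illustration, not part of CWB: it assumes one tag per line in the input file, and the names EMPTY_ELEMENT, find_empty_elements and to_open_tag are made up for this example. Turning an empty element into an open tag is only half of the fix; the matching close tag (e.g. </beitrag>) still has to be added wherever the region is supposed to end.

```python
import re

# Matches a line that consists solely of an empty XML element,
# e.g. <beitrag jahr="9999"/>. Group 1 is the tag name, group 2 the
# attribute string (possibly empty). Hypothetical helper, not a CWB API.
EMPTY_ELEMENT = re.compile(r'^\s*<([A-Za-z_][\w.-]*)((?:\s+[^<>]*?)?)\s*/>\s*$')


def find_empty_elements(lines):
    """Return (line_number, tag_name) for every line that is an empty element."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = EMPTY_ELEMENT.match(line)
        if match:
            hits.append((number, match.group(1)))
    return hits


def to_open_tag(line):
    """Rewrite <tag attrs/> as <tag attrs>; any other line is returned unchanged.

    Assumption: the caller adds the matching close tag separately.
    """
    match = EMPTY_ELEMENT.match(line)
    if not match:
        return line
    return "<" + match.group(1) + match.group(2) + ">"


if __name__ == "__main__":
    sample = [
        '<beitrag jahr="9999" stichwort="Lufthansa/Streik"/>',
        "Flug",
        "</beitrag>",
    ]
    for number, tag in find_empty_elements(sample):
        print(f"line {number}: empty element <{tag} .../>")
    print(to_open_tag(sample[0]))
```

Lines are expected without their trailing newline (e.g. after line.rstrip("\n")), since a rewritten line is returned without one.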

Hope this helps,
Stefan


