[CWB] Metadata format

Graham Ranger -- UAPV graham.ranger at univ-avignon.fr
Thu Mar 21 11:13:11 CET 2019


Hello to all,
I would like some help formatting metadata for a corpus.
I understand that the "text id" field has to use only ASCII alphanumeric 
characters plus the underscore. However, from my experiments, this 
constraint appears to apply to all fields.
And so, while the metadata for the BE 2006 corpus, on the cqpweb 
interface at Lancaster, appears as "Press, Entire text, A. Press: 
Reportage" I would only be able to display this sort of information as 
"Press, Entire_text", "A_Press_Reportage", etc. I have played with the 
"Free text" / "Classification" opposition, but that makes no difference. 
When my text is formatted with spaces or punctuation, it simply does not 
show in the metadata.
I'm doing this with a separate text file including metadata, but the 
other possibility, i.e. including metadata as attributes inside the 
corpus XML, has not proved any more satisfactory.
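
To make the constraint concrete, here is a minimal Python sketch of the workaround described above: it reduces a human-readable label to ASCII letters, digits and underscores. The function names to_handle and is_handle are hypothetical and are not part of CWB or CQPweb; the character set is simply the one stated in this message.

```python
import re

# Assumption: a valid value contains only ASCII letters, digits and
# the underscore, as described in this message.
_VALID = re.compile(r"[A-Za-z0-9_]+")
_INVALID_RUN = re.compile(r"[^A-Za-z0-9_]+")


def is_handle(value: str) -> bool:
    """Return True if the value already satisfies the constraint."""
    return _VALID.fullmatch(value) is not None


def to_handle(label: str) -> str:
    """Replace each run of disallowed characters with one underscore."""
    return _INVALID_RUN.sub("_", label).strip("_")


if __name__ == "__main__":
    for label in ["Entire text", "A. Press: Reportage"]:
        print(label, "->", to_handle(label))
```

This only reproduces the lossy conversion ("A. Press: Reportage" becomes "A_Press_Reportage"); it does not recover the readable label for display, which is the actual problem.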
Many thanks in advance for any help.
Best,
Graham.