[CWB] Metadata format
Graham Ranger -- UAPV
graham.ranger at univ-avignon.fr
Thu Mar 21 11:13:11 CET 2019
Hello to all,
I would like some help formatting metadata for a corpus.
I understand that the "text id" field may use only ASCII alphanumeric
characters plus the underscore. However, from my experiments, this
constraint appears to apply to all fields.
And so, while the metadata for the BE 2006 corpus, on the cqpweb
interface at Lancaster, appears as "Press, Entire text, A. Press:
Reportage" I would only be able to display this sort of information as
"Press, Entire_text", "A_Press_Reportage", etc. I have played with the
"Free text" vs. "Classification" distinction, but that makes no difference.
When my text is formatted with spaces or punctuation, it simply does not
show in the metadata.
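
To make the constraint concrete, here is a minimal Python sketch of the rule as I understand it. It is not CQPweb's own code: the names is_handle, to_handle and check_row are hypothetical, and the assumption that the metadata file is tab-separated is mine.

```python
import re

# Assumption: a valid value consists only of ASCII letters, digits and "_".
HANDLE = re.compile(r"[A-Za-z0-9_]+")


def is_handle(value):
    """Return True if the whole value is ASCII alphanumerics/underscores."""
    return HANDLE.fullmatch(value) is not None


def to_handle(value):
    """Replace each run of other characters with one underscore."""
    return re.sub(r"[^A-Za-z0-9_]+", "_", value).strip("_")


def check_row(line):
    """Return the fields of one tab-separated row that break the rule."""
    fields = line.rstrip("\n").split("\t")
    return [field for field in fields if not is_handle(field)]


if __name__ == "__main__":
    # Hypothetical row: text id, then two descriptive fields.
    row = "A01\tEntire text\tA. Press: Reportage\n"
    for bad in check_row(row):
        print(repr(bad), "->", to_handle(bad))
```

Run on the hypothetical row above, this flags the two descriptive fields and suggests the underscore forms, which are exactly what I am currently limited to displaying.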
I'm doing this with a separate text file containing the metadata, but the
other possibility, i.e. including the metadata as attributes inside the
corpus XML, has not proved any more satisfactory.
Many thanks in advance for any help.
Best,
Graham.