<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
Hello to all,<br>
I would like some help formatting metadata for a corpus.<br>
I understand that the "text id" field may use only ASCII
alphanumeric characters plus the underscore. However, from my
experiments, this constraint appears to apply to all fields.<br>
And so, while the metadata for the BE 2006 corpus, on the CQPweb
interface at Lancaster, appears as "Press, Entire text, A. Press:
Reportage" I would only be able to display this sort of information
as "Press, Entire_text", "A_Press_Reportage", etc. I have tried
switching between the "Free text" and "Classification" options, but
that makes no difference. When my text is formatted with spaces or
punctuation, it simply does not show in the metadata.<br>
I'm doing this with a separate text file containing the metadata, but
the other possibility, i.e. including the metadata as attributes inside
the corpus XML, has not proved any more satisfactory.<br>
Many thanks in advance for any help.<br>
Best,<br>
Graham.<br>
</body>
</html>