[CWB] Issues when installing metadata restrictions

Hardie, Andrew a.hardie at lancaster.ac.uk
Tue Apr 5 16:17:54 CEST 2016


Hi Giorgina,

Category codes in classification fields can only contain letters and numbers. I see spaces in some of the values in your XML, e.g. "Monitoring and application".

Might that explain the problem?
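
A quick way to look for such values before indexing is sketched below. It is only an illustration: it assumes that "letters and numbers" means ASCII letters and digits, and the names `invalid_values`, `VALID_CODE` and `ATTRIBUTE` are made up for this example rather than taken from CWB or CQPweb. It also applies the same rule to every attribute, including `id`, which may follow a slightly different rule.

```python
import re

# Assumption: a valid category code is ASCII letters and digits only.
VALID_CODE = re.compile(r"[A-Za-z0-9]+")
# Matches name="value" pairs inside a start tag.
ATTRIBUTE = re.compile(r'(\w+)="([^"]*)"')

def invalid_values(tag_line):
    """Return (attribute, value) pairs whose value is not purely alphanumeric."""
    return [(name, value)
            for name, value in ATTRIBUTE.findall(tag_line)
            if not VALID_CODE.fullmatch(value)]

line = ('<text id="test13" period="1" organization="un" '
        'category="Monitoring and application" genre="legislative" lang="French"')
print(invalid_values(line))
```

Run on the first tag from the sample corpus, this should report only the `category` value, because of the spaces in it.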

best

Andrew.

-----Original Message-----
From: cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it] On Behalf Of Giorgina Cerutti Benitez
Sent: 05 April 2016 10:39
To: Open source development of the Corpus WorkBench
Subject: Re: [CWB] Issues when installing metadata restrictions

Hi Matt,

Yes, you're right. The thing is that I have also tested with other minimal corpora that appear to be well formed, and I still had the same issues.

Thank you again.

Regards,

Giorgina

-----Original Message-----
From: cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it] On Behalf Of Timperley, Matt
Sent: Tuesday 5 April 2016 11:14
To: Open source development of the Corpus WorkBench <cwb at sslmit.unibo.it>
Subject: Re: [CWB] Issues when installing metadata restrictions

Hi Giorgina,

Sorry if I'm mistaken about your issue, but it looks to me like there is an angle bracket missing from the end of the first line, just after lang="French". I think it should be: lang="French">.
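
A check of this kind can be automated with a few lines. The sketch below is only an illustration, not part of CWB: the function name `unclosed_tag_lines` is hypothetical, the sample is an abbreviated version of the posted corpus, and it assumes each XML tag sits on a single line of the input file.

```python
def unclosed_tag_lines(lines):
    """Return 1-based numbers of lines that open a tag but contain no '>'."""
    return [number
            for number, line in enumerate(lines, start=1)
            if line.lstrip().startswith("<") and ">" not in line]

# Abbreviated sample: the first tag lacks its closing angle bracket.
sample = [
    '<text id="test13" lang="French"',
    "this",
    "</text>",
    '<text id="test25" lang="Spanish"> this is also a test .',
]
print(unclosed_tag_lines(sample))
```

For this sample only line 1 should be flagged.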

I hope this helps,
Matt
________________________________________
From: cwb-bounces at sslmit.unibo.it [cwb-bounces at sslmit.unibo.it] on behalf of Giorgina Cerutti Benitez [Giorgina.Cerutti at unige.ch]
Sent: 05 April 2016 09:49
To: cwb at sslmit.unibo.it
Subject: [CWB] Issues when installing metadata restrictions

Hello everyone,

I am writing to you because we are having issues when installing our metadata classifications. We are currently testing metadata installation with corpus T39 (figure 1). Even though we manage to specify our s-attributes (see figure 2), only three of them are recognized as classifications when installing the metadata from the embedded XML (see figure 3); and in other tests none of them is recognized at all (see figure 4).

Figure 1:

<text id="test13" period="1" organization="un" category="Monitoring and application" genre="legislative" lang="French"
this
is
a
test
</text>
<text id="test25" period="2" organization="eu" category="Lawmaking" genre="monitoring" lang="Spanish"> this is also a test .
thas
worked
</text>
<text id="test26" period="3" organization="wto" category="Adjudication" genre="adjudication" lang="English"> thas thus shalala muajajaja </text>

Figure 2

Figure 3

Figure 4

We have then tried to install metadata by specifying the desired settings by hand (see figure 5), but we encounter an error (see figure 6).

Figure 5

Figure 6

The data source you specified for the text metadata contains badly-formatted text ID codes, as follows: <strong> '.'; '</text>'; '<text id="test13" period="1" organization="un" category="Monitoring and application" genre="legislative" lang="French"'; '<text id="test25" period="2" organization="eu" category="Lawmaking" genre="monitoring" lang="Spanish">'; '<text id="test26" period="3" organization="wto" category="Adjudication" genre="adjudication" lang="English">';</strong> (text ids can only contain unaccented letters, numbers, and underscore).
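
The rule stated at the end of that message (unaccented letters, numbers, and underscore) can be expressed as a regular expression. The following is a minimal sketch under that reading; `TEXT_ID` and `bad_text_ids` are hypothetical names chosen for this example, and the real check inside the software may differ.

```python
import re

# Rule quoted in the error message: unaccented letters, numbers, and underscore.
TEXT_ID = re.compile(r"[A-Za-z0-9_]+")

def bad_text_ids(candidates):
    """Return the candidate strings that are not well-formed text IDs."""
    return [c for c in candidates if not TEXT_ID.fullmatch(c)]

print(bad_text_ids(["test13", "test_25", ".", "</text>"]))
```

Strings such as "." or "</text>" fail the pattern, which matches the kind of entries listed in the error: whole lines of the file were apparently being read as text IDs.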

Since we cannot identify the error, we were wondering whether any of you has had the same problem (I couldn't find any thread or information in the manual about this). I would also be grateful if you could tell us whether this is a bug or whether the system only accepts three classifications.

Thank you very much.

Regards,


Giorgina Cerutti
Assistant
Department of Translation - Spanish Unit
Faculty of Translation and Interpreting
University of Geneva
Office 6242 - Uni Mail
40 bd du Pont d'Arve
CH-1211 Genève 4

