[CWB] problem at managing corpus metadata

Andres Chandia andres at chandia.net
Sat Dec 28 18:51:05 CET 2013



Yes, this is what I see in the registry:

##
## s-attributes (structural markup)
##

# <name> ... </name>
STRUCTURE name

# <lang> ... </lang>
STRUCTURE lang

# <season> ... </season>
STRUCTURE season

# <text id=".."> ... </text>
# (no recursive embedding allowed)
STRUCTURE text
STRUCTURE text_id              # [annotations]

I didn't know whether I should add the "id" attribute as well...
Anyway, I still get:

    My metadata is embedded in the XML of my corpus!

    No XML annotations found for this corpus.

Thanks

On Sat, December 28, 2013 18:06, Hardie, Andrew wrote:

That function reads the list of attributes straight from the registry file.
Can you please check in the registry file that the s-attributes really have
been indexed as per your expectations?

Thanks,

Andrew.
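
The registry check requested above could be scripted roughly as follows. This
is only a minimal sketch: it assumes that s-attributes appear in the registry
file as "STRUCTURE <name>" lines with optional "#" comments, as in the excerpt
quoted in this thread, and the function name list_s_attributes is made up for
illustration (it is not part of CWB or CQPweb).

```python
import re

def list_s_attributes(registry_text):
    """Return the names declared on STRUCTURE lines of a registry file."""
    names = []
    for line in registry_text.splitlines():
        # Drop trailing comments such as "# [annotations]".
        line = line.split("#", 1)[0].strip()
        match = re.match(r"STRUCTURE\s+(\S+)$", line)
        if match:
            names.append(match.group(1))
    return names

# Sample taken from the registry excerpt in this thread.
sample = """\
# <text id=".."> ... </text>
# (no recursive embedding allowed)
STRUCTURE text
STRUCTURE text_id              # [annotations]
"""

names = list_s_attributes(sample)
# Attributes of the <text> element show up with a "text_" prefix.
text_annotations = [n for n in names if n.startswith("text_")]
print(names)
print(text_annotations)
```

If the second list comes back empty, no attribute of the text element has
been indexed, which may be one reason for a "No XML annotations found"
message.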

 

 
 
Andres Chandia wrote:

OK, I managed it using A, but for B this is what I get:

    My metadata is embedded in the XML of my corpus!

    No XML annotations found for this corpus.


Hi Andres,

What you are doing wrong, it would seem, is re-using the source data file as
the input file for the metadata. This is not how it works.

There are two ways to add metadata:

A. From a tab-delimited table file, with one text per line, and fields in
   columns. This is the function you are using, and it is resulting in an
   error message because you are feeding back in the original vertical file,
   which is not in the expected format.

B. From XML in the original data. This is what you want to do, but in order
   to do it you need to (a) have indexed all the attributes on the text
   element as s-attributes, (b) use the function labelled “Create metadata
   table from corpus XML annotations” – accessed via the button low down on
   the screen labelled “My metadata is embedded in the XML of my corpus!” –
   instead of the standard function.
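
To illustrate the table format described under A, here is a rough sketch that
derives a tab-delimited table (one text per line, text ID in the first
column) from the <text ...> tags of a vertical file. The attribute names
"lang" and "year", the sample ID and the helper name metadata_rows are
hypothetical; the real markup of the corpus in this thread is not visible in
the archive, and placing the ID first is an assumption based on the error
message about text ID codes.

```python
import re

TEXT_TAG = re.compile(r"<text\s+([^>]*)>")
ATTRIBUTE = re.compile(r'(\w+)="([^"]*)"')

def metadata_rows(vertical_lines, fields):
    """Build one tab-delimited row per <text ...> opening tag.

    The text ID goes in the first column, followed by the requested
    fields in the given order (empty string if an attribute is missing).
    """
    rows = []
    for line in vertical_lines:
        tag = TEXT_TAG.match(line.strip())
        if not tag:
            continue  # token lines and closing tags are skipped
        attrs = dict(ATTRIBUTE.findall(tag.group(1)))
        columns = [attrs.get("id", "")] + [attrs.get(f, "") for f in fields]
        rows.append("\t".join(columns))
    return rows

# Hypothetical vertical-file fragment; the tokens are from the thread,
# the attributes on <text> are invented for the example.
sample = [
    '<text id="almosnino_1564" lang="es" year="1564">',
    "Regimiento\tregimiento\tNCMS000",
    "de\tde\tSPS00",
    "</text>",
]

for row in metadata_rows(sample, ["lang", "year"]):
    print(row)
```

A file made of such rows is the kind of input function A expects, as opposed
to the vertical file itself.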


Hope that clarifies.

Best,

Andrew.

From: Andres Chandia [mailto:andres at chandia.net]
Sent: 27 December 2013 14:45
To: Hardie, Andrew
Cc: Open source development of the Corpus WorkBench
Subject: problem at managing corpus metadata
 
Hi there, it has been a while...

I have indexed a test corpus and now I'm trying to add some metadata to it,
but I always get error messages.

This is what the corpus contains:

Almosnino almosnino NCMS000
, , Fc
Moshe moshe NCFS000
. . Fp
. . Fp

Regimiento regimiento NCMS000
de de SPS00
la el DA0FS0
vida vida NCFS000
. . Fp
. . Fp

Salónica salónica NCFS000
1564 1564 Z
Transcription transcription NCFS000
. . Fp
. . Fp

I attach an image of the settings that I use to install the metadata,

and this is what I always get:

CQPweb encountered an error and could not continue. The data source you
specified for the text metadata contains badly-formatted text ID codes, as
follows: ','; '.'; ''; ''; ''; ''; ''; '

CQPweb v3.0.7 © 2008-2012
 
But as you can see in the corpus above, none of the metadata contains what
the error message says...

Well, I don't know what I'm doing wrong. Thanks in advance for your help.
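
The error message above can be reproduced in spirit with a small check on the
first column of the file being uploaded. This is a sketch only: it assumes
that the first tab-delimited field of each line is treated as the text ID and
that a valid ID consists solely of ASCII letters, digits and underscores.
Both are assumptions, not CQPweb's actual code, and bad_text_ids is a made-up
helper name.

```python
import re

# Assumed rule for a well-formed text ID (not taken from CQPweb itself).
VALID_ID = re.compile(r"[A-Za-z0-9_]+")

def bad_text_ids(lines):
    """Return the first-column values that do not look like text IDs."""
    bad = []
    for line in lines:
        text_id = line.rstrip("\n").split("\t", 1)[0]
        if not VALID_ID.fullmatch(text_id):
            bad.append(text_id)
    return bad

# A few lines of the vertical file, assumed to be tab-delimited.
vertical = [
    "Almosnino\talmosnino\tNCMS000",
    ",\t,\tFc",
    ".\t.\tFp",
    "",
]

print(bad_text_ids(vertical))
```

Under these assumptions, punctuation tokens and empty lines are flagged while
ordinary word tokens pass, which resembles the list of codes in the error
message: the "IDs" being rejected are tokens of the corpus, not metadata.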
 

_______________________
andrés chandía
administrator of
parles.upf.edu
psicoaching.net
mapuche koyaktu
ong mapuche koyaktu
Do not print unnecessarily. Take care of the environment!



 
 