[CWB] Format for metadata files?

Jiayue Wang arthur0421 at gmail.com
Sat Dec 3 22:17:42 CET 2016


Hi,
I pasted the error message into Geany and found that each text ID is
preceded by an invisible character (between the ' and the first visible
letter).
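
To see exactly which character it is, something like the following
Python sketch might help. It is only an illustration, not part of CWB or
CQPweb: the function name find_bad_ids is made up, and it assumes that
the metadata file is tab-separated with the text ID in the first column
and that valid IDs match the rule quoted in the error message
(unaccented letters, numbers and underscore).

```python
import re
import unicodedata

# Assumption: valid IDs contain only unaccented letters, digits and
# underscore, as stated in the error message quoted in this thread.
VALID_ID = re.compile(r"[A-Za-z0-9_]+")
VALID_CHAR = re.compile(r"[A-Za-z0-9_]")


def find_bad_ids(lines):
    """Return (line_number, text_id, offending_chars) for each invalid ID.

    Hypothetical helper: the text ID is taken to be the first
    tab-separated field of each non-blank line.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        text_id = line.rstrip("\r\n").split("\t", 1)[0]
        if VALID_ID.fullmatch(text_id):
            continue
        # Name every character that breaks the rule, including invisible ones.
        offending = [
            "U+%04X %s" % (ord(ch), unicodedata.name(ch, "UNKNOWN"))
            for ch in text_id
            if not VALID_CHAR.fullmatch(ch)
        ]
        problems.append((number, text_id, offending))
    return problems
```

It could be run on the file with, for example,
find_bad_ids(open("jeunesse.meta", encoding="utf-8")), where UTF-8 is
again an assumption about how the file was saved. Each entry in the
result gives the line number, the ID as read, and the code point and
Unicode name of every character that is not allowed.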

Jiayue

On 03/12/16 17:19, Graham Ranger -- UAPV wrote:
> Hello,
> I'm getting the following error message when I try to load the metadata
> file for a corpus:
>
> The data source you specified for the text metadata contains
> badly-formatted text ID codes, as follows: <strong>
> 'assollant_rose_d_amour'; 'bruno_le_tour_de_la_france';
> 'bruyere_l_epee_de_charlemagne'; 'daudet_lettres_de_mon_moulin';
> 'malot_sans_famille'; 'marcel_les_petits_vagabonds';
> 'robida_les_assieges_de_compiegne'; 'segur_malheurs_de_sophie';
> 'segur_un_bon_petit_diable'; 'verne_cinq_semaines_en_ballon';
> 'verne_le_tour_du_monde'; 'zola_nouveaux_contes_a_ninon';</strong>
> (text ids can only contain unaccented letters, numbers, and underscore).
>
> The metadata is in a file called jeunesse.meta in which each line begins
> with the text id of the texts in the corpus.
> Inside the metadata file, the lines read as follows:
>
> assollant_rose_d_amour    alfred_assollant    rose_d_amour 1889
> 1850_1899    roman    avance
> bruno_le_tour_de_la_france    bruno    le_tour_de_la_france 1877
> 1850-1899    manuel_scolaire    elementaire
> etc.
>
> with text id, author, title, date, period, genre and level.
>
> I can't see what is wrong with the file: the error message suggests that
> it's formatted as <strong>, but it's just plain text!
> Thanks as always for any help.
> Best,
> Graham.
> _______________________________________________
> CWB mailing list
> CWB at sslmit.unibo.it
> http://liste.sslmit.unibo.it/mailman/listinfo/cwb

