[CWB] CQPweb - managing metadata

Claudia Borg claudiaborg at gmail.com
Fri Jul 23 16:35:14 CEST 2010


Hi Andrew, Thanks for your reply. Some more questions from my side to help
me understand better:

Metadata can be extra information about the corpus (e.g. URL, author) and
not linguistic information present in the text, correct? The linguistic
information is in the p-attributes, and the structure of the text (chapters,
etc.) is in the s-attributes - correct?

Re removing text_id and text_lang - I've done that and am now getting a
different error :(
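
For reference, the clean-up I applied to the input file can be sketched roughly as below. This is only an illustration of dropping the hand-written text_id / text_lang lines so that the attributes are taken from the text element itself; the function name and the sample lines are made up for the example, not anything from CQPweb.

```python
import re

# Matches the hand-written <text_id ...>, </text_id>, <text_lang ...> and
# </text_lang> lines, but not the real <text id="..." lang="..."> element.
REDUNDANT_TAG = re.compile(r"^</?text_(id|lang)\b[^>]*>\s*$")


def strip_redundant_tags(lines):
    """Return the vertical-file lines without the explicit text_id/text_lang tags."""
    return [line for line in lines if not REDUNDANT_TAG.match(line)]


# Sample input modelled on the file structure from my first message.
sample = [
    '<text id="illum01" lang="Maltese">',
    '<text_id "illum01">',
    '<text_lang "Maltese">',
    "<s>",
    "word1",
    "word2",
    "</s>",
    "</text_lang>",
    "</text_id>",
    "</text>",
]

cleaned = strip_redundant_tags(sample)
print("\n".join(cleaned))
```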

What I did:
1. placed a new file in the upload section
2. installed a new corpus using this file, leaving the default s- and
   p-attributes
3. clicked on the "design and insert a text-metadata table" link
4. left everything as is on the form and just clicked the minimalist
   metadata button at the bottom

The result is this:

*Error message*
**** CQP ERROR ****
CQP Error:
Corpus ``EXAMPLESIMPLE'' is undefined



I'll try to have a look at the code - I suspect that the problem is actually
in the previous step, when installing the new corpus. How can I confirm that a
corpus has indeed been installed correctly? At this stage all I notice is
that in my /var/www/ I have a new folder for the corpus, with some PHP files
inside it - apart from that, I don't see any other changes. When using the
terminal version, cwb-encode and cwb-make create a bunch of additional
files, but I am not seeing this happening in the web version - how can I
check this? In the meantime, help is much appreciated!
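
The kind of check I have in mind is sketched below. It rests on my understanding that CQP looks a corpus up through a registry file named after the corpus in lowercase, and that this file has a HOME line pointing at the data directory plus ATTRIBUTE / STRUCTURE lines for the p- and s-attributes. The function name and the registry path are hypothetical; the real registry location depends on the CQPweb configuration.

```python
from pathlib import Path


def describe_registry_entry(registry_dir, corpus_name):
    """Return the HOME directory and declared attributes for a corpus,
    or None if no registry file for it exists in registry_dir."""
    # Registry files are named with the lowercase form of the corpus name.
    entry = Path(registry_dir) / corpus_name.lower()
    if not entry.is_file():
        return None

    info = {"home": None, "p_attributes": [], "s_attributes": []}
    for raw in entry.read_text().splitlines():
        parts = raw.split()
        # Skip blank lines, comment lines and keywords without a value.
        if len(parts) < 2 or parts[0].startswith("#"):
            continue
        keyword, value = parts[0], parts[1]
        if keyword == "HOME":
            info["home"] = value
        elif keyword == "ATTRIBUTE":
            info["p_attributes"].append(value)
        elif keyword == "STRUCTURE":
            info["s_attributes"].append(value)
    return info


# Hypothetical registry path; None here would mean CQP cannot see the corpus.
print(describe_registry_entry("/path/to/registry", "EXAMPLESIMPLE"))
```

If this returns None, the "undefined" error would be explained by the registry entry never having been written; if it returns a HOME directory, the next thing to look at would be whether that directory actually contains the encoded data files.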

Claudia



On 21 July 2010 15:44, Hardie, Andrew <a.hardie at lancaster.ac.uk> wrote:

>  Hi Claudia,
>
>
>
> Could you try encoding without the explicit text_id and text_lang elements
> in your input file? CQPweb assumes that input files will be valid XML, and
> that s-attributes like text_id and text_lang are to be inferred from the
> attributes of text. So spelling them out may have caused the problem. The
> file ___install_temp_metadata_illum01 should have been created by
> cwb-s-decode from the text_id s-attribute, so the fact that it was missing
> suggests that this s-attribute is not available.
>
>
>
> On the more general point about metadata: in this case the “minimalist
> metadata” is probably what you want so you are going about it the right way.
> As the manual explains “The metadata file should be a tab-delimited
> database. The first column should be the text id-codes, with a line for each
> text. You can then have as many columns of metadata as you need.” If you
> haven’t got a table of information like this, then the minimalist-metadata
> generates a dummy table for you. “Entering metadata fields” simply means
> specifying what the columns in your table of information contain, so is not
> relevant if you don’t have such a table.
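
To check that I have understood the table format the manual describes, here is a rough sketch that builds such a tab-delimited table from the text elements of a vertical file: one line per text, the text id in the first column, further metadata after it. The function name, the choice of "lang" as a second column and the second text id (illum02) are made up for illustration; what CQPweb requires beyond the quoted description I would still need to confirm in the manual.

```python
import re

# Opening <text ...> tag of a vertical file (not <text_id ...> etc.).
TEXT_TAG = re.compile(r"^<text\s+([^>]*)>")
# name="value" pairs inside the tag.
ATTR = re.compile(r'(\w+)="([^"]*)"')


def metadata_rows(lines, columns=("id", "lang")):
    """Build one tab-delimited row per <text> element, text id first."""
    rows = []
    for line in lines:
        match = TEXT_TAG.match(line)
        if match:
            attrs = dict(ATTR.findall(match.group(1)))
            # Missing attributes become empty cells so the columns stay aligned.
            rows.append("\t".join(attrs.get(col, "") for col in columns))
    return rows


# illum01 is from my file; illum02 is a hypothetical second text.
sample = [
    '<text id="illum01" lang="Maltese">',
    "<s>",
    "word1",
    "</s>",
    "</text>",
    '<text id="illum02" lang="Maltese">',
    "<s>",
    "word2",
    "</s>",
    "</text>",
]

rows = metadata_rows(sample)
print("\n".join(rows))
```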
>
>
>
> best
>
>
>
> Andrew.
>
>
>
> *From:* cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it] *On
> Behalf Of *Claudia Borg
> *Sent:* 20 July 2010 15:32
> *To:* CWB mailing list
> *Subject:* [CWB] CQPweb - managing metadata
>
>
>
> Hi all,
>
> I am trying to install my own corpus through CQPweb - I have a simple
> vertical text file with the following structure:
>
> <text id="illum01" lang="Maltese">
> <text_id "illum01">
> <text_lang "Maltese">
> <s>
> word1
> word2
> ...
> </s>
> </text_lang>
> </text_id>
> </text>
>
> There is no annotation (pos, lemma, etc.), so it's basically like a word
> list. The corpus installation process goes well (I used the default
> p-attributes, even if in reality I only have the word attribute - in future
> I will add pos and lemma, but for the time being I am just trying to get
> used to CQPweb), but then I need to install the metadata, and I cannot quite
> understand what is required here.
>
> If I try to create a minimalist metadata table without specifying anything
> in the manage metadata page, then I get this error:
>
> A mySQL query did not run successfully!
>
> Error # 2:
> File '/home/mlrs/corpora/system/temp/___install_temp_metadata_illum01' not
> found (Errcode: 2)
>
>
>
> From the MySQL admin, I see that the table text_metadata_for_illum01 has
> been created but it is empty (no rows).
>
> If I try to enter some metadata fields (I cannot clearly understand what is
> meant to go here), then I still get the above error.
>
> I cannot seem to find anything specific to this problem in the
> documentation (i.e. explaining what metadata should look like, etc.). I am
> mainly following:
>
> http://cwb.svn.sourceforge.net/viewvc/cwb/gui/cqpweb/trunk/doc/CQPweb-installing-corpora.html
>
> Any pointers would be appreciated.
>
> Regards
> Claudia