[CWB] CQPweb - managing metadata

Claudia Borg claudiaborg at gmail.com
Fri Jul 23 19:03:30 CEST 2010


[second update]

Apologies for the several emails.

I have sorted out the PHP mkdir problem by running chmod 777 on the
following directories:

/corpora/data
/corpora/registry
/corpora/system
/corpora/system/access
/corpora/system/temp
/corpora/system/upload

That fixed reading from and writing to those directories. In fact, I can
now see corpus files in the directory /corpora/data/examplesimple, which
shows that at least the input file has been processed.
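
For anyone retracing these steps, the permission check can be scripted. The
following is only a rough sketch in POSIX shell: the function name
check_writable is made up for illustration and is not part of CQPweb, and it
tests access for whichever user runs it, so it would need to be run as the
web server's user to say anything about what PHP can do.

```shell
#!/bin/sh
# Hypothetical helper (not part of CQPweb): report, for each directory given,
# whether it exists and is writable by the user running this script.
check_writable() {
    for dir in "$@"; do
        if [ -d "$dir" ] && [ -w "$dir" ]; then
            echo "ok: $dir"
        else
            echo "NOT writable: $dir"
        fi
    done
}

# Check whatever directories are passed on the command line, e.g.
#   sh check.sh /corpora/data /corpora/registry /corpora/system
check_writable "$@"
```

Run with the six directories listed above as arguments, any line beginning
"NOT writable" points at a directory that is missing or that the current
user cannot write to.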

The next problem came up when creating the minimalist metadata:

in admin-lib.inc.php:

line 1454 (approx, since I added some checks):

function create_text_metadata_for_minimalist()

contains the call to:
    create_text_metadata_check_text_ids($corpus);
However, $corpus is an empty string at that point, so this was resulting
in an SQL error.

I changed the argument to $corpus_sql_name and it seems to have worked.

I don't know whether this is actually a bug, but it seems to work so far.
I'm still not out of the woods, though: the metadata installation seems to
be OK, but if I try, say, a word lookup, I get an error:

*Error message*
*Syntax error*
Sorry, your simple query ' f* ' contains a syntax error.
Usage: $grammar->SetParam($name, $value) at - line 10

I will try to find out what's causing this. Any tips?

Thanks
Claudia



On 23 July 2010 17:22, Claudia Borg <claudiaborg at gmail.com> wrote:

> [update]
>
> As a matter of fact, around line 380 of admin-lib.inc.php there is a call
> mkdir($datadir, 0775); which is not working for me: my /corpora/data/ is
> empty, when it should have the folder 'examplesimple' in it.
>
> I am running PHP version 5.3.2-1ubuntu4.2. I'll try to find out why the
> directory is not being created...
>
> claudia
>
> On 23 July 2010 16:35, Claudia Borg <claudiaborg at gmail.com> wrote:
>
>> Hi Andrew, Thanks for your reply. Some more questions from my side to help
>> me understand better:
>>
>> Metadata is extra information about the corpus (e.g. URL, author) rather
>> than linguistic information present in the text, correct? Linguistic
>> information goes in the p-attributes, and the structure of the text
>> (chapters, etc.) goes in the s-attributes, correct?
>>
>> Re removing text_id and text_lang: I've done that and am now getting a
>> different error :(
>>
>> What I did:
>> 1. placed a new file in the upload section
>> 2. installed a new corpus using this file, leaving the default s- and
>>    p-attributes
>> 3. clicked on the "design and insert a text-metadata table" link
>> 4. left everything as is on the form and just clicked the minimalist
>>    metadata button at the bottom
>>
>> The result is this:
>>
>> *Error message*
>> **** CQP ERROR ****
>> CQP Error:
>> Corpus ``EXAMPLESIMPLE'' is undefined
>>
>>
>>
>> I'll try to have a look at the code. I suspect that the problem is
>> actually in the previous step, when installing a new corpus. How can I
>> confirm that a corpus has indeed been installed correctly? At this stage
>> all I notice is that in my /var/www/ I have a new folder for the corpus,
>> with some PHP files inside it; apart from that, I don't see any other
>> changes. When using the terminal version, cwb-encode and cwb-make create a
>> bunch of additional files, but I am not seeing this happen in the web
>> version. How can I check this? In the meantime, help is much appreciated!
>>
>> Claudia
>>
>>
>>
>> On 21 July 2010 15:44, Hardie, Andrew <a.hardie at lancaster.ac.uk> wrote:
>>
>>>  Hi Claudia,
>>>
>>>
>>>
>>> Could you try encoding without the explicit text_id and text_lang
>>> elements in your input file? CQPweb assumes that input files will be valid
>>> XML, and that s-attributes like text_id and text_lang are to be inferred
>>> from the attributes of text. So spelling them out may have caused the
>>> problem. The file ___install_temp_metadata_illum01 should have been
>>> created by cwb-s-decode from the text_id s-attribute, so the fact that it
>>> was missing suggests that this s-attribute is not available.
>>>
>>>
>>>
>>> On the more general point about metadata: in this case the “minimalist
>>> metadata” is probably what you want so you are going about it the right way.
>>> As the manual explains “The metadata file should be a tab-delimited
>>> database. The first column should be the text id-codes, with a line for each
>>> text. You can then have as many columns of metadata as you need.” If you
>>> haven’t got a table of information like this, then the minimalist-metadata
>>> generates a dummy table for you. “Entering metadata fields” simply means
>>> specifying what the columns in your table of information contain, so is not
>>> relevant if you don’t have such a table.
>>>
>>>
>>>
>>> best
>>>
>>>
>>>
>>> Andrew.
>>>
>>>
>>>
>>> *From:* cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it]
>>> *On Behalf Of *Claudia Borg
>>> *Sent:* 20 July 2010 15:32
>>> *To:* CWB mailing list
>>> *Subject:* [CWB] CQPweb - managing metadata
>>>
>>>
>>>
>>> Hi all,
>>>
>>> I am trying to install my own corpus through CQPweb. I have a simple
>>> vertical text file with the following structure:
>>>
>>> <text id="illum01" lang="Maltese">
>>> <text_id "illum01">
>>> <text_lang "Maltese">
>>> <s>
>>> word1
>>> word2
>>> ...
>>> </s>
>>> </text_lang>
>>> </text_id>
>>> </text>
>>>
>>> There is no annotation (POS, lemma, etc.), so it's basically like a word
>>> list. The corpus installation process goes well (I used the default
>>> p-attributes, even though in reality I only have the word attribute; in
>>> future I will add POS and lemma, but for the time being I am just trying
>>> to get used to CQPweb), but then I need to install the metadata, and I
>>> cannot quite understand what is required here.
>>>
>>> If I try to create a minimalist metadata table without specifying
>>> anything in the manage metadata page, then I get this error:
>>>
>>> A mySQL query did not run successfully!
>>>
>>> Error # 2:
>>> File '/home/mlrs/corpora/system/temp/___install_temp_metadata_illum01'
>>> not found (Errcode: 2)
>>>
>>>
>>>
>>> from mysql admin, I see that the table text_metadata_for_illum01 has been
>>> created but it is empty (no rows).
>>>
>>> If I try to enter some metadata fields (I cannot clearly understand what
>>> is meant to go here), then I still get the above error.
>>>
>>> I cannot seem to find anything specific to this problem in the
>>> documentation (i.e. explaining what metadata should look like, etc.). I am
>>> mainly following:
>>>
>>> http://cwb.svn.sourceforge.net/viewvc/cwb/gui/cqpweb/trunk/doc/CQPweb-installing-corpora.html
>>>
>>> Any pointers would be appreciated.
>>>
>>> Regards
>>> Claudia