[CWB] CQPweb - managing metadata

Claudia Borg claudiaborg at gmail.com
Mon Jul 26 14:28:54 CEST 2010


Hi Andrew,

Re-generating the meta-tables did the job, and now search is also working!

Re CQPweb 2.13 - is this available from SourceForge? I only see 2.12 (which
is the one I used). Or is this not an official release yet?

Thanks
claudia

On 26 July 2010 03:44, Hardie, Andrew <a.hardie at lancaster.ac.uk> wrote:

>  Hi Claudia,
>
>
>
> Apologies for the delay replying; I have been off email. I’ll take your
> original email and updates in reverse order, and answer queries even if you
> answered them yourself later – just in case the answers are useful for
> others.
>
>
>
> “Corpus ``EXAMPLESIMPLE'' is undefined”
>
>
>
> An error like this does indeed indicate that the problem was at the
> corpus-indexing stage.
>
>
>
> On how to tell whether a corpus has been created or not:
>
>
>
> “When using the terminal version, cwb-encode and cwb-make create a bunch
> of additional files....but I am not seeing this happening in the web version
> - how can I check this?”
>
>
>
> These files are placed in the registry and data directories that are
> specified in your configuration file (*lib/config.inc.php*). The variables
> are $cwb_registry and $cwb_datadir. The registry files are placed in the
> former; a directory of actual index files per corpus in the latter. These
> are outside the normal web directory (/var/www or whatever) so that random
> browsers don’t have access to the internal workings of your CWB setup! The
> PHP files created in the web directory are just pointers to different bits
> of the interface.
>
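
One way to confirm from a terminal that indexing produced output is to look for the registry file and the per-corpus data directory. The following is only a sketch in Python, not part of CQPweb: the function name is hypothetical, the paths in the usage comment are the ones mentioned elsewhere in this thread, and it assumes that both the registry file and the data subdirectory are named after the corpus in lowercase.

```python
import os

def corpus_index_status(registry_dir, data_dir, corpus):
    """Report whether the registry file and index files for a corpus exist.

    Assumption: the registry file and the data subdirectory both use the
    lowercase corpus name; adjust this if your setup differs.
    """
    name = corpus.lower()
    registry_file = os.path.join(registry_dir, name)
    corpus_data = os.path.join(data_dir, name)
    # List the index files only if the per-corpus directory is really there.
    index_files = sorted(os.listdir(corpus_data)) if os.path.isdir(corpus_data) else []
    return {
        "registry_file_exists": os.path.isfile(registry_file),
        "data_dir_exists": os.path.isdir(corpus_data),
        "index_files": index_files,
    }

# Example call (substitute the values of $cwb_registry and $cwb_datadir
# from your own config.inc.php):
# corpus_index_status("/corpora/registry", "/corpora/data", "EXAMPLESIMPLE")
```

An empty "index_files" list, or a missing registry file, would point to the corpus-indexing stage rather than to the metadata step.
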
>
>
> “mkdir($datadir, 0775); - this is not working for me, and in fact my
> /corpora/data/ is empty, when it should have the folder 'examplesimple' in
> it”
>
>
>
> As you later worked out, this is a permissions problem. The setup manual
> puts it as follows:
>
> The username of the webserver (in the case of Apache, usually something
> like *www* or *www-data*) need to have read-write-execute access to all
> these directories [that is, the ones you create for CQPweb]. The username of
> the mysqld process (usually something like *mysql*) also needs read and
> write access if you want MySQL to use file-access functions. So, these new
> folders *must* have the write permission set for either “all” or – if you
> are worried about security – for “group” (where the file is assigned to some
> group that both the mysql server's account and the web server's account
> belong to).
>
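
The "group" option described in the manual can be sketched as follows. This is a minimal illustration in Python, not something CQPweb provides: the function name is hypothetical, and the group name in the usage comment ("cqpweb") is an assumed shared group that would have to exist already and contain both the web server's account and the MySQL server's account.

```python
import os
import shutil
import stat

def make_group_writable(directories, group=None):
    """Add read, write and execute permission for the group on each directory.

    If a group name is given, the directory is first assigned to that group
    (this normally requires sufficient privileges).
    """
    for directory in directories:
        if group is not None:
            shutil.chown(directory, group=group)
        # Keep the existing permission bits and add rwx for the group only.
        mode = stat.S_IMODE(os.stat(directory).st_mode)
        os.chmod(directory, mode | stat.S_IRWXG)

# Example call (the group name is an assumption; the paths are the ones
# listed in this thread):
# make_group_writable(["/corpora/data", "/corpora/registry"], group="cqpweb")
```

Compared with opening the directories to "all", this leaves the permissions for other users unchanged.
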
>
>
> “function create_text_metadata_for_minimalist() contains the call to:
> create_text_metadata_check_text_ids($corpus); however $corpus is an empty
> string and so this was resulting in an sql error”
>
>
>
> Yes, this is a bug in v2.12; it was fixed in 2.13. The solution was exactly
> what you thought – change to *$corpus_sql_name*.
>
>
>
> “if I try say a word lookup, I get an error:
>
> *Error message*
>
> *Syntax error*
> Sorry, your simple query ' f* ' contains a syntax error.
> Usage: $grammar->SetParam($name, $value) at - line 10
>
> I will try and find out what's causing this. any tips?”
>
>
>
> It looks as if one of two things is not working with regard to the tertiary
> annotation and the mapping table.
>
>
>
> *EITHER* your CEQL setup for this corpus is not correct – though it should
> be if you used default p-attributes. Click on “Manage annotation” in the
> admin section of the main menu to check this.
>
>
>
> *OR* the mapping tables that are needed for the query don’t exist. In the
> admin interface under “Misc” click on “mapping tables” and then press the
> button at the bottom of the screen to regenerate built-in mapping tables.
>
>
>
> (And more on all of this here:
> http://cwb.svn.sourceforge.net/viewvc/cwb/gui/cqpweb/trunk/doc/CQPweb-CEQL-manual.html)
>
>
>
> best
>
>
>
> Andrew.
>
>
>
>
>
>
>
> *From:* cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it]
> *On Behalf Of* Claudia Borg
> *Sent:* 23 July 2010 18:04
> *To:* Open source development of the Corpus WorkBench
> *Subject:* Re: [CWB] CQPweb - managing metadata
>
>
>
> [second update]
>
> apologies for the several emails.
>
> I have sorted out the PHP mkdir problem by running chmod 777 on the
> following directories:
>
> /corpora/data
> /corpora/registry
> /corpora/system
> /corpora/system/access
> /corpora/system/temp
> /corpora/system/upload
>
> That sorted out reading and writing to those directories. In fact, I can now
> see the corpus files in the directory /corpora/data/examplesimple, which shows
> that at least the input file has been processed.
>
> Next problem encountered when creating minimalist metadata:
>
> in admin-lib.inc.php:
>
> line 1454 (approx, since I added some checks):
>
> function create_text_metadata_for_minimalist()
>
> contains the call to:
>     create_text_metadata_check_text_ids($corpus);
> however $corpus is an empty string and so this was resulting in an sql
> error.
>
> I changed this to $corpus_sql_name and it seems to have worked.
>
> I don't know if this is actually a bug or not. But it kind of works so
> far... still not out of the woods though! The metadata installation seems to
> be OK, but if I try, say, a word lookup, I get an error:
>
> *Error message*
>
> *Syntax error*
> Sorry, your simple query ' f* ' contains a syntax error.
> Usage: $grammar->SetParam($name, $value) at - line 10
>
> I will try and find out what's causing this. any tips?
>
> thanks
> Claudia
>
>
>  On 23 July 2010 17:22, Claudia Borg <claudiaborg at gmail.com> wrote:
>
> [update]
>
> As a matter of fact, when admin-lib.inc.php runs, around line 380+
> there is a piece of code, mkdir($datadir, 0775); - this is not working for
> me, and in fact my /corpora/data/ is empty, when it should have the folder
> 'examplesimple' in it.
>
> I am running php version 5.3.2-1ubuntu4.2 - I'll try to check why the
> directory is not being created...
>
> claudia
>
>
>
> On 23 July 2010 16:35, Claudia Borg <claudiaborg at gmail.com> wrote:
>
> Hi Andrew, Thanks for your reply. Some more questions from my side to help
> me understand better:
>
> Metadata can be extra information about the corpus (e.g. URL, author) and
> not linguistic information present in the text, correct? Linguistic
> information goes in the p-attributes, and the structure of the text
> (chapters, etc.) goes in the s-attributes - correct?
>
> re removing text_id and text_lang - I've done that and now getting a
> different error :(
>
> What I did:
> 1. placed a new file in the upload section
> 2. installed a new corpus using this file, leaving the default s- and
>    p-attributes
> 3. clicked on the "design and insert a text-metadata table" link
> 4. left everything as is on the form and just clicked the minimalist
>    metadata button at the bottom
>
> The result is this:
>
> *Error message*
>
> **** CQP ERROR ****
> CQP Error:
> Corpus ``EXAMPLESIMPLE'' is undefined
>
>
>
> I'll try to have a look at the code - I suspect that the problem is
> actually in the previous step when installing a new corpus - how can I
> confirm that a corpus has been indeed installed correctly? At this stage all
> I notice is that in my /var/www/ I have a new folder for the corpus, with
> some php files inside it - apart from that, I don't see any other changes.
> When using the terminal version, cwb-encode and cwb-make create a bunch of
> additional files....but I am not seeing this happening in the web version -
> how can I check this? ....in the meantime, help is much appreciated!
>
> Claudia
>
>
>   On 21 July 2010 15:44, Hardie, Andrew <a.hardie at lancaster.ac.uk> wrote:
>
>   Hi Claudia,
>
>
>
> Could you try encoding without the explicit text_id and text_lang elements
> in your input file? CQPweb assumes that input files will be valid XML, and
> that s-attributes like text_id and text_lang are to be inferred from the
> attributes of the text element. So spelling them out may have caused the
> problem. The file ___install_temp_metadata_illum01 should have been created by
> cwb-s-decode from the text_id s-attribute, so the fact that it was missing
> suggests that this s-attribute is not available.
>
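
For an existing input file, the explicit elements could be stripped with something like the sketch below. It is written in Python purely for illustration; the function name is hypothetical, and it assumes a vertical file in which each tag sits on its own line, as in the example from the original message.

```python
def strip_explicit_text_attributes(lines, redundant=("text_id", "text_lang")):
    """Drop opening and closing tags for the named elements, keep all else.

    The <text ...> element itself is untouched, so its id and lang
    attributes remain available to the encoder.
    """
    kept = []
    for line in lines:
        stripped = line.strip()
        is_redundant = any(
            stripped.startswith("<" + name) or stripped.startswith("</" + name)
            for name in redundant
        )
        if not is_redundant:
            kept.append(line)
    return kept
```

Applied to the example file, this leaves only the text element, the s elements and the word lines.
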
>
>
> On the more general point about metadata: in this case the “minimalist
> metadata” is probably what you want so you are going about it the right way.
> As the manual explains: “The metadata file should be a tab-delimited
> database. The first column should be the text id-codes, with a line for each
> text. You can then have as many columns of metadata as you need.” If you
> haven’t got a table of information like this, then the minimalist-metadata
> option generates a dummy table for you. “Entering metadata fields” simply means
> specifying what the columns in your table of information contain, so is not
> relevant if you don’t have such a table.
>
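
A table in the shape the manual describes could be produced along these lines. This is a minimal sketch in Python: the function name and the choice of a single "language" column are illustrative assumptions, and the file is written without a header row because the quoted passage only mentions one line per text.

```python
import csv

def write_metadata_table(path, rows):
    """Write one tab-delimited line per text; the first field is the text id."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        for row in rows:
            writer.writerow(row)

# Example call (the id and language are taken from the input file shown in
# this thread; the output filename is hypothetical):
# write_metadata_table("illum_metadata.txt", [("illum01", "Maltese")])
```

Each further column added to the rows would correspond to one metadata field to be described on the metadata form.
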
>
>
> best
>
>
>
> Andrew.
>
>
>
> *From:* cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it]
> *On Behalf Of* Claudia Borg
> *Sent:* 20 July 2010 15:32
> *To:* CWB mailing list
> *Subject:* [CWB] CQPweb - managing metadata
>
>
>
> Hi all,
>
> I am trying to install my own corpus through CQPweb - I have a simple
> vertical text file with the following structure:
>
> <text id="illum01" lang="Maltese">
> <text_id "illum01">
> <text_lang "Maltese">
> <s>
> word1
> word2
> ...
> </s>
> </text_lang>
> </text_id>
> </text>
>
> There is no annotation (POS, lemma, etc.), so it's basically like a word list.
> The corpus installation process goes well (I used the default p-attributes,
> even though in reality I only have the word attribute - in future I will add
> POS and lemma, but for the time being I am just trying to get used to CQPweb),
> but then I need to install the metadata, and I cannot quite understand what is
> required here.
>
> If I try to create a minimalist metadata table without specifying anything
> on the manage metadata page, then I get this error:
>
> A mySQL query did not run successfully!
>
> Error # 2:
> File '/home/mlrs/corpora/system/temp/___install_temp_metadata_illum01' not
> found (Errcode: 2)
>
>
>
> from mysql admin, I see that the table text_metadata_for_illum01 has been
> created but it is empty (no rows).
>
> If I try to enter some metadata fields (I am not clear what is meant to go
> here), then I still get the above error.
>
> I cannot seem to find anything specific to this problem in the
> documentation (i.e. explaining what metadata should look like, etc.). I am
> mainly following:
>
> http://cwb.svn.sourceforge.net/viewvc/cwb/gui/cqpweb/trunk/doc/CQPweb-installing-corpora.html
>
> Any pointers would be appreciated.
>
> Regards
> Claudia
>
>
>
> _______________________________________________
> CWB mailing list
> CWB at sslmit.unibo.it
> http://devel.sslmit.unibo.it/mailman/listinfo/cwb
>
>