[CWB] corpus setup problem on CQPweb 3.0.5: cannot create metadata, Error # 1062

Wu Liangping liangpingwu at 126.com
Fri May 18 14:50:47 CEST 2012


Hi there,
This is Ray Wu from China, and this is my first post here. Greetings to everyone on the list.

I am a college ESL teacher, and a few days ago I started learning CQPweb, both for teaching and for research. I am now experimenting with a toy corpus on CQPweb to get started.

My toy corpus, test.vrt (encoded in ISO-8859-1), follows the example in Andrew's paper:
<text id="test">
<s>
The AT0 the
cat NN1 cat
sat VVD sit
on  PRP on
the AT0 the
mat NN1 mat
.   PUN .
</s>
<s>
Many    DT0 many
cats    NN2 cat
sit VVB sit
on  PRP on
mats    NN2 mat
.   PUN .
</s>
</text>

My metadata file: test_meta.dat (encoded in ISO-8859-1, tab separated)
text_id genre   sampled
test    press   all               
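A quick way to rule out formatting problems in a metadata file like this one is to check it mechanically before handing it to CQPweb. The following is only a minimal sketch: the function name check_metadata is made up for illustration, and the sketch makes no assumption about whether CQPweb expects a header row (it treats every non-blank line as a row). It reports rows whose field count differs from the first row, repeated IDs in the first column, and trailing spaces. The data row as quoted above ends in trailing spaces, which may be nothing more than an artifact of the email.

```python
def check_metadata(text):
    """Return a list of problems found in tab-separated metadata text.

    Hypothetical helper for illustration; it is not part of CQPweb or CWB.
    """
    problems = []
    seen_ids = set()
    width = None  # number of fields in the first non-blank row
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        stripped = line.rstrip(" ")
        if stripped != line:
            problems.append(f"line {lineno}: trailing whitespace")
        fields = stripped.split("\t")
        if width is None:
            width = len(fields)
        elif len(fields) != width:
            problems.append(
                f"line {lineno}: {len(fields)} fields, expected {width}"
            )
        text_id = fields[0]
        if text_id in seen_ids:
            problems.append(f"line {lineno}: duplicate ID {text_id!r}")
        seen_ids.add(text_id)
    return problems


# The two rows of test_meta.dat, with tabs restored and the trailing
# spaces of the second row kept as they appear in the message.
sample = "text_id\tgenre\tsampled\ntest\tpress\tall               \n"
sample_problems = check_metadata(sample)
for problem in sample_problems:
    print(problem)
```

To check the real file, the same function could be called on open("test_meta.dat", encoding="iso-8859-1").read().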

At present, the corpus loads into CQPweb without any fuss. But when I reached the "Admin tools for managing corpus metadata" page, I ran into an error. Here are the choices I made on that page:
field 1    genre      classification
field 2    sampled    classification

After clicking "install metadata table using the settings above", I got an error message like this:
A mySQL query did not run successfully!
Error # 1062: Duplicate entry 'test-__HANDLE' for key 1

I then looked into MySQL and found the following four metadata-related tables:
mysql> show tables;
...
corpus_metadata_fixed (empty)
corpus_metadata_variable (empty)
...
text_metadata_fields
text_metadata_values  (empty)
...

mysql> select * from text_metadata_fields;
+--------+----------+-------------+-------------------+
| corpus | handle   | description | is_classification |
+--------+----------+-------------+-------------------+
| test   | genre    |             |                 1 |
| test   | sampled  |             |                 1 |
| test   | __HANDLE |             |                 0 |
+--------+----------+-------------+-------------------+
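For readers unfamiliar with error 1062: MySQL raises it when an INSERT would repeat a value in a unique or primary key, and it prints the values of a multi-column key joined by hyphens, so 'test-__HANDLE' would correspond to corpus = test, handle = __HANDLE. The sketch below reproduces that situation under two loud assumptions: that the key of text_metadata_fields is the pair (corpus, handle), which is inferred from the error text and not checked against the CQPweb source, and that SQLite (used here only because it ships with Python) is an acceptable stand-in for MySQL. It says nothing about why CQPweb attempts the second insert.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption made
# purely so the example is self-contained).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE text_metadata_fields (
        corpus            TEXT NOT NULL,
        handle            TEXT NOT NULL,
        description       TEXT,
        is_classification INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (corpus, handle)  -- assumed key, inferred from the error
    )
    """
)


def add_field(corpus, handle, is_classification):
    """Insert one row; return True on success, False on a key clash."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO text_metadata_fields VALUES (?, ?, '', ?)",
                (corpus, handle, is_classification),
            )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    add_field("test", "genre", 1),
    add_field("test", "sampled", 1),
    add_field("test", "__HANDLE", 0),
    add_field("test", "__HANDLE", 0),  # same (corpus, handle) pair again
]
print(results)
```

Only the last call fails, because the pair ("test", "__HANDLE") is already present. If the same mechanism is at work here, the row with handle __HANDLE already in the table is what the failing query collides with.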

But if I click "create minimalist metadata table", a metadata table is created successfully and I can start querying:
mysql> select * from text_metadata_for_test;
+---------+-------+-----------+---------+
| text_id | words | cqp_begin | cqp_end |
+---------+-------+-----------+---------+
| test    |    13 |         0 |      12 |
+---------+-------+-----------+---------+
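The figures in this table can be cross-checked against the .vrt file by counting token lines. The sketch below is a rough approximation and not what CWB actually runs: the function name text_extents is hypothetical, any line starting with "<" is treated as a structural tag (so a token that begins with "<" would be miscounted), and blank lines are skipped.

```python
import re

# Matches an opening <text> tag and captures its id attribute.
TEXT_OPEN = re.compile(r'<text\s+id="([^"]+)"')


def text_extents(vrt):
    """Return (text_id, words, cqp_begin, cqp_end) for each <text> region.

    Hypothetical helper: counts every non-blank, non-tag line as one token.
    """
    rows = []
    position = 0       # corpus position of the next token
    current_id = None
    begin = 0
    for line in vrt.splitlines():
        line = line.strip()
        if not line:
            continue  # blank lines are skipped (an assumption)
        match = TEXT_OPEN.match(line)
        if match:
            current_id, begin = match.group(1), position
        elif line.startswith("</text"):
            rows.append((current_id, position - begin, begin, position - 1))
            current_id = None
        elif line.startswith("<"):
            continue  # other structural tags such as <s> and </s>
        else:
            position += 1  # one token per remaining line
    return rows


# The toy corpus from this message, with tabs between the three columns.
sample_vrt = """<text id="test">
<s>
The\tAT0\tthe
cat\tNN1\tcat
sat\tVVD\tsit
on\tPRP\ton
the\tAT0\tthe
mat\tNN1\tmat
.\tPUN\t.
</s>
<s>
Many\tDT0\tmany
cats\tNN2\tcat
sit\tVVB\tsit
on\tPRP\ton
mats\tNN2\tmat
.\tPUN\t.
</s>
</text>
"""

extents = text_extents(sample_vrt)
print(extents)  # [('test', 13, 0, 12)]
```

The result agrees with the row above: 13 tokens occupying corpus positions 0 to 12.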

This frustrates me as I know that without metadata, a corpus is of little value for search/research. Has anyone encountered similar messages before?

I have browsed the mailing list archives but found no direct answer to this problem (though I haven't looked at the source code yet). I don't know whether this means I need to manually add a few columns to the text_metadata_for_test table, or whether I have simply missed something important. Thanks for any pointers.

My thanks also go to Andrew for his earlier help, by personal email, with a CQPweb 3.0.5 file-write permission problem, and for pointing me to this list. Thank you, Andrew.

PS: my system parameters:
System: Ubuntu 8.04
Apache: 2.0.63
MySQL: 5.0.88
PHP: 5.2.12 (lower than the expected 5.3.0)
Perl: 5.8.8
CWB: 3.0.0
Linux utilities: awk, tar, gzip, iconv


Wu Liangping
School of International Studies
Hunan University of Commerce
PO Box  410000
Changsha, China 