[CWB] Problem encoding corpus with POS tags

Albert Gatt albert.gatt at um.edu.mt
Mon Nov 5 15:08:45 CET 2012


I'm trying to install a corpus via CQPweb in which every token has a word
form and a POS tag. An example of the data is shown below:

<text id="lh1">
<s id="0">
Anqas MV
għaraftek VV
... PUN
</s>
...
</text>
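
To rule out a formatting problem in the input file, here is a minimal Python
sketch of the kind of sanity check that could be run on it. It assumes the two
columns are meant to be separated by a single tab (as far as I understand,
that is what the CWB encoder expects for vertical text) and that every other
non-empty line is an XML tag. The function name check_vertical is purely
illustrative and is not part of CWB or CQPweb.

```python
def check_vertical(lines):
    """Return (line_number, text) pairs for token lines that do not
    consist of exactly two non-empty, tab-separated columns.

    Assumption: columns are tab-separated (word, then POS tag), and any
    line that starts with '<' and ends with '>' is a structural tag.
    """
    problems = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # ignore blank lines
        if line.startswith("<") and line.endswith(">"):
            continue  # structural tag such as <s id="0"> or </text>
        columns = line.split("\t")
        if len(columns) != 2 or not all(columns):
            problems.append((number, line))
    return problems


# The sample from this message, written with explicit tabs.
sample = [
    '<text id="lh1">',
    '<s id="0">',
    "Anqas\tMV",
    "għaraftek\tVV",
    "...\tPUN",
    "</s>",
    "</text>",
]

print(check_vertical(sample))        # no malformed token lines
print(check_vertical(["Anqas MV"]))  # space instead of tab is reported
```

If the check reports lines in the real file, the word and the tag on those
lines would presumably not end up in separate columns when the corpus is
indexed.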

When I install, I leave the s-attributes as default (since "s" is the only
structural attribute I have, apart from "text") and specify "pos" as the
primary p-attribute.

The corpus installs without problems, and I can use CQPweb's frequency list
functionality to see a list of different parts of speech, as well as word
tokens. I can successfully run queries for words. However, any query that
involves POS gives me no results (e.g. "kien_VA" where "kien" is a word and
"VA" is a tag).

I'm not sure where the problem lies.

thanks
albert

-- 
-----------------------------------------------------------------
Albert Gatt
Institute of Linguistics
Rm 22, Block A
Car Park 6
University of Malta
Tal-Qroqq Msida MSD2080
Malta

tel: (+356) 2340 2150
http://staff.um.edu.mt/albert.gatt/