[CWB] Problem encoding corpus with POS tags

Hardie, Andrew a.hardie at lancaster.ac.uk
Mon Nov 5 15:34:31 CET 2012


Hi Albert,

You may need to check whether pos has been configured properly as the primary annotation.

As a superuser, go to the corpus's main search page and select "Manage Annotation" from the menu. Check whether the "Primary annotation" slot has POS selected. If not, select it and update; the query should then work.

If, on the other hand, pos *IS* properly selected on that screen, let me know, and I'll look into what else might be causing the problem.

(I am not sure why the primary annotation is sometimes not selected correctly at index time. A bug, of course, but not one I've managed to track down yet, as it seems to be intermittent. I'll work it out eventually.)

best

Andrew.

==========================
From: cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it] On Behalf Of Albert Gatt
Sent: 05 November 2012 14:09
To: cwb at sslmit.unibo.it
Subject: [CWB] Problem encoding corpus with POS tags

I'm trying to install a corpus which has word + POS, via CQPWeb. An example of the data is shown below:

<text id="lh1">
<s id="0">
Anqas	MV
għaraftek	VV
...	PUN
</s>
...
</text>
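
To rule out the input file itself as a cause, the layout above can be checked with a minimal sketch like the following (Python). It assumes the two columns are separated by a single tab, as in the usual CWB vertical format, and that any line consisting only of an XML tag is structural. The function name check_vertical and the tag pattern are illustrative and not part of CWB or CQPweb.

```python
import re

# Assumption: a structural line is nothing but one XML open/close tag,
# e.g. <s id="0"> or </text>.
TAG_LINE = re.compile(r"^</?[A-Za-z_][\w.-]*(\s[^<>]*)?>$")


def check_vertical(lines, n_columns=2):
    """Return (line_number, text) for token lines that do not have
    exactly n_columns non-empty, tab-separated fields."""
    problems = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        stripped = line.strip()
        # Skip blank lines and structural (XML tag) lines.
        if not stripped or TAG_LINE.match(stripped):
            continue
        fields = line.split("\t")
        if len(fields) != n_columns or any(not f.strip() for f in fields):
            problems.append((number, line))
    return problems


# The sample from this message, with the elided part left out.
sample = [
    '<text id="lh1">',
    '<s id="0">',
    "Anqas\tMV",
    "għaraftek\tVV",
    "...\tPUN",
    "</s>",
    "</text>",
]
print(check_vertical(sample))
```

An empty list means every token line has a word and a tag. A line where the tab has been replaced by spaces, or where the tag is missing, is reported with its line number. This only checks the input format; it says nothing about how the attributes are configured in CQPweb.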

When I install, I leave the s-attributes as default (since "s" is the only structural attribute I have, apart from "text") and specify "pos" as the primary p-attribute.

The corpus installs without problems, and I can use CQPWeb's frequency list functionality to see a list of different parts of speech, as well as word tokens. I can successfully run queries for words. However, any query that involves POS gives me no results (e.g. "kien_VA" where "kien" is a word and "VA" is a tag). 

I'm not sure where the problem lies.

thanks
albert


-- 
-----------------------------------------------------------------
Albert Gatt
Institute of Linguistics 
Rm 22, Block A
Car Park 6
University of Malta
Tal-Qroqq Msida MSD2080
Malta
 
tel: (+356) 2340 2150
http://staff.um.edu.mt/albert.gatt/
