[CWB] News texts in CQPWeb

Kurt Sultana kurtanatlus at gmail.com
Sat Jan 26 19:41:42 CET 2013


Hi,

I've looked into this a bit further and found that the attributes I mentioned
are stored as s-attributes. So, I have this example text:

<news title="A Thrilling Experience" date="01/01/2013" source="www.timesofmalta.com">
<text id="4">
<p id="1">
<s>
Tick    NN    tick
.    SENT     .
</s>
<s>
A    DT     a
clock    NN    clock
.    SENT    .
</s>
<s>
Tick    VB    tick
,    ,    ,
tick    VB    tick
.    SENT    .
</s>
</p>
</text>
</news>

As s-attributes (XML elements) I entered *p*, *p_id*, *news*, *news_title*,
*news_source* and *news_date*. When installing the corpus, I chose to
install metadata from the XML annotation within the corpus and selected
*news_title*, *news_source* and *news_date*. However, when I click on
"Create metadata table from XML using settings above", I get this error:

*Error message*
**** CQP ERROR ****
CQP Error:
No annotated values for s-attribute ``news_title'' in named query c_M_F_xml
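
For reference, here is a minimal Python sketch of how I understand the
naming: each XML element gives one s-attribute, and each attribute on its
start tag gives another one named element_attribute (e.g. *news_title*).
The function name s_attribute_values and the regular expressions are my own
and purely hypothetical; this is not how CWB or CQPweb read the file, only a
way to check which names and values the example text would yield under that
assumption.

```python
import re

# Sample input copied from the example above (shortened to one sentence).
SAMPLE = '''<news title="A Thrilling Experience" date="01/01/2013" source="www.timesofmalta.com">
<text id="4">
<p id="1">
<s>
Tick\tNN\ttick
.\tSENT\t.
</s>
</p>
</text>
</news>
'''

# A start tag on its own line, with optional name="value" pairs.
START_TAG = re.compile(r'^<([A-Za-z_][\w-]*)((?:\s+[\w-]+="[^"]*")*)\s*>$')
ATTR = re.compile(r'([\w-]+)="([^"]*)"')


def s_attribute_values(vertical_text):
    """Map each derived s-attribute name to the values seen on start tags.

    Assumption: an attribute `title` on element `news` is named `news_title`.
    """
    found = {}
    for line in vertical_text.splitlines():
        match = START_TAG.match(line.strip())
        if match is None:
            continue  # token line or end tag
        element, attr_text = match.group(1), match.group(2)
        found.setdefault(element, [])  # the element itself carries no value
        for name, value in ATTR.findall(attr_text):
            found.setdefault(f"{element}_{name}", []).append(value)
    return found


if __name__ == "__main__":
    for name, values in s_attribute_values(SAMPLE).items():
        print(name, values)
```

Run on the sample, this lists *news_title*, *news_date* and *news_source*
with one value each, so the values do seem to be present in the input text
itself.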


I'm not 100% sure about what I'm doing since it's my first time, so I
may easily have misunderstood something. What am I doing wrong?

Many thanks in advance,
Kurt


On Thu, Jan 24, 2013 at 10:39 PM, Kurt Sultana <kurtanatlus at gmail.com> wrote:

> Hi all,
>
> I have a news corpus which I'd like to put in CQPWeb.
>
> I'm currently representing a news text (in Maltese) like this:
> <text id="1">
> <s>
> L NP
> - PUN
> armi VV
> nxtraw VV
> separatament MV
> minn PRP
> l- DDC
> istess MJ
> kollezzjonista NN
> anonimu NN
> minn PRP
> Texas NP
> . PUN
> </s>
> <s>
> Dan PD
> ifisser VV
> li CMP
> l- DDC
> armi NN
> anke CC
> wara PRP
> li CMP
> nbiegħu VV
> se PAF
> jibqgħu VV
> flimkien MV
> . PUN
> </s>
> </text>
>
> A news text, apart from text, usually contains the title and date of
> publication. How could I include this information in the above, for
> example? Would these take the form of attributes? And could I run queries
> against these new attributes?
>
> Thanks in advance,
> Kurt
>