[CWB] News texts in CQPWeb

Kurt Sultana kurtanatlus at gmail.com
Sun Jan 27 10:21:25 CET 2013


I came across a relevant mailing list post and have now declared
*news:0+title+source+date*, *s* and *p:0+id* as s-attributes. It seems to
be working now. Could anyone confirm I'm doing this right?
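
For anyone following along, here is a minimal Python sketch of how I
understand these declarations to expand into the individual s-attribute
names. The helper name expand_s_attribute is hypothetical (it is not part
of CWB or CQPweb), and it only covers the two forms used in this thread: a
bare element name, and element:0+attr1+attr2 with nesting depth 0.

```python
def expand_s_attribute(spec):
    """Return the s-attribute names implied by one declaration.

    Hypothetical helper for illustration only; it handles 's' and
    'news:0+title+source+date' style declarations (nesting depth 0).
    """
    spec = spec.strip()
    element, colon, rest = spec.partition(":")
    if not colon:
        # Bare element such as 's': a region attribute without values.
        return [element]
    parts = rest.split("+")
    depth = int(parts[0])  # the number after the colon is the nesting depth
    if depth != 0:
        raise ValueError("this sketch only handles nesting depth 0")
    # Each listed XML attribute becomes <element>_<attribute>.
    return [element] + [f"{element}_{name}" for name in parts[1:]]


if __name__ == "__main__":
    for declaration in ["news:0+title+source+date", "s", "p:0+id"]:
        print(declaration, "->", expand_s_attribute(declaration))
```

If that reading is right, news_title, news_source and news_date only carry
values when they come from the combined declaration, which would explain
the "No annotated values" error I got when I listed them separately.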

Thanks,
Kurt


On Sat, Jan 26, 2013 at 7:41 PM, Kurt Sultana <kurtanatlus at gmail.com> wrote:

> Hi,
>
> I've dug around a bit and learned that the attributes I mentioned
> are stored as s-attributes. Here is my example text:
>
> <news title="A Thrilling Experience" date="01/01/2013" source="www.timesofmalta.com">
> <text id="4">
> <p id="1">
> <s>
> Tick    NN    tick
> .    SENT     .
> </s>
> <s>
> A    DT     a
> clock    NN    clock
> .    SENT    .
> </s>
> <s>
> Tick    VB    tick
> ,    ,    ,
> tick    VB    tick
> .    SENT    .
> </s>
> </p>
> </text>
> </news>
>
> As s-attributes (XML elements) I entered *p*, *p_id*, *news*, *news_title*,
> *news_source* and *news_date*. When installing the corpus, I chose to
> install metadata from the XML annotation within the corpus and selected
> *news_title*, *news_source* and *news_date*. However, when I click on
> "Create metadata table from XML using settings above", I get this error:
>
> *Error message*
> **** CQP ERROR ****
> CQP Error:
> No annotated values for s-attribute ``news_title'' in named query c_M_F_xml
>
>
> I'm not 100% sure about what I'm doing since it's my first time, so I
> may well have misunderstood something. What am I doing wrong?
>
> Many thanks in advance,
> Kurt
>
>
>
> On Thu, Jan 24, 2013 at 10:39 PM, Kurt Sultana <kurtanatlus at gmail.com> wrote:
>
>> Hi all,
>>
>> I have a news corpus which I'd like to put in CQPWeb.
>>
>> I'm currently representing a news text (in Maltese) like this:
>> <text id="1">
>> <s>
>> L NP
>> - PUN
>> armi VV
>> nxtraw VV
>> separatament MV
>> minn PRP
>> l- DDC
>> istess MJ
>> kollezzjonista NN
>> anonimu NN
>> minn PRP
>> Texas NP
>> . PUN
>> </s>
>> <s>
>> Dan PD
>> ifisser VV
>> li CMP
>> l- DDC
>> armi NN
>> anke CC
>> wara PRP
>> li CMP
>> nbiegħu VV
>> se PAF
>> jibqgħu VV
>> flimkien MV
>> . PUN
>> </s>
>> </text>
>>
>> Apart from the body text, a news item usually has a title and a date of
>> publication. How could I include this information in the format above?
>> Would these take the form of attributes? And could I run queries
>> against these new attributes?
>>
>> Thanks in advance,
>> Kurt
>>
>
>