[CWB] News texts in CQPWeb
Hardie, Andrew
a.hardie at lancaster.ac.uk
Mon Jan 28 12:31:43 CET 2013
As Martí says, that’s quite right. The CQPweb form currently just passes these declarations straight through to the CWB tools, so the cwb-encode formalism is needed; a more intuitive web user interface will be provided at some point. Sorry for not getting to these mails over the weekend!
best
Andrew.
From: cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it] On Behalf Of Kurt Sultana
Sent: 27 January 2013 09:21
To: Open source development of the Corpus WorkBench
Subject: Re: [CWB] News texts in CQPWeb
I came across an interesting mailing list post and have now declared news:0+title+source+date, s and p:0+id as s-attributes. It seems to be working now. Could anyone confirm I'm doing this right?
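As a rough illustration of the declarations above: in the cwb-encode formalism, a declaration such as news:0+title+source+date names the XML element, a recursion depth, and the XML attributes whose values should be stored, and each listed attribute becomes its own s-attribute named element_attribute. The Python sketch below only mimics that naming and builds a -P/-S argument list; the helper names and the p-attribute names pos and lemma are hypothetical, and it does not run cwb-encode itself.

```python
def expand_s_attribute(spec):
    """Return the s-attribute names one cwb-encode -S declaration should yield.

    "news:0+title+source+date" -> news, news_title, news_source, news_date
    "s"                        -> s
    """
    element, _, rest = spec.partition(":")
    if not rest:
        return [element]
    # rest looks like "0+title+source+date": recursion depth, then XML attributes
    _depth, *xml_attrs = rest.split("+")
    return [element] + [f"{element}_{name}" for name in xml_attrs]


def encode_args(p_attrs, s_specs):
    """Build the -P / -S part of a cwb-encode command line (illustrative only)."""
    args = []
    for p in p_attrs:
        args += ["-P", p]
    for s in s_specs:
        args += ["-S", s]
    return args


# Declarations as given in the message above.
specs = ["news:0+title+source+date", "s", "p:0+id"]
names = [name for spec in specs for name in expand_s_attribute(spec)]

# "pos" and "lemma" are assumed names for the second and third columns.
args = encode_args(["pos", "lemma"], specs)
```

Here names contains news_title, news_source and news_date as value-carrying attributes of the single news declaration, which is the difference from listing news_title and the others as separate, plain s-attributes.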
Thanks,
Kurt
On Sat, Jan 26, 2013 at 7:41 PM, Kurt Sultana <kurtanatlus at gmail.com> wrote:
Hi,
I've dug around a bit and learned that the attributes I mentioned are stored as s-attributes. So, I have this example text:
<news title="A Thrilling Experience" date="01/01/2013" source="www.timesofmalta.com">
<text id="4">
<p id="1">
<s>
Tick NN tick
. SENT .
</s>
<s>
A DT a
clock NN clock
. SENT .
</s>
<s>
Tick VB tick
, , ,
tick VB tick
. SENT .
</s>
</p>
</text>
</news>
As s-attributes (XML elements) I put in p, p_id, news, news_title, news_source and news_date. When installing the corpus, I chose to install metadata from the XML annotation within the corpus and selected news_title, news_source and news_date. However, when I click on "Create metadata table from XML using settings above", I get this error:
Error message
**** CQP ERROR ****
CQP Error:
No annotated values for s-attribute ``news_title'' in named query c_M_F_xml
I'm not 100% confident of what I'm doing since it's my first time, so I might have easily misunderstood something. What am I doing wrong?
Many thanks in advance,
Kurt
On Thu, Jan 24, 2013 at 10:39 PM, Kurt Sultana <kurtanatlus at gmail.com> wrote:
Hi all,
I have a news corpus which I'd like to put in CQPWeb.
I'm currently representing a news text (in Maltese) like this:
<text id="1">
<s>
L NP
- PUN
armi VV
nxtraw VV
separatament MV
minn PRP
l- DDC
istess MJ
kollezzjonista NN
anonimu NN
minn PRP
Texas NP
. PUN
</s>
<s>
Dan PD
ifisser VV
li CMP
l- DDC
armi NN
anke CC
wara PRP
li CMP
nbiegħu VV
se PAF
jibqgħu VV
flimkien MV
. PUN
</s>
</text>
A news text, apart from text, usually contains the title and date of publication. How could I include this information in the above, for example? Would these take the form of attributes? And could I run queries against these new attributes?
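One possible way to carry a title and date, sketched here in Python, is to write them as XML attributes on the opening <text> tag of the verticalised input, with attribute values escaped so the tag stays well-formed. The function names and the sample title and date values below are hypothetical placeholders, and the attributes would still need to be declared to cwb-encode (or the CQPweb form) before they could be queried.

```python
from xml.sax.saxutils import quoteattr


def text_header(text_id, **metadata):
    """Build an opening <text> tag with id plus any extra metadata attributes."""
    attrs = [f"id={quoteattr(str(text_id))}"]
    # quoteattr adds the surrounding quotes and escapes &, <, > and quotes
    attrs += [f"{key}={quoteattr(str(value))}" for key, value in metadata.items()]
    return "<text " + " ".join(attrs) + ">"


def to_vrt(text_id, sentences, **metadata):
    """Render tokenised sentences as one-token-per-line input inside <text>/<s>."""
    lines = [text_header(text_id, **metadata)]
    for sentence in sentences:
        lines.append("<s>")
        # each token is a tuple of column values, e.g. (word, pos)
        lines += ["\t".join(token) for token in sentence]
        lines.append("</s>")
    lines.append("</text>")
    return "\n".join(lines)


# Placeholder metadata; the tokens are taken from the example text above.
vrt = to_vrt(
    1,
    [[("Dan", "PD"), ("ifisser", "VV")]],
    title="Example title",
    date="2013-01-24",
)
```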
Thanks in advance,
Kurt