[CWB] query the s-attributes

Hardie, Andrew a.hardie at lancaster.ac.uk
Sun Sep 30 20:44:59 CEST 2018


Well, this query means "find any token where the text_titol_or attribute equals anything".  So naturally it retrieved the entire corpus, because every token matches that criterion.

If you want a list of text titles, run just this query:

<text> []

And then use the Download Tabulation function to download the "text_titol_or" attribute for each result.
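
For anyone working in command-line CQP rather than CQPweb, roughly the same thing can be sketched with the tabulate command. This is only a sketch under assumptions: MYCORPUS is a hypothetical corpus name, Titles is an arbitrary name for the saved query result, and text_titol_or is assumed to be an s-attribute encoded with annotation values, as it appears to be in this thread.

```
# activate the corpus (MYCORPUS is a placeholder, not the real corpus name)
MYCORPUS;
# match the first token of every <text> region and save the result as Titles
Titles = <text> [];
# print the text_titol_or value at each match position, one row per match
tabulate Titles match text_titol_or;
```

Because <text> [] matches only the first token of each text, the tabulation should contain one row per text (50 rows for the corpus discussed here) rather than one row per token.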

From: cwb-bounces at sslmit.unibo.it <cwb-bounces at sslmit.unibo.it> On Behalf Of "Andrés Chandía"
Sent: 30 September 2018 19:41
To: Open source development of the Corpus WorkBench <cwb at sslmit.unibo.it>
Subject: Re: [CWB] query the s-attributes

Sorry, this time due to the rush I didn't use it, but when I use it I just get:

Your query "a:[] []* :: a.text_titol_or = ".*";" returned 42,689 matches in 50 different texts (in 42,689 words [50 texts]; frequency: 1,000,000.00 instances per million words)

instead of a list of the text_titol_or attributes, of which there are far fewer than 42,689



On Sun, 30 September 2018, 20:33, Hardie, Andrew wrote:
You are using CQPweb "simple query syntax" (i.e. CEQL) when your query actually uses full CQP syntax. You need to switch to CQP query mode.
From: cwb-bounces at sslmit.unibo.it On Behalf Of "Andrés Chandía"
Sent: 30 September 2018 19:31
To: Open source development of the Corpus WorkBench
Subject: Re: [CWB] query the s-attributes
[]* ;

**Error:** only a single ''_'' separator allowed between word form and POS constraint (use ''\_'' to match literal underscore) - when parsing '' []* ; '' - when parsing '' []* ; '' as **phrase_query** - when parsing '' []* ; '' as **default**
a:[] []* :: a.text_text_titol_or = ".*";

**Error:** empty list of alternatives not allowed in wildcard pattern - when parsing '' ] '' as **wildcard_item** - at this location: '' a: [ ] ''** a:[] ''** :: a.text_text_titol_or = ".*"; '' - when parsing '' a:[] []* :: a.text_text_titol_or = ".*"; '' as **phrase_query** - when parsing '' a:[] []* :: a.text_text_titol_or = ".*"; '' as **default**
On Sun, 30 September 2018, 20:25, Hardie, Andrew wrote:
It would help if you could actually tell us what the error message is.
(I have an idea what the error is, but unless I see the message, I can't be sure)
Best
Andrew.
From: cwb-bounces at sslmit.unibo.it On Behalf Of "Andrés Chandía"
Sent: 30 September 2018 19:23
To: Open source development of the Corpus WorkBench
Subject: Re: [CWB] query the s-attributes
In CQPweb I tried all of these with no success (error message):
[]* ;
[]* ;
[]* ;
[]* ;
a:[] []* :: a.text_text_titol_or = ".*";
etc....
and all the variants that are in http://cwb.sourceforge.net/files/CQP_Tutorial/node27.html
On Sun, 30 September 2018, 19:54, Hardie, Andrew wrote:
See:
http://cwb.sourceforge.net/files/CQP_Tutorial/node24.html
http://cwb.sourceforge.net/files/CQP_Tutorial/node26.html
http://cwb.sourceforge.net/files/CWB_Encoding_Tutorial/node7.html
http://cwb.sourceforge.net/files/CWB_Encoding_Tutorial/node8.html
best
Andrew.
From: cwb-bounces at sslmit.unibo.it On Behalf Of "Andrés Chandía"
Sent: 30 September 2018 18:43
To: cwb at sslmit.unibo.it
Subject: [CWB] query the s-attributes
Can the s-attributes be queried? I would say yes, but I cannot find the way.
For instance, the corpus is composed of many texts, each of which has its own title="whatever" in the XML tag. Can I obtain a list of all titles, and if so, how?



_______________________
andrés chandía
<http://www.chandia.net>
Dungupeyem<http://chandia.net/content/dungupeyem> | IECMap<http://chandia.net/content/iecmap> | ISECMap<http://chandia.net/content/isecmap> | NMT<http://chandia.net/content/nmt> | Corlexim<http://corlexim.cl>

administrator of:
Parles.upf<http://parles.upf.edu> | IWCH<https://iwch.upf.edu> | Amind terapia<http://amindterapia.com> | ONG Mapuche koyaktu<http://koyaktumapuche.net> | Nocando<http://parles.upf.edu/llocs/nocando> | IAC<https://iac.upf.edu> | CddZ<https://iac.upf.edu/cddz> | ISAC<https://iac.upf.edu/isac> | CatCg<http://catcg.upf.edu>
Do not print unnecessarily. Protect the environment!