[CWB] Announcement: Another CWB/CQPweb setup in China

Ray Wu liangpingwu at 126.com
Fri Oct 26 07:21:41 CEST 2012


I guess that to remove the confusion for future visitors completely, we may have to reinstall the corpus, provided we have the time.
I remember making a similar mistake months ago, and the best solution turned out to be a delete-and-install process.
Yes, it's easy to accept the default settings (which do not necessarily match our own corpus) when clicking through a lot of buttons.


Best,
Ray


At 2012-10-26 12:39:15,"Xu Jiajin" <ustcxujj at gmail.com> wrote:
Hi JM, Andrew, and Ray,

The only thing I could do for the Corpus Info section of the lefthand menu of CQPweb is the Corpus documentation.

In our Icelandic corpus interface (http://124.193.83.252/cqp/IcePaHC/, ID: test, Pass: test), I added a link to the official site of the Historical Icelandic corpus, http://www.linguist.is/icelandic_treebank/Icelandic_Parsed_Historical_Corpus_%28IcePaHC%29, which provides all the useful information about the corpus, including the tagset used (http://www.linguist.is/icelandic_treebank/Tagset) and the download links. Andrew's trial use of the Q-A tag must have involved the parsed part of the corpus, as the corpus has been both PoS-tagged and parsed.

I hope the information above helps.

Best,

Jiajin

Jiajin XU
Ph.D., associate professor
National Research Centre for Foreign Language Education
Beijing Foreign Studies University
Beijing 100089
China
Email: xujiajin at bfsu.edu.cn




On Fri, Oct 26, 2012 at 2:10 AM, "Andrés Chandía" <andres at chandia.net> wrote:
Hi JM
We've been able to make a query like the one you describe. Our tagset is personalized, so if we want to look for a "Noun followed by an Adjective" we enter the following at the SQL: "_N* _A*" (without the double quotes).

If we want to look for a secondary tag like, let's say, gender, the query is like this: {M} (for masculine)

that means:
query for primary annotation tag = _*
query for secondary annotation tag = {*}

What we haven't been able to do is find the combination to query for a "Noun Masculine", for instance. We have tried many combinations with no success (_N{M}, {N/M}, etc.), so if somebody could help us with this we would appreciate it a lot.
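
A rough sketch of how the two query forms above could map onto CQP syntax, and of one combination that may be worth trying. Everything here is an assumption rather than CQPweb's actual implementation: the attribute names `pos` and `gender` are hypothetical (the primary and secondary annotations are configured per corpus), and the combined form {M}_N* follows the headword-plus-tag pattern of the simple query language as I understand it; it has not been tested on this installation. Case and diacritic flags and the rest of the simple query syntax are ignored.

```python
import re

# Hypothetical attribute names; each CQPweb corpus configures its own
# primary and secondary annotation.
PRIMARY = "pos"
SECONDARY = "gender"

# One token: an optional word form or {secondary}, then an optional _primary.
TOKEN = re.compile(
    r"^(?:\{(?P<sec>[^{}]+)\}|(?P<word>[^_{}\s]+))?(?:_(?P<pri>[^_{}\s]+))?$"
)

REGEX_SPECIALS = set(".[](){}|+^$\\")


def wildcard_to_regex(pattern):
    """Convert the * and ? wildcards to regex syntax, escaping the rest."""
    out = []
    for ch in pattern:
        if ch == "*":
            out.append(".*")
        elif ch == "?":
            out.append(".")
        elif ch in REGEX_SPECIALS:
            out.append("\\" + ch)
        else:
            out.append(ch)
    return "".join(out)


def token_to_cqp(token):
    """Translate one simple-query token into a CQP token expression."""
    match = TOKEN.match(token)
    parts = []
    if match:
        if match.group("word"):
            parts.append('word="%s"' % wildcard_to_regex(match.group("word")))
        if match.group("sec"):
            parts.append('%s="%s"' % (SECONDARY, wildcard_to_regex(match.group("sec"))))
        if match.group("pri"):
            parts.append('%s="%s"' % (PRIMARY, wildcard_to_regex(match.group("pri"))))
    if not parts:
        raise ValueError("unsupported token: %r" % token)
    return "[" + " & ".join(parts) + "]"


def query_to_cqp(query):
    """Translate a whitespace-separated sequence of simple-query tokens."""
    return " ".join(token_to_cqp(tok) for tok in query.split())


if __name__ == "__main__":
    for q in ("_N* _A*", "{M}", "{M}_N*"):
        print(q, "=>", query_to_cqp(q))
```

If the combined simple query is not accepted, the equivalent constraint can be typed directly in CQP syntax mode, using whatever attribute names the corpus actually defines.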

@ch


On Thu, 25 October 2012, 13:04, Josep M. Fontana wrote:

Hi,

I am a little (or quite) confused about the syntax of CQPweb queries (simple query language). I went to the wonderful resource Ray Wu has made available so that I could see how it works since we are in the process of installing CQPweb as an interface for our corpora. I wasn't able to complete any search using the simple query language, though. I'm sure it is something very simple that I am missing. From what I understand reading the document 'simple query language syntax', I should be able to do the following in the simple query mode:

_JJ _NN1

which would supposedly look for sequences of an adjective followed by noun according to the CLAWS tag set.

OK, I'm conducting the searches in the Old Icelandic Corpus, which has supposedly been tagged using the CLAWS7 tagset (according to the information in "View corpus metadata"). When I do this, however, I get a message saying "Your query had no results. There are no matches for your query." This is very puzzling because you would imagine that there would be occurrences of adjectives followed by nouns. Doing it in the opposite order (_NN1 _JJ) gives me the same result. What is even more puzzling is that I also get nothing using single POS labels such as _NN1 by itself or _JJ.

Am I doing something wrong or is this due to the fact that this particular corpus uses a completely different tagset? When you access a CQPweb corpus, is there any way to retrieve the tags that have been used in the corpus? The only relevant info I find in this corpus is the link to the CLAWS7 tagset but, as I said, this doesn't seem to be the right information. Going into the CQP syntax mode and doing "show +pos" doesn't work.
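
For whoever has access to the corpus source files, one way to see which tags a corpus really contains is to count them in the verticalized input (one token per line, tab-separated annotation columns, XML tags on their own lines). The following is only a sketch under that assumption; the sample lines and tag names are made up for illustration, and the column holding the tag depends on how the corpus was encoded.

```python
from collections import Counter


def tag_inventory(vrt_lines, column=1):
    """Count the values found in one annotation column of vertical text.

    Lines starting with '<' are treated as structural tags and skipped,
    so a token that is itself '<' would be missed by this sketch.
    """
    counts = Counter()
    for line in vrt_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("<"):
            continue
        fields = line.split("\t")
        if len(fields) > column:
            counts[fields[column]] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample; a real corpus file would be opened and passed instead.
    sample = [
        "<s>\n",
        "word1\tADJ-N\n",
        "word2\tN-N\n",
        "word3\tN-N\n",
        "</s>\n",
    ]
    for tag, freq in tag_inventory(sample).most_common():
        print(tag, freq)
```

Comparing such a list with the tagset linked from the corpus metadata would show quickly whether the two match.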


JM


Dear members,

We are pleased to announce another CWB/CQPweb setup in China, which we have dubbed BFSU CQPweb. It is closely modelled after Hardie's own (sorry Andrew, we're badly in need of imagination) and currently features more than 20 corpora, including two Brown family cousins (CLOB and Crown) developed at Beijing Foreign Studies University by Dr. Xu Jiajin and Professor Liang Maocheng.

You may access it from http://124.193.83.252/cqp/ using test/test as username/password.

We'd like to take this opportunity to thank the CWB team for their wonderful work and generosity. It is great fun to build our work on their shoulders.


Best,
Ray