[CWB] How to not adjust the XML visualisation display in the CQPWeb concordance KWIC view

Chao Sun chao.sun at sydney.edu.au
Wed Jun 6 08:20:21 CEST 2018


Hi Andrew,

Thanks for the explanation. We have coded that POS tag differently, so the slash is no longer a problem.

* Is there an option for displaying the frequency list of word forms in lower case only?

* We also found that the search syntax {light/V}, which combines lemma and tag, doesn’t work.
Is it any different from {light}_V, which has worked fine so far?

Cheers,
Chao


Dr CHAO SUN | Data Scientist
Faculty of Arts and Social Sciences | The University of Sydney
Rm N302 off Lobby J, Quadrangle A14 | The University of Sydney | NSW | 2006
T +61 2 9351 4240  | F +61 2 9351 5333
E chao.sun at sydney.edu.au | W sydney.edu.au
CRICOS 00026A

On 19 May 2018, at 3:57 am, Hardie, Andrew <a.hardie at lancaster.ac.uk> wrote:

Hi Chao,

Re -- Is there any way to exclude the tag from the query result column, and only display it simply as context?

Alas not. It could be made to do so by using different visualisation filters for cols 1, 2 and 3 (just as different filters can be used for concordance and extended features) but I’m afraid that this would be low-priority on my list of enhancements…

Arguably, the open tag in the middle column should instead be at the end of the left column. Unfortunately, there is no distinction in a CQP concordance between an XML tag before a token that is part of the context vs one that is part of the hit. What I mean is that if you search for (say)

<u> "That"

and again for

"That"

then there would be no way to treat the XML tags adjacent to the results as part of the hit in the former case but as part of the context in the latter, because the CQP concordance would show identical output based on an identical underlying query result representation (same corpus position numbers).
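
To illustrate the point, here is a toy Python model. It is not CQP itself, and the tokens, the positions of the <u> regions and the function names are all made up for the example. It only assumes what is said above: XML tags do not occupy corpus positions of their own, so both queries return the same match ranges.

```python
# Toy model only: hypothetical data, not CQP's real data structures.
tokens = ["That", "is", "fine", "That", "works"]
u_starts = {0, 3}  # corpus positions at which a <u> region opens (assumed)


def query_word(word):
    """Model of the query "That": one (match, matchend) pair per hit."""
    return [(i, i) for i, tok in enumerate(tokens) if tok == word]


def query_tag_then_word(word):
    """Model of the query <u> "That": the word must sit where a <u> opens."""
    return [(i, i) for i, tok in enumerate(tokens)
            if tok == word and i in u_starts]


# The <u> tag adds a constraint but no corpus position, so the two
# result sets are indistinguishable once the query has run.
print(query_word("That") == query_tag_then_word("That"))
```

In this toy corpus every "That" happens to start a <u> region, so the two result lists are equal, and nothing in them records whether the tag was part of the query.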

Re – “I don’t understand why this becomes /IN_that when the corpus is installed from vrt file, and the “that_IN/that” become “that/IN_that”.”

Because “/” is used as the word/tag delimiter character in the CQP concordance – so when splitting up word and tag, CQPweb looks for the final “/” in each word. (Note that the CQP concordance doesn’t distinguish between delimiter “/” and a “/” that comes from the token index).
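
A minimal sketch of that splitting behaviour, written in Python for illustration only. This is not CQPweb's actual code, and the function names are hypothetical. It assumes only what is described above: the concordance joins word and tag with "/", and the split is made at the final "/".

```python
def concordance_token(word, tag):
    """Join word and tag the way the concordance output does (assumed)."""
    return word + "/" + tag


def split_word_tag(token):
    """Split at the final "/" only; everything before it counts as the word."""
    word, sep, tag = token.rpartition("/")
    if not sep:  # no delimiter at all: treat the whole token as the word
        return token, ""
    return word, tag


def display(token):
    """Render the split token as word_tag."""
    word, tag = split_word_tag(token)
    return word + "_" + tag


# A tag without a slash round-trips cleanly.
print(display(concordance_token("light", "NN")))
# A tag containing "/" is split in the wrong place:
# "that/IN/that" is read as word "that/IN" plus tag "that".
print(display(concordance_token("that", "IN/that")))
```

The second call shows how a word "that" with the tag "IN/that" ends up displayed as "that/IN_that".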

The limitations of the CQP concordance format are a bit of a pain in the neck, frankly. We plan to change it to something way more systematic for v4.

best

Andrew.

From: cwb-bounces at sslmit.unibo.it On Behalf Of Chao Sun
Sent: 18 May 2018 01:41
To: CWB at sslmit.unibo.it
Subject: [CWB] How to not adjust the XML visualisation display in the CQPWeb concordance KWIC view


Hello all,

I have a few questions about tweaking the display in the latest CQPweb interface (Ver 3.2.31).

I have defined an XML tag (a speaker tag) to be displayed in the visualisation; it’s not part of the word forms.
However, when the first word after the tag is queried, the tag gets displayed within the query result column, as shown in the screenshot.
Is there any way to exclude the tag from the query result column, and only display it simply as context?

Also, in the 2nd, 4th and 5th rows, the word “that” is not shown correctly; this is due to the POS tag for this word being IN/that.
I don’t understand why this becomes /IN_that when the corpus is installed from vrt file, and the “that_IN/that” become “that/IN_that”.
Is there any specific setup required for this POS tag?

And one last question about the frequency list. The words in the word form list seem to be shown in the form of their first instance in the corpus (or maybe not),
so some of the words have capital letters and others are all lower case. Is there a way to display all words in the frequency list in lower case only?

Best Regards,
Chao
<image001.png>



