<div dir="ltr">No, I didn't get any messages when I used the frequency list controls.<div><br></div><div>I am using CQPwebinabox <span style="color:rgb(0,0,0);font-family:Verdana,Tahoma,"DejaVu Sans",Arial,sans-serif">Esmeralda (</span><span style="color:rgb(0,0,0);font-family:Verdana,Tahoma,"DejaVu Sans",Arial,sans-serif">CQPweb 3.2.11) and CWB 3.4.8 (checked using "cqp -v").</span></div><div><span style="color:rgb(0,0,0);font-family:Verdana,Tahoma,"DejaVu Sans",Arial,sans-serif"><br></span></div><div><font color="#000000" face="Verdana, Tahoma, DejaVu Sans, Arial, sans-serif">Regards,</font></div><div><font color="#000000" face="Verdana, Tahoma, DejaVu Sans, Arial, sans-serif">Lai</font></div></div><div class="gmail_extra"><br><div class="gmail_quote">2018-06-19 23:06 GMT+08:00 Hardie, Andrew <span dir="ltr"><<a href="mailto:a.hardie@lancaster.ac.uk" target="_blank">a.hardie@lancaster.ac.uk</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div lang="EN-GB" link="blue" vlink="purple">
<div class="m_4807034719559244244WordSection1">
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#1f497d">Did you get any odd messages when you ran the frequency-list setup on CQPweb?<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#1f497d">If not – what version of the code do you have?<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#1f497d">best<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#1f497d">Andrew.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Verdana",sans-serif;color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><b><span lang="EN-US">From:</span></b><span lang="EN-US"> <a href="mailto:cwb-bounces@sslmit.unibo.it" target="_blank">cwb-bounces@sslmit.unibo.it</a> [mailto:<a href="mailto:cwb-bounces@sslmit.unibo.it" target="_blank">cwb-bounces@sslmit.<wbr>unibo.it</a>]
<b>On Behalf Of </b>Hermann Lai<br>
<b>Sent:</b> 19 June 2018 11:32<br>
<b>To:</b> Open source development of the Corpus WorkBench <<a href="mailto:cwb@sslmit.unibo.it" target="_blank">cwb@sslmit.unibo.it</a>><br>
<b>Subject:</b> Re: [CWB] Incorrect total words count in a Traditional Chinese corpus on CQPweb<u></u><u></u></span></p><div><div class="h5">
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt">part of the output of "cwb-decode -C CANTON1 -ALL | less"</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt"><s></span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt"><text></span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt"><text_id T01></span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt;font-family:"MS Gothic"">中環</span><span style="font-size:10.5pt"> N </span><span style="font-size:10.5pt;font-family:"MS Gothic"">中環</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt;font-family:"MS Gothic"">保育</span><span style="font-size:10.5pt"> V </span><span style="font-size:10.5pt;font-family:"MS Gothic"">保育</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt;font-family:"MS Gothic"">奇觀</span><span style="font-size:10.5pt"> N </span><span style="font-size:10.5pt;font-family:"MS Gothic"">奇觀</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt;font-family:"MS Gothic"">:</span><span style="font-size:10.5pt"> PU
</span><span style="font-size:10.5pt;font-family:"MS Gothic"">:</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt;font-family:"MS Gothic"">孫中山</span><span style="font-size:10.5pt"> N </span><span style="font-size:10.5pt;font-family:"MS Gothic"">孫中山</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt;font-family:"MS Gothic"">史蹟</span><span style="font-size:10.5pt"> N </span><span style="font-size:10.5pt;font-family:"MS Gothic"">史蹟</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt;font-family:"MS Gothic"">徑</span><span style="font-size:10.5pt"> N </span><span style="font-size:10.5pt;font-family:"MS Gothic"">徑</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt;font-family:"MS Gothic"">至</span><span style="font-size:10.5pt"> CONJ
</span><span style="font-size:10.5pt;font-family:"MS Gothic"">至</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt;font-family:"MS Gothic"">大館</span><span style="font-size:10.5pt"> N </span><span style="font-size:10.5pt;font-family:"MS Gothic"">大館</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt"></text_id></span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt"></text></span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt"></s></span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt"><u></u> <u></u></span></p>
</div>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt">part of the output of "cwb-describe-corpus -s CANTON1"</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">==============================<wbr>==============================<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">Corpus: CANTON1<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">==============================<wbr>==============================<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">description: <u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">registry file: /usr/local/share/cwb/registry/<wbr>canton1<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">home directory: /usr/local/corpora/data/<wbr>canton1/<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">info file: /usr/local/corpora/data/<wbr>canton1/.info<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">size (tokens): 23<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal"> 3 positional attributes<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"> 3 structural attributes<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"> 0 alignment attributes<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">p-ATT word 23 tokens, 22 types<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">p-ATT pos 23 tokens, 8 types<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">p-ATT lemma 23 tokens, 22 types<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">s-ATT s 2 regions<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">s-ATT text 2 regions<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">s-ATT text_id 2 regions (with annotations)<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">It seems that CWB recognizes the correct number of words, but CQPweb does not.<u></u><u></u></p>
</div>
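<div>
<p class="MsoNormal">To compare these figures with what CQPweb displays, the summary can be pulled out of the tool's output with a small script. The following is only a minimal Python sketch, assuming the output lines have the shape shown in the excerpt above; the function name parse_describe_output and the SAMPLE string are hypothetical and are not part of CWB or CQPweb.</p>
</div>

```python
import re

# Excerpt copied from the output quoted above (spacing may differ in real output).
SAMPLE = """\
size (tokens): 23
p-ATT word 23 tokens, 22 types
p-ATT pos 23 tokens, 8 types
p-ATT lemma 23 tokens, 22 types
s-ATT s 2 regions
s-ATT text 2 regions
s-ATT text_id 2 regions (with annotations)
"""


def parse_describe_output(output: str) -> dict:
    """Extract the corpus size and per-attribute counts from the summary text."""
    summary = {"tokens": None, "p_atts": {}, "s_atts": {}}
    for line in output.splitlines():
        line = line.strip()
        m = re.match(r"size \(tokens\):\s*(\d+)", line)
        if m:
            summary["tokens"] = int(m.group(1))
            continue
        m = re.match(r"p-ATT\s+(\S+)\s+(\d+) tokens,\s+(\d+) types", line)
        if m:
            # attribute name -> (token count, type count)
            summary["p_atts"][m.group(1)] = (int(m.group(2)), int(m.group(3)))
            continue
        m = re.match(r"s-ATT\s+(\S+)\s+(\d+) regions", line)
        if m:
            # attribute name -> number of regions
            summary["s_atts"][m.group(1)] = int(m.group(2))
    return summary


if __name__ == "__main__":
    info = parse_describe_output(SAMPLE)
    print("tokens:", info["tokens"], "texts:", info["s_atts"].get("text"))
```

<div>
<p class="MsoNormal">Here the token count (23) and the number of text regions (2) differ, so the index itself does not show the two numbers as equal.</p>
</div>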
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">Regards,<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">Lai<u></u><u></u></p>
</div>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<p class="MsoNormal">2018-06-19 15:43 GMT+08:00 Stefan Evert <<a href="mailto:stefanML@collocations.de" target="_blank">stefanML@collocations.de</a>>:<u></u><u></u></p>
<blockquote style="border:none;border-left:solid #cccccc 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-right:0cm">
<p class="MsoNormal">What does the corpus look like if you decode it from the CWB index with the following command?<br>
<br>
cwb-decode -C CANTON1 -ALL | less<br>
<br>
Can you show us part of the output? It would also be useful to see the output of<br>
<br>
cwb-describe-corpus -s CANTON1<br>
<br>
<br>
One possibility I can think of is that your linebreaks are messed up so that CWB treats everything within the text region as a single long line.
<br>
<br>
Best,<br>
Stefan<br>
<br>
<br>
> On 19 Jun 2018, at 09:26, Hermann Lai <<a href="mailto:halflifelai@gmail.com" target="_blank">halflifelai@gmail.com</a>> wrote:<br>
> <br>
> I am using CQPwebinabox and I have indexed a Traditional Chinese corpus called "canton1" using two commands:<br>
> <br>
> sudo cwb-encode -d /usr/local/corpora/data/<wbr>canton1 -f /home/user/Desktop/corpora/<wbr>canton1/canton1.vrt -R /usr/local/share/cwb/registry/<wbr>canton1 -c utf8 -xsB -P pos -P lemma -S s:0 -S text:0+id<br>
> <br>
> sudo cwb-make -V CANTON1<br>
> <br>
> After that, I installed the corpus on CQPweb. Most things are correct. However, the total number of corpus texts is the same as the total number of words in all corpus texts.<br>
<br>
______________________________<wbr>_________________<br>
CWB mailing list<br>
<a href="mailto:CWB@sslmit.unibo.it" target="_blank">CWB@sslmit.unibo.it</a><br>
<a href="http://liste.sslmit.unibo.it/mailman/listinfo/cwb" target="_blank">http://liste.sslmit.unibo.it/<wbr>mailman/listinfo/cwb</a><u></u><u></u></p>
</blockquote>
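<p class="MsoNormal">The line-break hypothesis in the quoted message can be checked on the input file before encoding. The following is a minimal Python sketch, assuming a UTF-8 file in one-token-per-line format; the function name summarize_vrt and the sample strings are hypothetical. It only counts line-ending styles and shows how many lines a reader that breaks on "\n" alone would see. Whether CWB actually mishandles a given line-ending style is not verified here.</p>

```python
import re


def summarize_vrt(raw: bytes) -> dict:
    """Count line-ending styles and token/tag lines in one-token-per-line input."""
    crlf = raw.count(b"\r\n")
    lf_only = raw.count(b"\n") - crlf   # Unix line endings
    cr_only = raw.count(b"\r") - crlf   # bare carriage returns
    text = raw.decode("utf-8")

    # Split on any of the three styles and ignore blank lines.
    lines = [ln for ln in re.split(r"\r\n|\r|\n", text) if ln.strip()]
    # Simplification: any line starting with "<" is treated as an XML tag.
    tag_lines = sum(1 for ln in lines if ln.startswith("<"))
    token_lines = len(lines) - tag_lines

    # What a reader that only breaks on "\n" would see.
    lines_if_lf_only = len([ln for ln in text.split("\n") if ln.strip()])

    return {
        "lf": lf_only,
        "crlf": crlf,
        "cr": cr_only,
        "tokens": token_lines,
        "tags": tag_lines,
        "lines_if_lf_only": lines_if_lf_only,
    }


# Hypothetical two-token sample in the same layout as the corpus excerpt.
GOOD = '<text id="T01">\n<s>\n中環\tN\t中環\n保育\tV\t保育\n</s>\n</text>\n'.encode("utf-8")
# Same content with bare carriage returns instead of "\n".
BAD = GOOD.replace(b"\n", b"\r")

if __name__ == "__main__":
    print(summarize_vrt(GOOD))
    print(summarize_vrt(BAD))
```

<p class="MsoNormal">If "cr" is non-zero, or "lines_if_lf_only" is much smaller than "tokens" plus "tags", the file's line endings are worth normalizing before running cwb-encode again.</p>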
</div>
<p class="MsoNormal"><br>
<br clear="all">
<u></u><u></u></p>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<p class="MsoNormal">-- <u></u><u></u></p>
<div>
<div>
<div>
<blockquote style="border:none;border-left:solid #cccccc 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-right:0cm">
<p class="MsoNormal"><i><span style="font-size:13.5pt;font-family:"Times New Roman",serif">Gaspard Germannson</span></i><u></u><u></u></p>
</blockquote>
</div>
</div>
</div>
</div>
</div></div></div>
</div>
<br></blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><blockquote style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><i><font face="times new roman, serif" size="4">Gaspard Germannson</font></i></blockquote></div></div></div>
</div>