[CWB] Indexing of metadata problem + no display of query results

Noah Bubenhofer bubenhofer at cl.uzh.ch
Fri Nov 20 14:46:40 CET 2015


In addition to Andrew's answer: I have often had the error complaining about
lines outside <text>. The not-so-obvious reason for it (found only after
checking the XML structure in depth) was the text IDs. Of course, each text
needs a unique ID. But you should know that CQPweb only allows IDs up to a
maximum length of, I think, about 25 characters, and I sometimes had longer
ones. CQPweb then simply truncates the IDs, and as a result you may end up
with non-unique IDs in your data. CQP does not complain about that, but
CQPweb does.

Perhaps this is also the case in your data.
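
A quick way to check for this is to scan the input file for text IDs that
become identical once truncated. The following is only a minimal sketch: the
limit of 25 characters is my rough recollection rather than a documented
value, the function name is made up for illustration, and it assumes that
each <text ...> start tag sits on its own line with an attribute written as
id="...".

```python
import re
from collections import defaultdict

# Assumed limit (not verified); check the actual value for your CQPweb version.
MAX_ID_LEN = 25

# Matches a <text ...> start tag and captures the value of its id attribute.
TEXT_ID_RE = re.compile(r'<text\b[^>]*\bid="([^"]*)"')


def find_truncation_collisions(lines, max_len=MAX_ID_LEN):
    """Return {truncated_id: [original ids]} for ids that collide after truncation.

    Only distinct original ids that collapse onto the same truncated id are
    reported; exact duplicates of one and the same id are not.
    """
    groups = defaultdict(set)
    for line in lines:
        match = TEXT_ID_RE.match(line.strip())
        if match:
            full_id = match.group(1)
            groups[full_id[:max_len]].add(full_id)
    return {short: sorted(ids) for short, ids in groups.items() if len(ids) > 1}
```

You can pass it an open file object, for example the verticalized input file
you fed to the encoder, and print the result. An empty dictionary means that
no two IDs collide at the assumed length.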

Noah



On 20.11.15 at 13:10, Emmanuel CARTIER wrote:
> Hi,
> 
> I am currently working with the latest version of CWB and with CQPweb
> version 3.0.16.
> I managed to index big corpora (from 100 to 500 GB) on the command line
> and install the corpora on CQPweb.
> 
> I have two problems:
> A.
> When I launch offline-freqlists.php (php
> ../bin/offline-freqlists.php <corpus name in lowercase>) to generate the
> metadata indexes, it produces the following error:
> </pre>
> <p class="errormessage">CQPweb encountered an error and could not
> continue.</p>
> 
> <p class="errormessage">Unexpected line outside &lt;text&gt; tags while
> creating corpus
>                     POLOGNE_2015__FREQ! -- creation aborted</p>
> 
> <p class="errormessage">... in file
> <b>/var/www/CQPweb/lib/freqtable-cwb.inc.php</b> line <b>177</b>.</p>
> 
> Afterwards, I am still able to query the corpus, but could you give
> me some hints on how to debug this?
> 
> B. When querying my corpus (POS-tagged with TreeTagger, then
> post-processed to turn the <unknown> lemma into "unknown") with the
> CQP query [lemma="unknown"], the web interface always ends with a blank
> page. But when I use the command-line cqp utility, it outputs the
> results normally. Can you give me some hints on that?
> 
> Thanks a lot for your work and help,
> 
> Emmanuel
> 
> 

-- 
Universität Zürich
Institut für Computerlinguistik
Project "Visual Linguistics"
Binzmühlestr. 14
CH-8050 Zürich

www.bubenhofer.com
www.visual-linguistics.net
bubenhofer at cl.uzh.ch (PGP key available)
Tel. +41 44 635 67 18
Office 2.A.14

