[CWB] Indexing recursively

Thomas Zastrow thomas.zastrow at rzg.mpg.de
Fri Aug 25 10:38:22 CEST 2017


Dear all,

I have a problem with a CQP-indexed corpus I created from the German
Wikipedia. Everything looks fine: the "data" folder is about 10 GB and
the indexing process reported no errors. When I start the CQP command
line, I can activate the corpus:

------------------------------------------------------------------
cqp -r /data/wp/2017/data/cqp/registry
[no corpus]> WIKIPEDIA;
------------------------------------------------------------------

The prompt now shows "WIKIPEDIA". Displaying the corpus info also
works, at least partially:

------------------------------------------------------------------
WIKIPEDIA> info WIKIPEDIA;
Warning:
     Can't open info file /data/wp/2017/data/cqp/data/.info for reading
Size:    782308286
Charset: latin1
Properties:
         language = '??'
         charset = 'latin1'
------------------------------------------------------------------

The context descriptor also looks good:

------------------------------------------------------------------
show cd;
===Context Descriptor=======================================

left context:     25 characters
right context:    25 characters
corpus position:  shown
target anchors:   not shown

Positional Attributes:  * word
                           pos
                           lemma

Structural Attributes:    s

Aligned Corpora:          <none>

============================================================
------------------------------------------------------------------

But unfortunately, searching for anything doesn't work at all:

------------------------------------------------------------------
WIKIPEDIA> "der";
0 matches.
------------------------------------------------------------------
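One check that might narrow this down is whether the index files for the
"word" attribute are all present and non-empty in the data directory named
in the warning above. The following is only a minimal Python sketch: the
list of file suffixes is an assumption based on the usual CWB naming for an
uncompressed positional attribute (it is not taken from this thread), and
the helper name missing_or_empty is hypothetical. If the attribute has been
compressed, some of these files are expected to be absent.

```python
from pathlib import Path

# Assumption: suffixes an uncompressed CWB positional attribute usually has
# after encoding and index building; adjust if your setup differs.
P_ATTR_SUFFIXES = (
    ".corpus", ".lexicon", ".lexicon.idx", ".lexicon.srt",
    ".corpus.cnt", ".corpus.rev", ".corpus.rdx",
)


def missing_or_empty(data_dir, attribute="word"):
    """Return the expected file names that are absent or have size zero."""
    base = Path(data_dir)
    problems = []
    for suffix in P_ATTR_SUFFIXES:
        path = base / (attribute + suffix)
        if not path.is_file() or path.stat().st_size == 0:
            problems.append(path.name)
    return problems


if __name__ == "__main__":
    # Data directory as it appears in the "info" warning.
    for name in missing_or_empty("/data/wp/2017/data/cqp/data"):
        print("missing or empty:", name)
```

An empty result would only mean the files exist and are non-empty; it says
nothing about whether their contents are consistent.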

I'd be grateful for any help ;-)

Thanks,

Tom


-- 
Dr. Thomas Zastrow
Max Planck Computing and Data Facility (MPCDF)
Gießenbachstr. 2, D-85748 Garching bei München, Germany
Tel +49-89-3299-1457
http://www.mpcdf.de


