[CWB] Indexing recursively
Thomas Zastrow
thomas.zastrow at rzg.mpg.de
Fri Aug 25 10:38:22 CEST 2017
Dear all,
I have a problem with a CQP-indexed corpus that I created from the German
Wikipedia. Everything looks fine: the "data" folder is about 10 GB, and
the indexing process reported no errors. When I start the CQP command
line, I can activate the corpus:
------------------------------------------------------------------
cqp -r /data/wp/2017/data/cqp/registry
[no corpus]> WIKIPEDIA;
------------------------------------------------------------------
The prompt now shows "WIKIPEDIA". Displaying the corpus info also works,
at least partially:
------------------------------------------------------------------
WIKIPEDIA> info WIKIPEDIA;
Warning:
Can't open info file /data/wp/2017/data/cqp/data/.info for reading
Size: 782308286
Charset: latin1
Properties:
language = '??'
charset = 'latin1'
------------------------------------------------------------------
The context descriptor also looks good:
------------------------------------------------------------------
show cd;
===Context Descriptor=======================================
left context: 25 characters
right context: 25 characters
corpus position: shown
target anchors: not shown
Positional Attributes: * word
                         pos
                         lemma
Structural Attributes: s
Aligned Corpora: <none>
============================================================
------------------------------------------------------------------
Unfortunately, searching for anything does not work at all:
------------------------------------------------------------------
WIKIPEDIA> "der";
0 matches.
------------------------------------------------------------------
I'd be grateful for any help ;-)
Thanks,
Tom
--
Dr. Thomas Zastrow
Max Planck Computing and Data Facility (MPCDF)
Gießenbachstr. 2, D-85748 Garching bei München, Germany
Tel +49-89-3299-1457
http://www.mpcdf.de