[CWB] Indexing recursively
Jörg Knappen
j.knappen at mx.uni-saarland.de
Fri Aug 25 14:55:55 CEST 2017
Hello Thomas,
we have seen this problem before: it occurs when you forget to run
cwb-makeall
The symptom is that you can search for [word=".*"] (with many hits, as
expected), but a search with [word="d.*"] returns no hits at all.
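A minimal sketch of that step, assuming the registry path and corpus name from your session quoted below (adjust both to your setup). The -r option points cwb-makeall at the registry directory, and -V asks it to validate the index files it builds:

```shell
# Assumed values, taken from the quoted CQP session; change as needed.
REGISTRY=/data/wp/2017/data/cqp/registry
CORPUS=WIKIPEDIA

# Build the lexicon sort index and reverse index for all attributes.
# Without these files, regular-expression and literal lookups find nothing.
if command -v cwb-makeall >/dev/null 2>&1; then
    cwb-makeall -r "$REGISTRY" -V "$CORPUS"
else
    echo "cwb-makeall not found in PATH" >&2
fi
```

Afterwards, a query such as "der"; in CQP should return matches.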
Best regards from Saarbrücken,
Jörg
Quoting Thomas Zastrow <thomas.zastrow at rzg.mpg.de>:
> Dear all,
>
> I have a problem with a CQP indexed corpus I created from German
> Wikipedia. Everything looks fine, the "data" folder is about 10 GB
> and the indexing process showed no errors. When I go into the CQP
> cmd, I can activate the corpus:
>
> ------------------------------------------------------------------
> cqp -r /data/wp/2017/data/cqp/registry
> [no corpus]> WIKIPEDIA;
> ------------------------------------------------------------------
>
> The prompt now shows "WIKIPEDIA". Showing the corpus info also works,
> at least partially:
>
> ------------------------------------------------------------------
> WIKIPEDIA> info WIKIPEDIA;
> Warning:
> Can't open info file /data/wp/2017/data/cqp/data/.info for reading
> Size: 782308286
> Charset: latin1
> Properties:
> language = '??'
> charset = 'latin1'
> ------------------------------------------------------------------
>
> The context descriptor also looks good:
>
> ------------------------------------------------------------------
> show cd;
> ===Context Descriptor=======================================
>
> left context: 25 characters
> right context: 25 characters
> corpus position: shown
> target anchors: not shown
>
> Positional Attributes: * word
> pos
> lemma
>
> Structural Attributes: s
>
> Aligned Corpora: <none>
>
> ============================================================
> ------------------------------------------------------------------
>
> But unfortunately, searching for anything doesn't work at all:
>
> ------------------------------------------------------------------
> WIKIPEDIA> "der";
> 0 matches.
> ------------------------------------------------------------------
>
> I'd be grateful for any help ;-)
>
> Thanks,
>
> Tom
>
>
> --
> Dr. Thomas Zastrow
> Max Planck Computing and Data Facility (MPCDF)
> Gießenbachstr. 2, D-85748 Garching bei München, Germany
> Tel +49-89-3299-1457
> http://www.mpcdf.de
>
> _______________________________________________
> CWB mailing list
> CWB at sslmit.unibo.it
> http://liste.sslmit.unibo.it/mailman/listinfo/cwb