[CWB] Indexing recursively

Jörg Knappen j.knappen at mx.uni-saarland.de
Fri Aug 25 14:55:55 CEST 2017


Hello Thomas,

we have had this problem before; it occurs when you forget to run

cwb-makeall

The symptom is that you can search for [word=".*"] (with many hits, as
expected), but you get no hits at all for a search such as [word="d.*"].
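
For the corpus in your message, the call would look roughly like the sketch below. The registry path and the corpus name are copied from your CQP session; I am assuming the corpus is registered under the name WIKIPEDIA in that directory. The -r option selects the registry directory and -V validates the index files after they are built. The check for the command is only there so the snippet fails gracefully on a machine without CWB.

```shell
# Registry directory and corpus name as they appear in the quoted CQP session
# (assumed to match the actual registry entry).
REGISTRY=/data/wp/2017/data/cqp/registry
CORPUS=WIKIPEDIA

if command -v cwb-makeall >/dev/null 2>&1; then
    # Build the lexicon sort index and reverse index for all
    # positional attributes, then validate them (-V).
    cwb-makeall -r "$REGISTRY" -V "$CORPUS"
else
    echo "cwb-makeall not found in PATH" >&2
fi
```

Afterwards a query such as "der"; should return hits in CQP.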

Best regards from Saarbrücken,

Jörg

Quoting Thomas Zastrow <thomas.zastrow at rzg.mpg.de>:

> Dear all,
>
> I have a problem with a CQP indexed corpus I created from German  
> Wikipedia. Everything looks fine, the "data" folder is about 10 GB  
> and the indexing process showed no errors. When I go into the CQP  
> cmd, I can activate the corpus:
>
> ------------------------------------------------------------------
> cqp -r /data/wp/2017/data/cqp/registry
> [no corpus]> WIKIPEDIA;
> ------------------------------------------------------------------
>
> The prompt now shows "WIKIPEDIA". Showing the corpus info also works
> (partially?) fine:
>
> ------------------------------------------------------------------
> WIKIPEDIA> info WIKIPEDIA;
> Warning:
>     Can't open info file /data/wp/2017/data/cqp/data/.info for reading
> Size:    782308286
> Charset: latin1
> Properties:
>         language = '??'
>         charset = 'latin1'
> ------------------------------------------------------------------
>
> The context descriptor also looks good:
>
> ------------------------------------------------------------------
> show cd;
> ===Context Descriptor=======================================
>
> left context:     25 characters
> right context:    25 characters
> corpus position:  shown
> target anchors:   not shown
>
> Positional Attributes:  * word
>                           pos
>                           lemma
>
> Structural Attributes:    s
>
> Aligned Corpora:          <none>
>
> ============================================================
> ------------------------------------------------------------------
>
> But unfortunately, searching for anything doesn't work at all:
>
> ------------------------------------------------------------------
> WIKIPEDIA> "der";
> 0 matches.
> ------------------------------------------------------------------
>
> I'd be glad of any help ;-)
>
> Thanks,
>
> Tom
>
>
> -- 
> Dr. Thomas Zastrow
> Max Planck Computing and Data Facility (MPCDF)
> Gießenbachstr. 2, D-85748 Garching bei München, Germany
> Tel +49-89-3299-1457
> http://www.mpcdf.de
>
> _______________________________________________
> CWB mailing list
> CWB at sslmit.unibo.it
> http://liste.sslmit.unibo.it/mailman/listinfo/cwb




