[CWB] Indexing recursively

Thomas Zastrow thomas.zastrow at rzg.mpg.de
Fri Aug 25 17:54:48 CEST 2017


Hello Jörg,

Thanks a lot, indeed, that was the problem ;-)
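
For anyone who finds this thread with the same symptom, the fix can be sketched roughly as follows. The registry path and corpus name are the ones from my original message; that cwb-makeall is on the PATH, and that -V (validate the created index files) is wanted, are assumptions about the local setup.

```shell
# Minimal sketch: build the missing index files for all attributes.
# REGISTRY and CORPUS are taken from the session quoted below.
REGISTRY=/data/wp/2017/data/cqp/registry
CORPUS=WIKIPEDIA

# Assumption: the CWB command-line tools are installed and on the PATH.
if command -v cwb-makeall >/dev/null 2>&1; then
    # -r sets the registry directory, -V validates the generated files
    cwb-makeall -r "$REGISTRY" -V "$CORPUS"
else
    echo "cwb-makeall not found in PATH" >&2
fi
```

Once this has run without errors, a query such as "der"; in CQP would be expected to return hits.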

Greetings to you and all the colleagues in Saarbrücken!

Tom



On 25.08.2017 at 14:55, Jörg Knappen wrote:
> Hello Thomas,
>
> we had this problem before; it occurs when you forget to run
>
> cwb-makeall
>
> The symptom is that you can search for [word=".*"] (with many hits, as 
> expected), but you get no hits at all for a search with [word="d.*"].
>
> Best regards from Saarbrücken,
>
> Jörg
>
> Quoting Thomas Zastrow <thomas.zastrow at rzg.mpg.de>:
>
>> Dear all,
>>
>> I have a problem with a CQP-indexed corpus I created from German 
>> Wikipedia. Everything looks fine: the "data" folder is about 10 GB 
>> and the indexing process showed no errors. When I start the CQP 
>> command line, I can activate the corpus:
>>
>> ------------------------------------------------------------------
>> cqp -r /data/wp/2017/data/cqp/registry
>> [no corpus]> WIKIPEDIA;
>> ------------------------------------------------------------------
>>
>> The prompt now shows "WIKIPEDIA". Showing the corpus info also works - 
>> partially? - fine:
>>
>> ------------------------------------------------------------------
>> WIKIPEDIA> info WIKIPEDIA;
>> Warning:
>>     Can't open info file /data/wp/2017/data/cqp/data/.info for reading
>> Size:    782308286
>> Charset: latin1
>> Properties:
>>         language = '??'
>>         charset = 'latin1'
>> ------------------------------------------------------------------
>>
>> The context descriptor also looks good:
>>
>> ------------------------------------------------------------------
>> show cd;
>> ===Context Descriptor=======================================
>>
>> left context:     25 characters
>> right context:    25 characters
>> corpus position:  shown
>> target anchors:   not shown
>>
>> Positional Attributes:  * word
>>                           pos
>>                           lemma
>>
>> Structural Attributes:    s
>>
>> Aligned Corpora:          <none>
>>
>> ============================================================
>> ------------------------------------------------------------------
>>
>> But unfortunately, searching for anything doesn't work at all:
>>
>> ------------------------------------------------------------------
>> WIKIPEDIA> "der";
>> 0 matches.
>> ------------------------------------------------------------------
>>
>> I'd be grateful for any help ;-)
>>
>> Thanks,
>>
>> Tom
>>
>>
>> -- 
>> Dr. Thomas Zastrow
>> Max Planck Computing and Data Facility (MPCDF)
>> Gießenbachstr. 2, D-85748 Garching bei München, Germany
>> Tel +49-89-3299-1457
>> http://www.mpcdf.de
>>
>> _______________________________________________
>> CWB mailing list
>> CWB at sslmit.unibo.it
>> http://liste.sslmit.unibo.it/mailman/listinfo/cwb


