[CWB] Indexing recursively
Thomas Zastrow
thomas.zastrow at rzg.mpg.de
Fri Aug 25 17:54:48 CEST 2017
Hello Jörg,
Thanks a lot, indeed, that was the problem ;-)
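
For anyone who finds this thread later, here is a rough sketch of the check and the fix, not a definitive recipe. It assumes the registry path and corpus name from my original message, that the data directory is the one named in the .info warning, and that the index files use the standard uncompressed CWB names. The helper missing_indexes is a made-up name for illustration only.

```shell
#!/bin/sh
# Sketch: list the index files that cwb-makeall has not yet created.
# Assumption: standard, uncompressed CWB file names in the data directory.

missing_indexes() {
    # $1 = corpus data directory, remaining arguments = p-attribute names
    dir=$1
    shift
    for att in "$@"; do
        for ext in lexicon.srt corpus.cnt corpus.rev corpus.rdx; do
            [ -f "$dir/$att.$ext" ] || echo "$att.$ext"
        done
    done
}

# Registry path and corpus name from the original message;
# data directory inferred from the .info warning (assumption).
REGISTRY=/data/wp/2017/data/cqp/registry
DATA=/data/wp/2017/data/cqp/data
CORPUS=WIKIPEDIA

if [ -d "$DATA" ] && [ -n "$(missing_indexes "$DATA" word pos lemma)" ]; then
    # Build and validate the index files for all positional attributes
    cwb-makeall -r "$REGISTRY" -V "$CORPUS"
fi
```

If the helper prints nothing for word, pos and lemma, the indexes are already in place and the cause of empty results probably lies elsewhere.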
Greetings to you and all the colleagues in Saarbrücken!
Tom
On 25.08.2017 at 14:55, Jörg Knappen wrote:
> Hello Thomas,
>
> we had this problem before; it occurs when you forget to run
>
> cwb-makeall
>
> The symptom is that you can search for [word=".*"] (with many hits, as
> expected), but you get no hits at all for a search with [word="d.*"].
>
> Best regards from Saarbrücken,
>
> Jörg
>
> Quoting Thomas Zastrow <thomas.zastrow at rzg.mpg.de>:
>
>> Dear all,
>>
>> I have a problem with a CQP-indexed corpus I created from the German
>> Wikipedia. Everything looks fine: the "data" folder is about 10 GB
>> and the indexing process showed no errors. When I start the CQP
>> command line, I can activate the corpus:
>>
>> ------------------------------------------------------------------
>> cqp -r /data/wp/2017/data/cqp/registry
>> [no corpus]> WIKIPEDIA;
>> ------------------------------------------------------------------
>>
>> The prompt now shows "WIKIPEDIA". Displaying the corpus info also
>> works, at least partially:
>>
>> ------------------------------------------------------------------
>> WIKIPEDIA> info WIKIPEDIA;
>> Warning:
>> Can't open info file /data/wp/2017/data/cqp/data/.info for reading
>> Size: 782308286
>> Charset: latin1
>> Properties:
>> language = '??'
>> charset = 'latin1'
>> ------------------------------------------------------------------
>>
>> The context descriptor also looks good:
>>
>> ------------------------------------------------------------------
>> show cd;
>> ===Context Descriptor=======================================
>>
>> left context: 25 characters
>> right context: 25 characters
>> corpus position: shown
>> target anchors: not shown
>>
>> Positional Attributes: * word
>>                          pos
>>                          lemma
>>
>> Structural Attributes: s
>>
>> Aligned Corpora: <none>
>>
>> ============================================================
>> ------------------------------------------------------------------
>>
>> Unfortunately, searching for anything doesn't work at all:
>>
>> ------------------------------------------------------------------
>> WIKIPEDIA> "der";
>> 0 matches.
>> ------------------------------------------------------------------
>>
>> I'd be glad of any help ;-)
>>
>> Thanks,
>>
>> Tom
>>
>>
>> --
>> Dr. Thomas Zastrow
>> Max Planck Computing and Data Facility (MPCDF)
>> Gießenbachstr. 2, D-85748 Garching bei München, Germany
>> Tel +49-89-3299-1457
>> http://www.mpcdf.de
>>
>> _______________________________________________
>> CWB mailing list
>> CWB at sslmit.unibo.it
>> http://liste.sslmit.unibo.it/mailman/listinfo/cwb