[CWB] Zero matches in BNC

Aleksandar Trklja aleksandar.trklja at univie.ac.at
Thu May 9 10:29:17 CEST 2019


Hi Andrew,

Thank you so much for your clarification.

"/home/corp/tma/" is actually a directory. It seems this problem arose
because I tried to encode into an already existing directory. After I
created a new directory, that error message no longer appeared.

But now I get the following error message at the end of the encoding
process:

...
Building indices and compressing data ...
      3 <list> regions dropped because of deep nesting.
     14 <item> regions dropped because of deep nesting.
      8 <hi> regions dropped because of deep nesting.
      3 <p> regions dropped because of deep nesting.
/usr/local/share/cwb/registry/bnc: Permission denied
Can't create registry entry in file /usr/local/share/cwb/registry/bnc!
[location of error: input line #81145224]
CWB::Encoder: Error in cwb-encode pipe (). 
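
The "Permission denied" line suggests that the account running the encoder may not be allowed to write to /usr/local/share/cwb/registry. A minimal Python sketch for checking this before re-running the encoding is given here; the helper name can_create_entry is hypothetical (it is not part of CWB), and the registry path and corpus ID are simply the ones from the error message above.

```python
import os


def can_create_entry(registry_dir: str, corpus_id: str) -> bool:
    """Return True if the current user could create or overwrite
    the registry file registry_dir/corpus_id (hypothetical helper)."""
    entry = os.path.join(registry_dir, corpus_id)
    if os.path.exists(entry):
        # An existing registry file must itself be writable to be overwritten.
        return os.access(entry, os.W_OK)
    # Creating a new file needs write and search permission on the directory.
    return os.path.isdir(registry_dir) and os.access(
        registry_dir, os.W_OK | os.X_OK
    )


if __name__ == "__main__":
    # Path and corpus ID taken from the error message; adjust as needed.
    print(can_create_entry("/usr/local/share/cwb/registry", "bnc"))
```

If this prints False, the encoding would probably need to be run by a user with write access to that directory, or pointed at a registry directory the current user owns.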

A

On 03.05.2019 20:21, Hardie, Andrew wrote:

> Hi Aleks,
> 
> The language variable is not really relevant. It not being set means
> nothing. The size would seem to be wrong, though, as 26 million is
> nowhere near enough. Something may have gone wrong in the encoding
> process at that point that has left the lexicon and/or the index
> unfinished (thus the search failure).
> 
> Also, is your BNC data directory actually /home/corp/tma/ or is it a
> subdirectory of that? The latter would indicate something amiss if CQP
> is looking for the .info file (which usually doesn't exist) in the
> parent directory. You might check what paths are given in the registry
> file, perhaps.
> 
> Hope that helps
> 
> best
> 
> Andrew.
> 
> FROM: cwb-bounces at sslmit.unibo.it <cwb-bounces at sslmit.unibo.it> ON
> BEHALF OF Aleksandar Trklja
> SENT: 03 May 2019 15:21
> TO: cwb at sslmit.unibo.it
> SUBJECT: [CWB] Zero matches in BNC
> IMPORTANCE: High
> 
> Dear all,
> 
> I've re-encoded the BNC with 'EncodeBNC.perl' and 'cqp' now returns zero
> matches. Both positional and structural attributes appear to have been
> encoded properly (see below), but the language variable does not seem
> to have been assigned. This is what 'info' shows:
> 
> BNC> INFO
> Warning:
> Can't open info file /home/corp/tma/.info for reading
> Size:    26142145
> Charset: latin1
> Properties:
> language = '??'
> charset = 'latin1'
> 
> BNC> "THE"
> 0 matches.
> 
> BNC> SHOW CD
> 
> ===Context Descriptor=======================================
> 
> left context:     25 characters
> right context:    25 characters
> corpus position:  shown
> target anchors:   not shown
> 
> Positional Attributes:  * word
>                           pos
>                           lemma
>                           hw
>                           class
>                           type
>                           flags_before
>                           space_after
>                           offset
> 
> Structural Attributes:    text
>                           text_id              [A]
>                           text_title           [A]
>                           text_n_words         [A]
>                           text_n_tokens        [A]
>                           text_n_w             [A]
>                           text_n_c             [A]
>                           text_n_s             [A]
>                           text_publication_date [A]
>                           text_text_type       [A]
>                           text_context         [A]
>                           text_respondent_age  [A]
>                           text_respondent_class [A]
>                           text_respondent_sex  [A]
>                           text_interaction_type [A]
>                           text_region          [A]
>                           text_author_age      [A]
>                           text_author_domicile [A]
>                           text_author_sex      [A]
>                           text_author_type     [A]
>                           text_audience_age    [A]
>                           text_domain          [A]
>                           text_difficulty      [A]
>                           text_medium          [A]
> ...
> Any suggestions? Thank you.
> 
> Best
> 
> Aleks
> 
> --
> 
> _Dr Aleksandar Trklja_
> _Senior Lecturer_
> _Department of Translation Studies_
> 
> _University of Vienna_
