[CWB] Zero matches in BNC

Hardie, Andrew a.hardie at lancaster.ac.uk
Thu May 9 11:58:07 CEST 2019


This is the key error:

/usr/local/share/cwb/registry/bnc: Permission denied

The username you are running under doesn’t have permission to create a file in that folder. Perhaps you need to change the permissions, or else specify a different registry.
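
A quick way to confirm this before re-running the whole encoding is to test whether the current user can write to the registry location. The following is only a minimal sketch in Python, not part of CWB: the function name `can_create_registry_entry` is hypothetical, and the registry path and corpus ID are simply the ones from the error message above.

```python
import os


def can_create_registry_entry(registry_dir: str, corpus_id: str) -> bool:
    """Return True if the current user could create or overwrite
    the file registry_dir/corpus_id (hypothetical helper, not a CWB API)."""
    target = os.path.join(registry_dir, corpus_id)
    if os.path.exists(target):
        # An existing registry file must itself be writable.
        return os.access(target, os.W_OK)
    # Creating a new file needs write and search permission on the directory.
    return os.path.isdir(registry_dir) and os.access(registry_dir, os.W_OK | os.X_OK)


if __name__ == "__main__":
    # Path and corpus ID taken from the error message in this thread.
    registry = "/usr/local/share/cwb/registry"
    if can_create_registry_entry(registry, "bnc"):
        print("registry entry can be written")
    else:
        print("no write permission: fix the permissions or use another registry")
```

If this reports no write permission, the encoding will presumably fail at the same final step again, so it is worth sorting out before starting another long run.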

best

Andrew.

From: Aleksandar Trklja <aleksandar.trklja at univie.ac.at>
Sent: 09 May 2019 09:29
To: Open source development of the Corpus WorkBench <cwb at sslmit.unibo.it>
Cc: Hardie, Andrew <a.hardie at lancaster.ac.uk>
Subject: Re: [CWB] Zero matches in BNC

Hi Andrew,

Thank you so much for your clarification.

"/home/corp/tma/" is actually a directory. It seems this problem arose because I tried to encode into an already existing directory. After I created a new directory, that error message no longer appeared.

But now I get the following error message at the end of the encoding process:

...
Building indices and compressing data ...
      3 <list> regions dropped because of deep nesting.
     14 <item> regions dropped because of deep nesting.
      8 <hi> regions dropped because of deep nesting.
      3 <p> regions dropped because of deep nesting.
/usr/local/share/cwb/registry/bnc: Permission denied
Can't create registry entry in file /usr/local/share/cwb/registry/bnc!
[location of error: input line #81145224]
CWB::Encoder: Error in cwb-encode pipe ().


A

On 03.05.2019 20:21, Hardie, Andrew wrote:
Hi Aleks,

The language variable is not really relevant. It not being set means
nothing. The size would seem to be wrong, though, as 26 million is
nowhere near enough. Something may have gone wrong in the encoding
process at that point that has left the lexicon and/or the index
unfinished (thus the search failure).

Also, is your BNC data directory actually /home/corp/tma/ or is it a
subdirectory of that? The latter would indicate something amiss if CQP
is looking for the .info file (which usually doesn’t exist) in the
parent directory. You might check what paths are given in the registry
file, perhaps.
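
The registry check suggested above can be sketched in Python. This is an illustration under assumptions, not a CWB tool: it assumes the registry file declares the data directory on a `HOME` line and the info file on an `INFO` line, the function names are hypothetical, and the sample paths in the usage part are made up for the example.

```python
import os


def registry_paths(registry_text: str) -> dict:
    """Collect the HOME and INFO paths from the text of a registry file
    (assumed format: keyword, whitespace, path; '#' starts a comment line)."""
    paths = {}
    for line in registry_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] in ("HOME", "INFO"):
            paths[parts[0]] = parts[1].strip().strip('"')
    return paths


def info_in_data_dir(paths: dict) -> bool:
    """Return True if the INFO file sits directly inside the HOME directory."""
    home, info = paths.get("HOME"), paths.get("INFO")
    if home is None or info is None:
        return False
    return os.path.dirname(os.path.normpath(info)) == os.path.normpath(home)


if __name__ == "__main__":
    # Hypothetical registry content, for illustration only.
    sample = "ID bnc\nHOME /home/corp/tma/bnc\nINFO /home/corp/tma/.info\n"
    found = registry_paths(sample)
    print(found)
    print("INFO inside HOME:", info_in_data_dir(found))
```

A mismatch between the two paths would be the sign of "something amiss" described above; matching paths would point back to the encoding run itself.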

Hope that helps

best

Andrew.

From: cwb-bounces at sslmit.unibo.it <cwb-bounces at sslmit.unibo.it> On Behalf Of Aleksandar Trklja
Sent: 03 May 2019 15:21
To: cwb at sslmit.unibo.it
Subject: [CWB] Zero matches in BNC
Importance: High

Dear all,

I've re-encoded the BNC with 'EncodeBNC.perl', and 'cqp' now returns zero
matches. Both the positional and the structural attributes appear to have
been encoded properly (see below), but the language variable does not seem
to have been assigned. This is what 'info' shows:

BNC> INFO
Warning:
    Can't open info file /home/corp/tma/.info for reading
Size:    26142145
Charset: latin1
Properties:
        language = '??'
        charset = 'latin1'

BNC> "THE"
0 matches.

BNC> SHOW CD

===Context Descriptor=======================================

left context:     25 characters
right context:    25 characters
corpus position:  shown
target anchors:   not shown

Positional Attributes:  * word
                          pos
                          lemma
                          hw
                          class
                          type
                          flags_before
                          space_after
                          offset

Structural Attributes:    text
                          text_id              [A]
                          text_title           [A]
                          text_n_words         [A]
                          text_n_tokens        [A]
                          text_n_w             [A]
                          text_n_c             [A]
                          text_n_s             [A]
                          text_publication_date [A]
                          text_text_type       [A]
                          text_context         [A]
                          text_respondent_age  [A]
                          text_respondent_class [A]
                          text_respondent_sex  [A]
                          text_interaction_type [A]
                          text_region          [A]
                          text_author_age      [A]
                          text_author_domicile [A]
                          text_author_sex      [A]
                          text_author_type     [A]
                          text_audience_age    [A]
                          text_domain          [A]
                          text_difficulty      [A]
                          text_medium          [A]
                        ...
Any suggestions? Thank you.

Best

Aleks

--

Dr Aleksandar Trklja
Senior Lecturer
Department of Translation Studies

University of Vienna
_______________________________________________
CWB mailing list
CWB at sslmit.unibo.it
http://liste.sslmit.unibo.it/mailman/listinfo/cwb
