[CWB] WebInABox: Can't import existing corpora from host

Scott Sadowsky ssadowsky at gmail.com
Tue Jul 26 18:52:47 CEST 2016


On Tue, Jul 26, 2016 at 12:18 PM, Hardie, Andrew <a.hardie at lancaster.ac.uk>
wrote:

> But how do I restrict searches using the s-attributes (say, speaker
> sex)? When I do a query and then select "Distribution", for example, I'm
> told that "This corpus has no text-classification metadata, so the
> distribution cannot be shown".
>
> ·         Go to Restricted query
>
> ·         You should see options to restrict your query to XML segments
> where the given attribute has a particular category handle for any s-att
> that you set to datatype “Classification”
>
Thanks. That makes sense.

When I run one of these queries, though, CQPweb throws an SQL error (pasted
below).

> ·         OR, go to “Create / edit subcorpora” and define subcorpora using
> the same control, then use those SCs as restriction criteria.
>
This also throws an error (also pasted below).


>  Note that non-text-based corpus restrictions and subcorpora aren’t
> currently supported in the Distribution display. I know this is a pain, and
> it’s high on my feature list. (but quite a big job so can’t be done
> quickly!)
>
I can only imagine!

Thanks again,
Scott



===== ERROR 1 =====
CQPweb encountered an error and could not continue.
A MySQL query did not run successfully!

Original query: SELECT count(*), sum(words) FROM
text_metadata_for_test_coscach WHERE /* from User: user | Function:
do_append_mysql_comment() | 2016-Jul-26 16:42:47 */

Error # 1064: You have an error in your SQL syntax; check the manual that
corresponds to your MySQL server version for the right syntax to use near
'' at line 2

PHP debugging backtrace

array(7) {
  [1]=>
  array(4) {
    ["file"]=>
    string(40) "/var/www/html/cqpweb/lib/library.inc.php"
    ["line"]=>
    int(282)
    ["function"]=>
    string(20) "exiterror_mysqlquery"
    ["args"]=>
    array(3) {
      [0]=>
      &int(1064)
      [1]=>
      &string(146) "You have an error in your SQL syntax; check the manual
that corresponds to your MySQL server version for the right syntax to use
near '' at line 2"
      [2]=>
      &string(156) "SELECT count(*), sum(words) FROM
text_metadata_for_test_coscach WHERE
/* from User: user | Function: do_append_mysql_comment() | 2016-Jul-26
16:42:47 */"
    }
  }
  [2]=>
  array(4) {
    ["file"]=>
    string(42) "/var/www/html/cqpweb/lib/subcorpus.inc.php"
    ["line"]=>
    int(1556)
    ["function"]=>
    string(14) "do_mysql_query"
    ["args"]=>
    array(1) {
      [0]=>
      &string(71) "SELECT count(*), sum(words) FROM
text_metadata_for_test_coscach WHERE  "
    }
  }
  [3]=>
  array(7) {
    ["file"]=>
    string(42) "/var/www/html/cqpweb/lib/subcorpus.inc.php"
    ["line"]=>
    int(1214)
    ["function"]=>
    string(15) "initialise_size"
    ["class"]=>
    string(11) "Restriction"
    ["object"]=>
    object(Restriction)#14 (15) {
      ["serialised":"Restriction":private]=>
      string(26) "$^text|location~concepcion"
      ["parsed_conditions":"Restriction":private]=>
      array(1) {
        ["text"]=>
        array(1) {
          [0]=>
          string(19) "location~concepcion"
        }
      }
      ["stored_text_metadata_where":"Restriction":private]=>
      NULL
      ["stored_idlink_where":"Restriction":private]=>
      NULL
      ["cpos_collection":"Restriction":private]=>
      NULL
      ["corpus":"Restriction":private]=>
      string(12) "test_coscach"
      ["item_type":"Restriction":private]=>
      string(4) "text"
      ["n_items":"Restriction":private]=>
      NULL
      ["n_tokens":"Restriction":private]=>
      NULL
      ["freqtable_record":"Restriction":private]=>
      NULL
      ["hasrun_initialise_text_metadata_where":"Restriction":private]=>
      bool(false)
      ["hasrun_initialise_idlink_where":"Restriction":private]=>
      bool(false)
      ["hasrun_initialise_cpos_collection":"Restriction":private]=>
      bool(false)
      ["hasrun_initialise_size":"Restriction":private]=>
      bool(false)
      ["needs_to_be_added_to_cache":"Restriction":private]=>
      bool(false)
    }
    ["type"]=>
    string(2) "->"
    ["args"]=>
    array(0) {
    }
  }
  [4]=>
  array(6) {
    ["file"]=>
    string(42) "/var/www/html/cqpweb/lib/subcorpus.inc.php"
    ["line"]=>
    int(670)
    ["function"]=>
    string(12) "new_from_url"
    ["class"]=>
    string(11) "Restriction"
    ["type"]=>
    string(2) "::"
    ["args"]=>
    array(2) {
      [0]=>
      &string(85)
"theData=gente&qmode=sq_nocase&pp=50&del=begin&t=text|location~concepcion&del=end&uT=y"
      [1]=>
      &bool(true)
    }
  }
  [5]=>
  array(7) {
    ["file"]=>
    string(42) "/var/www/html/cqpweb/lib/subcorpus.inc.php"
    ["line"]=>
    int(589)
    ["function"]=>
    string(14) "parse_from_url"
    ["class"]=>
    string(10) "QueryScope"
    ["object"]=>
    object(QueryScope)#15 (4) {
      ["type"]=>
      int(0)
      ["restriction":"QueryScope":private]=>
      NULL
      ["subcorpus":"QueryScope":private]=>
      NULL
      ["serialised":"QueryScope":private]=>
      string(0) ""
    }
    ["type"]=>
    string(2) "->"
    ["args"]=>
    array(2) {
      [0]=>
      &string(89)
"theData=gente&qmode=sq_nocase&pp=50&del=begin&t=text%7Clocation%7Econcepcion&del=end&uT=y"
      [1]=>
      &bool(true)
    }
  }
  [6]=>
  array(6) {
    ["file"]=>
    string(44) "/var/www/html/cqpweb/lib/concordance.inc.php"
    ["line"]=>
    int(156)
    ["function"]=>
    string(12) "new_from_url"
    ["class"]=>
    string(10) "QueryScope"
    ["type"]=>
    string(2) "::"
    ["args"]=>
    array(2) {
      [0]=>
      &string(89)
"theData=gente&qmode=sq_nocase&pp=50&del=begin&t=text%7Clocation%7Econcepcion&del=end&uT=y"
      [1]=>
      &bool(true)
    }
  }
  [7]=>
  array(4) {
    ["file"]=>
    string(40) "/var/www/html/cqpweb/exe/concordance.php"
    ["line"]=>
    int(1)
    ["args"]=>
    array(1) {
      [0]=>
      string(44) "/var/www/html/cqpweb/lib/concordance.inc.php"
    }
    ["function"]=>
    string(7) "require"
  }
}
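
Reading the backtrace above: frame 2 shows the string handed to do_mysql_query() ending in a bare WHERE, and the Restriction dump shows stored_text_metadata_where as NULL, so it looks as if no condition text was produced for location~concepcion before the size query was built. The Python sketch below only illustrates that failure mode and one defensive guard against it. build_size_query and naive_size_query are hypothetical names made up for this illustration; this is not CQPweb's PHP code.

```python
from typing import Optional


def naive_size_query(table: str, where_clause: Optional[str]) -> str:
    """Unguarded variant: reproduces the shape of the failing query."""
    # With where_clause empty or None, the string ends in "WHERE ",
    # which MySQL rejects with a syntax error (1064).
    return f"SELECT count(*), sum(words) FROM {table} WHERE {where_clause or ''}"


def build_size_query(table: str, where_clause: Optional[str]) -> str:
    """Guarded variant: refuses to emit a dangling WHERE.

    Hypothetical helper for illustration; not taken from CQPweb.
    """
    if where_clause is None or not where_clause.strip():
        raise ValueError("restriction produced no WHERE conditions")
    return f"SELECT count(*), sum(words) FROM {table} WHERE {where_clause}"
```

Failing early like this would at least turn the SQL syntax error into a message that points at the empty restriction rather than at the query text.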

===== ERROR 2 =====

CQPweb encountered an error and could not continue.
A MySQL query did not run successfully!

Original query: SELECT count(*), sum(words) FROM
text_metadata_for_test_coscach WHERE /* from User: user | Function:
do_append_mysql_comment() | 2016-Jul-26 16:49:17 */

Error # 1064: You have an error in your SQL syntax; check the manual that
corresponds to your MySQL server version for the right syntax to use near
'' at line 2


PHP debugging backtrace

array(5) {
  [1]=>
  array(4) {
    ["file"]=>
    string(40) "/var/www/html/cqpweb/lib/library.inc.php"
    ["line"]=>
    int(282)
    ["function"]=>
    string(20) "exiterror_mysqlquery"
    ["args"]=>
    array(3) {
      [0]=>
      &int(1064)
      [1]=>
      &string(146) "You have an error in your SQL syntax; check the manual
that corresponds to your MySQL server version for the right syntax to use
near '' at line 2"
      [2]=>
      &string(156) "SELECT count(*), sum(words) FROM
text_metadata_for_test_coscach WHERE
/* from User: user | Function: do_append_mysql_comment() | 2016-Jul-26
16:49:17 */"
    }
  }
  [2]=>
  array(4) {
    ["file"]=>
    string(42) "/var/www/html/cqpweb/lib/subcorpus.inc.php"
    ["line"]=>
    int(1556)
    ["function"]=>
    string(14) "do_mysql_query"
    ["args"]=>
    array(1) {
      [0]=>
      &string(71) "SELECT count(*), sum(words) FROM
text_metadata_for_test_coscach WHERE  "
    }
  }
  [3]=>
  array(7) {
    ["file"]=>
    string(42) "/var/www/html/cqpweb/lib/subcorpus.inc.php"
    ["line"]=>
    int(1214)
    ["function"]=>
    string(15) "initialise_size"
    ["class"]=>
    string(11) "Restriction"
    ["object"]=>
    object(Restriction)#16 (15) {
      ["serialised":"Restriction":private]=>
      string(26) "$^text|location~concepcion"
      ["parsed_conditions":"Restriction":private]=>
      array(1) {
        ["text"]=>
        array(1) {
          [0]=>
          string(19) "location~concepcion"
        }
      }
      ["stored_text_metadata_where":"Restriction":private]=>
      NULL
      ["stored_idlink_where":"Restriction":private]=>
      NULL
      ["cpos_collection":"Restriction":private]=>
      NULL
      ["corpus":"Restriction":private]=>
      string(12) "test_coscach"
      ["item_type":"Restriction":private]=>
      string(4) "text"
      ["n_items":"Restriction":private]=>
      NULL
      ["n_tokens":"Restriction":private]=>
      NULL
      ["freqtable_record":"Restriction":private]=>
      NULL
      ["hasrun_initialise_text_metadata_where":"Restriction":private]=>
      bool(false)
      ["hasrun_initialise_idlink_where":"Restriction":private]=>
      bool(false)
      ["hasrun_initialise_cpos_collection":"Restriction":private]=>
      bool(false)
      ["hasrun_initialise_size":"Restriction":private]=>
      bool(false)
      ["needs_to_be_added_to_cache":"Restriction":private]=>
      bool(false)
    }
    ["type"]=>
    string(2) "->"
    ["args"]=>
    array(0) {
    }
  }
  [4]=>
  array(6) {
    ["file"]=>
    string(48) "/var/www/html/cqpweb/lib/subcorpus-admin.inc.php"
    ["line"]=>
    int(128)
    ["function"]=>
    string(12) "new_from_url"
    ["class"]=>
    string(11) "Restriction"
    ["type"]=>
    string(2) "::"
    ["args"]=>
    array(1) {
      [0]=>
      &string(178)
"subcorpusNewName=concepcion&action=Create+subcorpus+from+selected+categories&scriptMode=create_from_metadata&thisQ=subcorpus&del=begin&t=text%7Clocation%7Econcepcion&del=end&uT=y"
    }
  }
  [5]=>
  array(4) {
    ["file"]=>
    string(44) "/var/www/html/cqpweb/exe/subcorpus-admin.php"
    ["line"]=>
    int(1)
    ["args"]=>
    array(1) {
      [0]=>
      string(48) "/var/www/html/cqpweb/lib/subcorpus-admin.inc.php"
    }
    ["function"]=>
    string(7) "require"
  }
}
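
Both backtraces carry the same restriction in the t parameter, once percent-encoded (t=text%7Clocation%7Econcepcion) and once decoded (t=text|location~concepcion), and the Restriction dump shows it parsed as item type "text" with the single condition "location~concepcion". A minimal Python sketch of that decoding step, using only the standard library, is below. The function name and the splitting rule are assumptions inferred from the dump, not CQPweb's own parser.

```python
from typing import Dict, List
from urllib.parse import parse_qs


def parse_restrictions(query_string: str) -> Dict[str, List[str]]:
    """Group 't' parameters by item type.

    Assumes each value looks like '<item_type>|<field>~<category>', as in
    the backtrace; an illustration only, not CQPweb's logic.
    """
    conditions: Dict[str, List[str]] = {}
    # parse_qs percent-decodes values: %7C becomes '|' and %7E becomes '~'.
    for value in parse_qs(query_string).get("t", []):
        item_type, _, condition = value.partition("|")
        conditions.setdefault(item_type, []).append(condition)
    return conditions


url = ("theData=gente&qmode=sq_nocase&pp=50&del=begin"
       "&t=text%7Clocation%7Econcepcion&del=end&uT=y")
print(parse_restrictions(url))  # prints {'text': ['location~concepcion']}
```

So the URL itself appears to be parsed as intended; the empty WHERE clause arises after this point.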

> *From:* cwb-bounces at liste.sslmit.unibo.it [mailto:
> cwb-bounces at liste.sslmit.unibo.it] *On Behalf Of *Scott Sadowsky
> *Sent:* 26 July 2016 17:12
>
> *To:* Open source development of the Corpus WorkBench
> *Cc:* Open source development of the Corpus WorkBench
> *Subject:* Re: [CWB] WebInABox: Can't import existing corpora from host
>
>
>
> On Tue, Jul 26, 2016 at 7:25 AM, Hardie, Andrew <a.hardie at lancaster.ac.uk>
> wrote:
>
>
>
> Hi Andrew,
>
> I have had a dig, and found the bug (it was a regex glitch parsing the
> inserted registry file). Update the code to rev 880 and you should find
> that the system will obediently detect your s-attributes. (You will still,
> naturally, need to go through the first step that I mentioned, of making
> sure all data from earlier passes is properly scrubbed.)
>
> Eureka - with this new rev CQPweb now imports my XML metadata! Thanks so
> much for hunting this down and fixing it!
>
>
>
> I've now done the following:
>
>
>
> 1. I went through the "Manage Corpus XML" page and set descriptions and
> data types, defining the attributes I want to be able to search on in
> queries, subqueries, sub-corpora, etc. to "classification" (e.g. speaker
> sex and location).
>
>
>
> 2. I went through the "Manage Annotation" page and linked the "Annotation
> setup for CEQL queries" fields to the various annotation data in my corpus.
>
>
>
> 3. On the "Manage frequency lists" page I (re)generated everything (I've
> attached the metadata table from mysql below).
>
>
>
> I can now perform queries, and my metadata is recognized. But how do I
> restrict searches using the s-attributes (say, speaker sex)? When I do a
> query and then select "Distribution", for example, I'm told that "This
> corpus has no text-classification metadata, so the distribution cannot be
> shown".
>
>
>
> Thanks!
>
> Scott
>
>
>
>
>
> mysql> select * from xml_metadata;
> +----+--------------+-----------------+------------+-----------------------------------+----------+
> | id | corpus       | handle          | att_family | description                       | datatype |
> +----+--------------+-----------------+------------+-----------------------------------+----------+
> |  1 | bncsampler   | s               | s          | s                                 |        0 |
> |  2 | bncsampler   | text            | text       | text                              |        0 |
> |  3 | bncsampler   | text_id         | text       | text_id                           |        3 |
> |  4 | lcmc         | s               | s          | s                                 |        0 |
> |  5 | lcmc         | text            | text       | text                              |        0 |
> |  6 | lcmc         | text_id         | text       | text_id                           |        3 |
> |  7 | test_coscach | s               | s          | Sentence                          |        0 |
> |  8 | test_coscach | text            | text       | Text                              |        0 |
> |  9 | test_coscach | text_id         | text       | Unique Text ID                    |        3 |
> | 10 | test_coscach | text_corpus     | text       | Corpus name                       |        2 |
> | 11 | test_coscach | text_tagger     | text       | Corpus tagger                     |        2 |
> | 12 | test_coscach | text_language   | text       | Text language                     |        1 |
> | 13 | test_coscach | text_channel    | text       | Spoken or written?                |        2 |
> | 14 | test_coscach | text_instrument | text       | Elicitation instrument            |        1 |
> | 15 | test_coscach | text_lingualism | text       | Speaker monolingual or bilingual? |        1 |
> | 16 | test_coscach | text_location   | text       | Speaker location                  |        1 |
> | 17 | test_coscach | text_sex        | text       | Speaker sex                       |        1 |
> | 18 | test_coscach | text_generation | text       | Speaker generation                |        1 |
> | 19 | test_coscach | text_sel        | text       | Speaker SEL                       |        1 |
> +----+--------------+-----------------+------------+-----------------------------------+----------+
> 19 rows in set (0.00 sec)
>
>
>
> mysql>
>
>
>
>
>
> *From:* Hardie, Andrew
> *Sent:* 25 July 2016 23:48
> *To:* Open source development of the Corpus WorkBench
> *Subject:* RE: [CWB] WebInABox: Can't import existing corpora from host
>
>
>
> OK, 2 things:
>
>
>
> First – the result of the MySQL query shows that none of the XML of your
> corpus has been detected.
>
>
>
> Second – the other error you report is clearly referring to your earlier
> index data. The check on text ID validity is done at point of extraction
> *from* the index *to* CQPweb’s internal data structures. So, it is
> reading the index and getting bad values. This implies that your earlier
> index files still exist and are being read by CQPweb.
>
>
>
> So, the overall picture would seem to be that you have data hanging around
> from previous incarnations of the corpus, and your reinstallation did not
> work properly. Your best bet might be to make doubly sure everything is
> wiped from that corpus, then start over again. This will probably not fix
> all the problems but it *should* make the issues that remain clearer.
>
>
>
> best
>
>
>
> Andrew.
>
>
>
> *From:* cwb-bounces at liste.sslmit.unibo.it [mailto:
> cwb-bounces at liste.sslmit.unibo.it] *On Behalf Of *Scott Sadowsky
> *Sent:* 25 July 2016 17:15
>
> *To:* Open source development of the Corpus WorkBench
> *Cc:* Open source development of the Corpus WorkBench
> *Subject:* Re: [CWB] WebInABox: Can't import existing corpora from host
>
>
>
> On Mon, Jul 25, 2016 at 5:48 AM, Hardie, Andrew <a.hardie at lancaster.ac.uk>
> wrote:
>
>
>
> Try running
>
>
>
>           select * from xml_metadata;
>
>
>
> in the MySQL command line client, and see what you get.
>
>
>
> This is what I get:
>
>
>
> $ mysql -u root -p cqpweb
>
> Enter password:
>
> Reading table information for completion of table and column names
>
> [...]
>
> mysql> select * from xml_metadata;
>
> +----+------------+---------+------------+-------------+----------+
>
> | id | corpus     | handle  | att_family | description | datatype |
>
> +----+------------+---------+------------+-------------+----------+
>
> |  1 | bncsampler | s       | s          | s           |        0 |
>
> |  2 | bncsampler | text    | text       | text        |        0 |
>
> |  3 | bncsampler | text_id | text       | text_id     |        3 |
>
> |  4 | lcmc       | s       | s          | s           |        0 |
>
> |  5 | lcmc       | text    | text       | text        |        0 |
>
> |  6 | lcmc       | text_id | text       | text_id     |        3 |
>
> +----+------------+---------+------------+-------------+----------+
>
> 6 rows in set (0.00 sec)
>
>
>
> mysql>
>
>
>
>
>
> I have noted something anomalous on another front which may be relevant.
> When I go to the "Manage Metadata" page of the corpus I'm trying to get set
> up, and hit the "Create minimalist metadata table" button, I get an error
> which has nothing to do with my current corpus:
>
>
>
> The data source you specified for the text metadata contains
> badly-formatted text ID codes, as follows: <strong> '<no annotation>';
> 'CCN-F2-01_Ca_St.ortografica.txt'; 'CCN-F2-02_D_StB.ortografica.txt';
> 'CCN-F2-03_Ca_St.ortografica.txt';
> 'CCN-F2-04_Cb_St.ortografica.txt';[...]</strong> (text ids can only contain
> unaccented letters, numbers, and underscore).
>
>
>
> None of these values are present in my current corpus, though they *were*
> in an earlier version. However, I removed them from the tagged texts after
> you explained that these values had to be handles. Here's what my metadata
> currently looks like:
>
>
>
> <text id="CCN_F2_27_B" corpus="coscach" tagger="freeling_xml"
> language="spanish" channel="oral" instrument="interview"
> lingualism="monolingual" location="concepcion" sex="f" generation="G2"
> sel="B">
>
>
>
> So values like 'CCN-F2-01_Ca_St.ortografica.txt' are not in my corpus any
> more (and I recompiled it from these files, of course), but they seem to be
> cached somewhere by CQPweb, and they are not getting updated by newer
> corpora I try to import. (Note that I've used different names, e.g.
> test_corpus, test_corpus_two, in order to try to get around this, but it
> hasn't worked).
>
>
>
> Cheers,
> Scott
>
>
>
>
>
>
>
> best
>
>
>
> Andrew.
>
>
>
>
>
>
>
> *From:* cwb-bounces at liste.sslmit.unibo.it [mailto:
> cwb-bounces at liste.sslmit.unibo.it] *On Behalf Of *Scott Sadowsky
> *Sent:* 24 July 2016 17:17
> *To:* Open source development of the Corpus WorkBench
> *Cc:* Open source development of the Corpus WorkBench
> *Subject:* Re: [CWB] WebInABox: Can't import existing corpora from host
>
>
>
> On Sun, Jul 24, 2016 at 11:29 AM, Hardie, Andrew <a.hardie at lancaster.ac.uk>
> wrote:
>
>
>
> First point – your text ID codes won’t work, they need to be *handles*,
> i.e. just ASCII letters, numbers, and underscore – no hyphens/full stops.
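
The handle rule stated here (ASCII letters, digits and underscore only) is easy to check before a corpus is indexed. A minimal Python sketch follows. is_handle and to_handle are hypothetical helper names, and the replacement strategy in to_handle is just one possible choice, not something CQPweb prescribes.

```python
import re

# One or more ASCII letters, digits or underscores, and nothing else.
HANDLE_RE = re.compile(r"[A-Za-z0-9_]+")


def is_handle(text_id: str) -> bool:
    """True if text_id contains only ASCII letters, digits and underscores."""
    return HANDLE_RE.fullmatch(text_id) is not None


def to_handle(text_id: str) -> str:
    """Replace every disallowed character with an underscore (illustrative)."""
    return re.sub(r"[^A-Za-z0-9_]", "_", text_id)


for candidate in ("CCN-F2-01_Ca_St.ortografica.txt", "CCN_F2_27_B"):
    print(candidate, is_handle(candidate), to_handle(candidate))
```

Note that a blanket replacement like this can map two different original IDs onto the same handle, so the results should be checked for uniqueness before use.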
>
>
>
> Now corrected!
>
>
>
> Second point – the various s-attributes text_corpus , text_tagger etc.
> need (a) to exist in the registry – did your correction fix this? (b)
> CQPweb needs to have logged their existence – if it’s saying “No XML
> annotations found” that suggests it hasn’t, which could be a consequence of
> (a), or could be a bug.
>
>
>
> Unless I'm mistaken about what attributes are what, they are indeed in the
> registry. I've pasted it at the end of this e-mail, along with a single
> tagged source text sentence.
>
>
>
> There was in fact a bug with s-attributes in the registry failing to be
> detected which I fixed a few months back: I cannot recall if that was
> before or after the version of the code in the VM image. If you want to
> rule this out, connect the VM’s networking, upgrade CQPweb to the latest
> version from SVN (don’t forget to do the database upgrade!), and try again:
> if that fixes it, it was the old bug.
>
>
>
> I've been using revision 879 (3.2.20) the whole time, so it shouldn't be
> the old bug.
>
>
>
>
>
> Once CQPweb is aware of your XML attributes you should be able to use them
> to derive text metadata.
>
>
>
> Thanks for your patience!
>
>
>
> Cheers,
>
> Scott
>
>
>
>
>
> <text id="CCN_F2_25_Ca" corpus="test_two" tagger="freeling_xml"
> language="spanish" channel="oral" instrument="interview"
> lingualism="monolingual" location="concepcion" sex="f" generation="G2"
> sel="Ca">
>
> <s>
>
> ¿       ¿       Fia     Fia     punctuation     questionmark
>
> todavía todavía RG      RG      adverb  general
>
> está    estar   VAIP3S0 VAI     verb    auxiliary
>
> grabando        grabar  VMG0000 VMG     verb    main
>
> ?       ?       Fit     Fit     punctuation     questionmark
>
> </s>
>
> </text>
>
>
>
>
>
>
>
> ##
>
> ## registry entry for corpus TEST_TWO
>
> ##
>
>
>
> # long descriptive name for the corpus
>
> NAME ""
>
> # corpus ID (must be lowercase in registry!)
>
> ID   test_two
>
> # path to binary data files
>
> HOME /var/cqpweb/index/test_two
>
> # optional info file (displayed by "info;" command in CQP)
>
> INFO /var/cqpweb/index/test_two/.info
>
>
>
> # corpus properties provide additional information about the corpus:
>
> ##:: charset  = "utf8" # character encoding of corpus data
>
> ##:: language = "es"     # insert ISO code for language (de, en, fr, ...)
>
>
>
>
>
> ##
>
> ## p-attributes (token annotations)
>
> ##
>
>
>
> ATTRIBUTE word
>
> ATTRIBUTE lemma
>
> ATTRIBUTE tag
>
> ATTRIBUTE ctag
>
> ATTRIBUTE pos
>
> ATTRIBUTE type
>
>
>
>
>
> ##
>
> ## s-attributes (structural markup)
>
> ##
>
>
>
> # <s> ... </s>
>
> # (no recursive embedding allowed)
>
> STRUCTURE s
>
>
>
> # <text id=".." corpus=".." tagger=".." file=".." language=".."
> channel=".." instrument=".." lingualism=".." location=".." sex=".."
> generation=".." sel=".."> ... </text>
>
> # (no recursive embedding allowed)
>
> STRUCTURE text
>
> STRUCTURE text_id              # [annotations]
>
> STRUCTURE text_corpus          # [annotations]
>
> STRUCTURE text_tagger          # [annotations]
>
> STRUCTURE text_file            # [annotations]
>
> STRUCTURE text_language        # [annotations]
>
> STRUCTURE text_channel         # [annotations]
>
> STRUCTURE text_instrument      # [annotations]
>
> STRUCTURE text_lingualism      # [annotations]
>
> STRUCTURE text_location        # [annotations]
>
> STRUCTURE text_sex             # [annotations]
>
> STRUCTURE text_generation      # [annotations]
>
> STRUCTURE text_sel             # [annotations]
>
>
>
>
>
> # Yours sincerely, the Encode tool.
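
The s-attributes at issue in this thread are the STRUCTURE lines of the registry file above. As a cross-check that does not depend on CQPweb, the Python sketch below lists them from registry text. It is a simplified reading of the layout shown above (one declaration per line, "#" starting a comment), not the CWB or CQPweb parser, and list_s_attributes is a hypothetical name.

```python
import re
from typing import List

# A declaration line: optional indentation, the keyword, then the name.
STRUCTURE_RE = re.compile(r"^\s*STRUCTURE\s+(\S+)")


def list_s_attributes(registry_text: str) -> List[str]:
    """Return the names declared on STRUCTURE lines, in file order."""
    names = []
    for line in registry_text.splitlines():
        match = STRUCTURE_RE.match(line)
        if match:
            names.append(match.group(1))
    return names


sample = """\
ATTRIBUTE word
STRUCTURE s
# (no recursive embedding allowed)
STRUCTURE text
STRUCTURE text_id              # [annotations]
STRUCTURE text_sex             # [annotations]
"""
print(list_s_attributes(sample))  # prints ['s', 'text', 'text_id', 'text_sex']
```

Comparing such a list against the handles in the xml_metadata table for the same corpus is one way to see which declared attributes have not been picked up.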
>
>
>
>
>
>
>
> *From:* cwb-bounces at liste.sslmit.unibo.it [mailto:
> cwb-bounces at liste.sslmit.unibo.it] *On Behalf Of *Scott Sadowsky
> *Sent:* 24 July 2016 15:52
> *To:* CWBdev Mailing List
>
>
> *Subject:* [CWB] WebInABox: Can't import existing corpora from host
>
>
>
> On Sun, Jul 24, 2016 at 10:19 AM, Hardie, Andrew <a.hardie at lancaster.ac.uk>
> wrote:
>
>
>
> CQPweb requires all corpora to have at least one <text> element, and every
> text element has to have an id i.e. everything within the corpus has to be
> contained within a sequence of one or more
>
>
>
> <text id="somethinghere">
>
>>
> </text>
>
>
>
> Thanks, Andrew. It turns out the problem was that I had been using the
> name "id" instead of "text" for the element. Now that I've changed that, I
> was able to successfully create the corpus in CQPweb.
>
>
>
> My source files have quite a bit of metadata, which I've encoded as
> follows:
>
>
>
> <text id="CCN-F2-02_D_StB.ortografica.txt" corpus="test" tagger="freeling-xml"
> language="spanish" location="concepcion" sex="f">
>
> ...
>
> </text>
>
>
> I'm now at the CQPweb "Design and insert a text-metadata table for the
> corpus" page, but it tells me that "No XML annotations found for this
> corpus". Is there something wrong with how I did the encoding above? I can
> use all of these XML elements in cqp searches directly, but here they
> aren't recognized.
>
>
>
> (I've checked chapter 6 of the manual, to no avail).
>
>
>
>
> _______________________________________________
> CWB mailing list
> CWB at liste.sslmit.unibo.it
> http://liste.sslmit.unibo.it/mailman/listinfo/cwb
>
>