[CWB] CWB Digest, Vol 84, Issue 20

Thu Jan 30 15:57:06 CET 2014

As always you're right:

<text id="10038" year=""
url_source="http://www.vanity-rechner.de" error="0.444444">
<text id="10038" year=""
url_source="http://www.vanity-rechner.de" error="0.461538">
<text id="10038" year=""
url_source="http://www.vanity-rechner.de" error="0.000000">
<text id="10038" year=""
url_source="http://www.vanity-rechner.de" error="0.000000">
<text id="10038" year=""
url_source="http://www.vanity-rechner.de" error="0.000000">
<text id="10038" year=""
url_source="http://www.vanity-rechner.de" error="0.000000">

So I have to look for a way to renumber the ids...

thanks
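
One possible way to renumber the ids is sketched below. It assumes the corpus input is a vertical-text file in which each opening <text ...> tag sits on a single line and carries an id="..." attribute. The function name renumber_text_ids and the choice of plain sequential numbers are illustrative assumptions, not something CWB or CQPweb prescribes. Any metadata file keyed on the old ids would need the same mapping applied.

```python
import re

# Captures the part before the id value, the id value itself, and the closing quote
# of an opening <text ...> tag. \bid avoids matching attributes such as text_id.
TEXT_ID = re.compile(r'(<text\b[^>]*?\bid=")([^"]*)(")')


def renumber_text_ids(lines, start=1):
    """Yield the input lines with every <text id="..."> replaced by a fresh
    sequential id (hypothetical scheme: start, start+1, ...)."""
    next_id = start
    for line in lines:
        new_line, replaced = TEXT_ID.subn(
            lambda m: f"{m.group(1)}{next_id}{m.group(3)}", line, count=1
        )
        if replaced:
            next_id += 1
        yield new_line
```

Used on a file, this could be driven by something like `out.writelines(renumber_text_ids(open(path, encoding="utf-8")))`, after which the corpus would have to be re-encoded.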

On Thu, January 30, 2014 15:42, Hardie, Andrew wrote:

That was a query to find the first word of a text. If it finds more than one hit, as in your
screenshot, then something is wrong in the underlying data, e.g. you may have multiple texts
with the same ID, or, perhaps, elements may not be closed properly.

best

Andrew.
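
The first possibility mentioned above (several texts sharing one ID) can be checked on the input data before encoding. The following is a minimal sketch, assuming a vertical-text file in which each opening <text ...> tag starts a line and has an id="..." attribute; the helper name duplicate_text_ids is made up for illustration and is not part of CWB.

```python
import re
from collections import Counter

# Matches the id attribute of an opening <text ...> tag at the start of a line.
# \bid avoids matching other attributes that merely end in "id" (e.g. text_id).
TEXT_ID = re.compile(r'<text\b[^>]*?\bid="([^"]*)"')


def duplicate_text_ids(lines):
    """Return {id: count} for every id used by more than one <text> opening tag."""
    counts = Counter()
    for line in lines:
        match = TEXT_ID.match(line.lstrip())
        if match:
            counts[match.group(1)] += 1
    return {text_id: n for text_id, n in counts.items() if n > 1}
```

An empty result would suggest the IDs are unique, in which case unclosed elements would be the next thing to look at.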

From: cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it] On Behalf Of Andres Chandia
Sent: 30 January 2014 11:25
To: Open source development of the Corpus WorkBench
Subject: Re: [CWB] CWB Digest, Vol 84, Issue 20

I attach the results:

On Thu, January 30, 2014 10:58, Hardie, Andrew wrote:

>> Can I populate this table some other way?

No. You need to find out what is going wrong when you do it this way, or any other way
would fail as well.

Can you check that the necessary text ids are properly encoded, i.e. do a CQP-syntax query for

[]

and see if it returns the first word of text 10038, as expected.

best

Andrew.

From: cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it] On Behalf Of Andres Chandia
Sent: 29 January 2014 17:22
To: Open source development of the Corpus WorkBench
Subject: Re: [CWB] CWB Digest, Vol 84, Issue 20

Hi there,

I am reopening this issue because something similar is happening to me now, although the
corpus is bigger: 18 GB.

When I try to populate the text metadata table with begin/end offset positions, I get no
results. That is, the table is created, but all the values look like this:

 | text_id | words | cqp_begin | cqp_end |
 | 10038   | 0     | 0         | 0       |
 | 6570    | 0     | 0         | 0       |
 | 4099    | 0     | 0         | 0       |
 | 9887    | 0     | 0         | 0       |
 | 819     | 0     | 0         | 0       |
 | 4910    | 0     | 0         | 0       |
 | 7669    | 0     | 0         | 0       |
 | 2889    | 0     | 0         | 0       |
 | 9627    | 0     | 0         | 0       |
 | 5265    | 0     | 0         | 0       |
 | 1076    | 0     | 0         | 0       |
 | 6196    | 0     | 0         | 0       |
 | 4213    | 0     | 0         | 0       |
 | 1212    | 0     | 0         | 0       |
 | 4688    | 0     | 0         | 0       |

So the interface for this corpus always says: "The text metadata table has not yet been
populated with begin/end offset positions."

I have put the interface in debug mode, and here is what I get: sdewac-debug.7z

Can I populate this table some other way?

 _______________________

andrés chandía

 administrator of
 parles.upf.edu
 psicoaching.net
 mapuche koyaktu
 ong mapuche koyaktu
 Do not print unnecessarily. Take care of the environment!
