[CWB] CWB Digest, Vol 84, Issue 20

Andres Chandia andres at chandia.net
Thu Jan 30 15:57:06 CET 2014



As always you're right:

<text id="10038" year=""
url_source="http://www.vanity-rechner.de" error="0.444444">
<text id="10038" year=""
url_source="http://www.vanity-rechner.de" error="0.461538">
<text id="10038" year=""
url_source="http://www.vanity-rechner.de" error="0.000000">
<text id="10038" year=""
url_source="http://www.vanity-rechner.de" error="0.000000">
<text id="10038" year=""
url_source="http://www.vanity-rechner.de" error="0.000000">
<text id="10038" year=""
url_source="http://www.vanity-rechner.de" error="0.000000">


So I have to find a way to renumber the ids...
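One possible way to renumber the ids, as a minimal sketch only: it assumes the corpus source is a one-token-per-line vertical file in which each `<text ...>` start tag sits on a single line (the tags above are wrapped by the mail client). The function name `renumber_text_ids` and the choice of sequential numbering are illustrative assumptions, not part of CWB. Any external metadata keyed on the old ids would need the same mapping, and the corpus would presumably have to be re-encoded afterwards.

```python
import re

# Captures everything up to and including id=" and the closing quote,
# so that only the id value itself is replaced.
ID_ATTR = re.compile(r'(<text\b[^>]*?\bid=")[^"]*(")')

def renumber_text_ids(lines, start=1):
    """Yield the input lines with each <text id="..."> given a fresh sequential id."""
    next_id = start
    for line in lines:
        if line.lstrip().startswith("<text"):
            new_line, replaced = ID_ATTR.subn(
                rf"\g<1>{next_id}\g<2>", line, count=1
            )
            if replaced:
                next_id += 1
            yield new_line
        else:
            yield line

if __name__ == "__main__":
    sample = [
        '<text id="10038" year="" url_source="http://www.vanity-rechner.de" error="0.444444">',
        "Hallo",
        "</text>",
        '<text id="10038" year="" url_source="http://www.vanity-rechner.de" error="0.461538">',
        "Welt",
        "</text>",
    ]
    for out in renumber_text_ids(sample):
        print(out)
```

All other attributes (`year`, `url_source`, `error`) are left untouched; only the value of `id` changes.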

thanks



On Thu, January 30, 2014 15:42, Hardie, Andrew wrote:


That was a query to find the first word of a text. If it finds more than one hit, as in your
screenshot, then something is wrong in the underlying data, e.g. you may have multiple texts
with the same ID, or perhaps elements may not be closed properly.
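Both suspected problems can also be checked directly in the source data before encoding. The following is a rough sketch, assuming a vertical file where each `<text ...>` start tag and each `</text>` end tag is on its own line; the function name `check_text_elements` is made up for illustration and is not a CWB tool.

```python
import re
from collections import Counter

# Matches a <text ...> start tag and captures the value of its id attribute.
TEXT_OPEN = re.compile(r'<text\b[^>]*?\bid="([^"]*)"')

def check_text_elements(lines):
    """Return (duplicate_ids, unclosed_count) for an iterable of vertical-file lines.

    duplicate_ids maps each id that occurs more than once to its count;
    unclosed_count is the number of <text> elements with no matching </text>.
    """
    counts = Counter()
    is_open = False   # True while inside a <text> element
    unclosed = 0
    for line in lines:
        line = line.strip()
        match = TEXT_OPEN.match(line)
        if match:
            counts[match.group(1)] += 1
            if is_open:
                unclosed += 1   # previous <text> was never closed
            is_open = True
        elif line == "</text>":
            is_open = False
    if is_open:
        unclosed += 1           # last <text> still open at end of input
    duplicates = {text_id: n for text_id, n in counts.items() if n > 1}
    return duplicates, unclosed

if __name__ == "__main__":
    sample = [
        '<text id="10038" year="" error="0.444444">', "Hallo", "</text>",
        '<text id="10038" year="" error="0.461538">', "Welt", "</text>",
        '<text id="6570" year="">', "offen",
    ]
    print(check_text_elements(sample))
```

For a real corpus the lines would come from the open file object rather than a list, so the whole file never has to be held in memory.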


best


Andrew.


From: cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it] On Behalf Of Andres Chandia
Sent: 30 January 2014 11:25
To: Open source development of the Corpus WorkBench
Subject: Re: [CWB] CWB Digest, Vol 84, Issue 20
 
I attach the results:
 
 
On Thu, January 30, 2014 10:58, Hardie, Andrew wrote:

 

 
>> Can I populate this table some other way?

No. You need to find out what is going wrong when you do it this way, or any
other way would fail as well.

Can you check that the necessary text ids are properly encoded, i.e. do a CQP-syntax query for

[]

and see if it returns the first word of text 10038, as expected.

best

Andrew.

From: cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it] On Behalf Of Andres Chandia
Sent: 29 January 2014 17:22
To: Open source development of the Corpus WorkBench
Subject: Re: [CWB] CWB Digest, Vol 84, Issue 20

Hi there,

I am reviving this issue because something similar is happening to me now, though the
corpus is bigger: 18 GB.

When I try to populate the text metadata table with begin/end offset positions, I get no
results. The table is created, but all the values look like this:

| text_id | words | cqp_begin | cqp_end |
| 10038   | 0     | 0         | 0       |
| 6570    | 0     | 0         | 0       |
| 4099    | 0     | 0         | 0       |
| 9887    | 0     | 0         | 0       |
| 819     | 0     | 0         | 0       |
| 4910    | 0     | 0         | 0       |
| 7669    | 0     | 0         | 0       |
| 2889    | 0     | 0         | 0       |
| 9627    | 0     | 0         | 0       |
| 5265    | 0     | 0         | 0       |
| 1076    | 0     | 0         | 0       |
| 6196    | 0     | 0         | 0       |
| 4213    | 0     | 0         | 0       |
| 1212    | 0     | 0         | 0       |
| 4688    | 0     | 0         | 0       |

So the interface for this corpus always says: "The text metadata table has not
yet been populated with begin/end offset positions."

I have put the interface in debug mode, and here is what I get: sdewac-debug.7z

Can I populate this table some other way?

 


 
 
 
_______________________
andrés chandía

administrator of
parles.upf.edu
psicoaching.net
mapuche koyaktu
ong mapuche koyaktu
Do not print unnecessarily. Protect the environment!


 

