[CWB] Where to start looking for the problem?

Hardie, Andrew a.hardie at lancaster.ac.uk
Wed Apr 20 16:47:49 CEST 2022


Hi Andrés,

Wow, what a pile of issues!

The key point is this one:

          There is a corpus that I've been told was working correctly, but all of a sudden queries started to give back [UNREADABLE]

This can't happen without cause. Something must have changed. The range of other error messages suggests that the setup of the operating system has changed. Perhaps it has been upgraded, or default settings restored. In any case, the issues would seem to be with configuration of either the filesystem, or the MySQL daemon, or both.
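
As a rough illustration of the filesystem side of this, here is a minimal Python sketch that reports whether a set of directories exists and is writable by the user running the script. The paths in it are placeholders (assumptions, not taken from any real setup); substitute the data, registry, and temporary directories named in your own config file. Ownership by www-data is not by itself proof that the web server can write there, since mode bits and parent directories also matter, so the check is only meaningful when run as the web server user.

```python
import os


def check_writable(directories):
    """Return a dict mapping each directory to 'ok', 'missing' or 'not writable'."""
    report = {}
    for path in directories:
        if not os.path.isdir(path):
            report[path] = "missing"
        elif not os.access(path, os.W_OK | os.X_OK):
            # W_OK: may create files here; X_OK: may enter the directory.
            report[path] = "not writable"
        else:
            report[path] = "ok"
    return report


if __name__ == "__main__":
    # Hypothetical paths: replace with the directories from your own config file.
    candidates = ["/path/to/cwb/datadir", "/path/to/cwb/registry", "/path/to/cqpweb/tempdir"]
    for path, status in check_writable(candidates).items():
        print(f"{status:>12}  {path}")
```

Running it as the web server account (for example via `sudo -u www-data python3 check_dirs.py`, where the script name is arbitrary) gives the view that matters; running it as root would report everything as writable.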

          So, my question is where and how should I start to try to find out the cause of this problem?

Turn on showing debug messages in the config file. This should help.

Note that "[UNREADABLE]" is caused by failure to parse concordance output from CQP. So, seeing what is actually being passed back and forth will probably give you a hint.
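
To make the idea of a parse failure concrete, here is a small Python sketch. It is not CQPweb's actual code: it assumes, purely for illustration, that each token in a concordance line arrives as "word/tag" and that anything not matching that shape is replaced by a placeholder. The pattern and the function name `split_token` are hypothetical.

```python
import re

# Assumed token shape for this sketch: "word/tag", split on the LAST slash,
# so that a word which itself contains a slash is still handled.
TOKEN_PATTERN = re.compile(r"\A(.*)/(.*?)\Z", re.DOTALL)


def split_token(token):
    """Split 'word/tag' into (word, tag); fall back to a placeholder on failure."""
    match = TOKEN_PATTERN.match(token)
    if match is None:
        # The token does not have the expected shape, so it cannot be displayed.
        return "[UNREADABLE]", ""
    return match.group(1), match.group(2)


if __name__ == "__main__":
    for raw in ["king/NN1", "and/or/CC", "king"]:
        print(repr(raw), "->", split_token(raw))
```

The point of the sketch is only that a placeholder like this signals that the data coming back has a different shape from what the parser expects, which is why looking at the raw exchange in the debug messages is the useful next step.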

BTW, while 3.2.27 is old, it's not that old relative to the latest in the branch.

best

Andrew.

From: cwb-bounces at sslmit.unibo.it <cwb-bounces at sslmit.unibo.it> On Behalf Of "Andrés Chandía"
Sent: 20 April 2022 14:59
To: Open source development of the Corpus WorkBench <cwb at sslmit.unibo.it>
Subject: [CWB] Where to start looking for the problem?

Hi there,

At a cqpweb installation (CQPweb v3.2.27)

There is a corpus that I've been told was working correctly, but all of a sudden queries started to give back [UNREADABLE]

Like:

[UNREADABLE] [UNREADABLE] [UNREADABLE] [UNREADABLE] [UNREADABLE] [UNREADABLE] [UNREADABLE] [UNREADABLE] [UNREADABLE][UNREADABLE] <https://eur02.safelinks.protection.outlook.com/?url=http%3A%2F%2Fcorptedig-glif.upf.edu%2Fcqpweb%2Fcoca%2Fcontext.php%3Fbatch%3D0%26qname%3Dh1feli3mdi%26uT%3Dy&data=05%7C01%7Chardiea%40live.lancs.ac.uk%7Cbed4548d504846ec1b3708da22d85d65%7C9c9bcd11977a4e9ca9a0bc734090164a%7C0%7C0%7C637860611089742393%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=Gz%2BTXbGVXeC4K4iuxGWiuYz%2FeL63Mp1BsUpKhXgO2eY%3D&reserved=0> [UNREADABLE]<https://eur02.safelinks.protection.outlook.com/?url=http%3A%2F%2Fcorptedig-glif.upf.edu%2Fcqpweb%2Fcoca%2Fcontext.php%3Fbatch%3D0%26qname%3Dh1feli3mdi%26uT%3Dy&data=05%7C01%7Chardiea%40live.lancs.ac.uk%7Cbed4548d504846ec1b3708da22d85d65%7C9c9bcd11977a4e9ca9a0bc734090164a%7C0%7C0%7C637860611089742393%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=Gz%2BTXbGVXeC4K4iuxGWiuYz%2FeL63Mp1BsUpKhXgO2eY%3D&reserved=0> [UNREADABLE] [UNREADABLE] [UNREADABLE] [UNREADABLE] [UNREADABLE] [UNREADABLE] [UNREADABLE] [UNREADABLE] [UNREADABLE] [UNREADABLE] [UNREADABLE] [UNREADABLE] [UNREADABLE] [UNREADABLE] [UNREADABLE] [UNREADABLE] [UNREADABLE]

So, my question is where and how should I start to try to find out the cause of this problem?

Things I've done/noticed:

If I click on the linked results and then switch to the alternative view (pos), I get:

i ge mc fo ) rr , rr21 rr22 , pn1 vvd np1 np1 , cc np1 np1 vvd ii21 ii22 nn1 ii21 ii22 nnt1 cs pphs1 vdd xx vvi to vbi rl to vvi ppge at _appge - appge nn1 vvg dd1 nn1 .

rr pph1 vbdz xx rr at1 nn1 nnt1 .

ccb md - md_nnt1 nn2 vh0 vbn - vh0 vbn jj .

cs ppis1 vbdr np1 , ppis1 vm xx vbi vvg_jj@ nn1 ii dd1 io dd2 nn2 .


If I go to the Admin Control Panel I see:

Corpus: coca | Indexing date: 2019-04-11 15:29:37 | Size Tokens: 0 | Types: 1,274,893 | Texts: 0 | Disk space Indexes: 0.0 MB | Freq tables: 57.4 MB

Frequency list search shows results correctly:
1 king <http://corptedig-glif.upf.edu/cqpweb/coca/concordance.php?theData=%5Bword%3D%22king%22%25c%5D&qmode=cqp&uT=y> 11,825

and all the words derived from king...

A word lookup fails, giving this output: A MySQL query did not run successfully!

Original query: select count(concat(node,'_',tagnode)) as tokens, count(distinct(concat(node,'_',tagnode))) as types from db_sort_h1fel9jyg4 /* from User: admin | Function: require() | 2022-Apr-20 13:37:27 */

Error # 1054: Unknown column 'tagnode' in 'field list'

Generate CWB text-position records outputs: A MySQL query did not run successfully!
Original query: insert into `___temp_cqp_text_positions_for_coca` (text_id, cqp_begin, cqp_end) VALUES ('4122770', 0, 5307), .... etc. a lot of numbers...
Error # 1062: Duplicate entry '4166449' for key 'PRIMARY'

Update Word and file counts outputs "the connections has expired"; on retrying, the page keeps loading until it is done.

Recreate CWB Frequency table outputs: "the connections has expired"; on retrying I get: CQPweb could not create a directory for the frequency index. Check filesystem permissions!

I've checked the directories and all of them are owned by www-data


Recreate Frequency tables outputs: "the connections has expired"; on retry it finally outputs:


A MySQL query did not run successfully!

Original query: CREATE TABLE __tempfreq_coca ( freq int(11) unsigned default NULL, word varchar(255) NOT NULL, lemma varchar(255) NOT NULL, pos varchar(255) NOT NULL, key (word), key (lemma), key (pos) ) CHARACTER SET utf8 COLLATE utf8_general_ci /* from User: admin | Function: corpus_make_freqtables() | 2022-Apr-20 13:53:57 */

Error # 1050: Table '__tempfreq_coca' already exists

I will appreciate any guidance, thanks a lot!!



_______________________
            andrés chandía