[CWB] Can't generate text-by-text freq lists?

Arthur Wang arthur0421 at gmail.com
Sun Jul 9 23:02:29 CEST 2017


Hi Andrew,

Mine is a learner corpus. If I click "Manage text metadata", I see two 
field handles, "major" and "year"; both are of the Classification datatype.

"Manage text categories": I see the usual forms asking me to insert or 
update text category descriptions... By the way, the categories in both 
classification schemes, "major" and "year", contain only digits and no 
letters (is this a problem?).

My checkout is the trunk...
svn co http://svn.code.sf.net/p/cwb/code/gui/cqpweb/trunk cqpweb
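
To compare a working copy against the commit number mentioned in the quoted message below (924), one option is to read the "Revision:" line that `svn info` prints. A minimal Python sketch, assuming that output format; the function names and the sample text are illustrative and not part of CQPweb:

```python
import re

# Commit number quoted in the thread; anything later is reported as affected.
BROKEN_AFTER = 924


def working_copy_revision(svn_info_text):
    """Return the integer on the 'Revision:' line of `svn info` output, or None."""
    match = re.search(r"^Revision:\s*(\d+)\s*$", svn_info_text, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


def is_later_than(svn_info_text, commit=BROKEN_AFTER):
    """True if the working copy revision is strictly greater than `commit`."""
    revision = working_copy_revision(svn_info_text)
    if revision is None:
        raise ValueError("no 'Revision:' line found in the given text")
    return revision > commit


# Made-up sample output; the revision numbers here are for illustration only.
SAMPLE = "Path: cqpweb\nRevision: 930\nLast Changed Rev: 928\n"

if __name__ == "__main__":
    print(is_later_than(SAMPLE))
```

In practice the text would come from running `svn info` inside the checked-out cqpweb directory.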

Best,
Jiayue

On 09/07/17 21:36, Hardie, Andrew wrote:
> Hi Jiayue,
> 
> On some further thought, looking back at your original report, it sounds as if the frequency-table setup is not actually the problem. It's to do with the distribution function and the metadata setup, I think.
> 
> Can you check the following things?
> 
> - Which checkout is your code? In particular, the file distribution.inc.php is currently broken if you have anything later than commit #924 ...
> 
> - What appears when you go to the corpus menu and click "Manage text metadata" / "Manage text categories"?
> 
> (Especially - the datatypes in the former.)
> 
> best
> 
> Andrew.
> 
> -----Original Message-----
> From: cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it] On Behalf Of Arthur Wang
> Sent: 08 July 2017 14:10
> To: Open source development of the Corpus WorkBench
> Subject: Re: [CWB] Can't generate text-by-text freq lists?
> 
> Hi Andrew
> 
> Thanks for the reply. But none of these are missing. My corpus is called
> "gxun_grad" (tagged with TreeTagger), and in MySQL I have all of the
> following tables:
> 
> text_metadata_for_gxun_grad
> freq_text_index_gxun_grad
> freq_corpus_gxun_grad_lemma
> freq_corpus_gxun_grad_pos
> freq_corpus_gxun_grad_word
> 
> The CWB folders are in my home folder. In "index" there are:
> 
> gxun_grad
> gxun_grad__freq
> 
> In "registry" there are:
> 
> gxun_grad
> gxun_grad__freq
> 
> I have reinstalled the corpus several times, but the problem remains. What
> else should I look at?
> 
> Best
> Jiayue
> 
> On 08/07/17 12:26, Hardie, Andrew wrote:
>> I suggest you check in MySQL which tables actually exist.
>>
>> You should have the following tables :
>>
>> text_metadata_for_CORPUS
>> freq_text_index_CORPUS
>> freq_corpus_CORPUS_word
>>        .... plus one more like the above for every additional p-attribute.
>>
>> You should also have a CWB corpus called "CORPUS__freq" in your index data directory and a corresponding registry file in the CQPweb registry directory.
>>
>> If you can identify which of these pieces of data is missing, it will be easier to identify what has gone wrong.
>>
>> best
>>
>> Andrew.
>>
>> -----Original Message-----
>> From: cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it] On Behalf Of Arthur Wang
>> Sent: 07 July 2017 09:26
>> To: Open source development of the Corpus WorkBench
>> Subject: [CWB] Can't generate text-by-text freq lists?
>>
>> Hi,
>>
>> I recently installed a 1-million-word corpus in CQPweb (v3.2.26) together
>> with its metadata (TSV), and then told CQPweb to auto-generate the freq
>> lists. Everything looked fine.
>>
>> But then I found that the text freq lists were not actually generated:
>> "Distribution" shows zero for "Hits in category", "Dispersion" and
>> "Frequency", and I can't search by category at all. I checked my metadata
>> file, and it is perfectly OK.
>>
>> Then I tried generating the text/category freq lists manually, with no
>> luck either.
>>
>> What are the possible reasons for text freq lists to fail to be
>> generated? Thanks for any clue.
>>
>> Jiayue
>> _______________________________________________
>> CWB mailing list
>> CWB at sslmit.unibo.it
>> http://liste.sslmit.unibo.it/mailman/listinfo/cwb
>>
> 
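
The table check described in the quoted messages above (one metadata table, one text index table, and one frequency table per p-attribute) can be sketched in Python. This is a minimal illustration built only from the table names given in the thread; the function names are hypothetical, and the list of existing tables is assumed to have been obtained separately, for example from MySQL's `SHOW TABLES`:

```python
def expected_tables(corpus, p_attributes=("word",)):
    """List the MySQL table names the thread says should exist for a corpus."""
    tables = [
        f"text_metadata_for_{corpus}",
        f"freq_text_index_{corpus}",
    ]
    # One corpus-wide frequency table per p-attribute (word, lemma, pos, ...).
    tables += [f"freq_corpus_{corpus}_{att}" for att in p_attributes]
    return tables


def missing_tables(corpus, p_attributes, existing):
    """Return the expected tables that are absent from `existing`."""
    present = set(existing)
    return [t for t in expected_tables(corpus, p_attributes) if t not in present]


# Table names as reported in the thread for the corpus "gxun_grad".
REPORTED = [
    "text_metadata_for_gxun_grad",
    "freq_text_index_gxun_grad",
    "freq_corpus_gxun_grad_lemma",
    "freq_corpus_gxun_grad_pos",
    "freq_corpus_gxun_grad_word",
]

if __name__ == "__main__":
    print(missing_tables("gxun_grad", ("word", "lemma", "pos"), REPORTED))
```

With the tables reported in the thread, nothing comes back as missing, which is consistent with the later suggestion that the frequency-table setup is not the cause.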

-- 
Jiayue Wang
College of Foreign Studies
Guangxi University for Nationalities
Nanning, China 530006

