<div dir="ltr"><div dir="ltr">On Sat, May 25, 2019 at 2:20 PM Hardie, Andrew <<a href="mailto:a.hardie@lancaster.ac.uk">a.hardie@lancaster.ac.uk</a>> wrote:<br></div><div dir="ltr"><br></div><div>Hi Andrew,</div><div><br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div lang="EN-GB">
<div class="gmail-m_-8299871906138635900WordSection1">
<p class="MsoNormal"><span style="color:rgb(31,73,125);font-family:Verdana,sans-serif;font-size:10pt">One possibility is that the wrong charset/collation is being activated for the frequency tables. Could you check this?</span><br></p>
<p class="MsoNormal"><span style="color:rgb(31,73,125);font-family:Verdana,sans-serif;font-size:10pt">If you run </span><span style="color:rgb(31,73,125);font-family:Verdana,sans-serif;font-size:10pt">show create table freq_corpus_</span><b style="color:rgb(31,73,125);font-family:Verdana,sans-serif;font-size:10pt">nameofyrcorpus</b><span style="color:rgb(31,73,125);font-family:Verdana,sans-serif;font-size:10pt">_</span><span style="color:rgb(31,73,125);font-family:Verdana,sans-serif;font-size:10pt">word; </span><span style="color:rgb(31,73,125);font-family:Verdana,sans-serif;font-size:10pt">at the mysql command prompt, then the character set / collation should be stated either for the table as a whole, or for the “item”
column.</span></p></div></div></blockquote><div><br></div><div>That shows "<span style="color:rgb(0,0,0);font-family:monospace">ENGINE=InnoDB DEFAULT CHARSET=utf8</span>". All my source texts are UTF-8, and the database was created with that character set too, by the way.</div><div><br></div><div>Cheers,</div><div>Scott</div></div></div>