<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:Verdana;
        panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
        {font-family:Aptos;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0cm;
        font-size:12.0pt;
        font-family:"Aptos",sans-serif;}
p.errormessage, li.errormessage, div.errormessage
        {mso-style-name:errormessage;
        mso-margin-top-alt:auto;
        margin-right:0cm;
        mso-margin-bottom-alt:auto;
        margin-left:0cm;
        font-size:12.0pt;
        font-family:"Aptos",sans-serif;}
span.EmailStyle19
        {mso-style-type:personal-reply;
        font-family:"Verdana",sans-serif;
        color:#156082;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;
        mso-ligatures:none;}
@page WordSection1
        {size:612.0pt 792.0pt;
        margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
        {page:WordSection1;}
--></style>
</head>
<body lang="EN-GB" link="#467886" vlink="#96607D" style="word-wrap:break-word">
<div class="WordSection1">
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:'Verdana',sans-serif;color:#156082;mso-fareast-language:EN-US">The CQP parser rejects corpus IDs that start with a digit. That is the source of the problem (“</span>&lt;1984_1&gt; is not a valid corpus name<span style="font-size:10.0pt;font-family:'Verdana',sans-serif;color:#156082;mso-fareast-language:EN-US">”):
CQP raises this error as soon as it starts up and reads your registry.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:'Verdana',sans-serif;color:#156082;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:'Verdana',sans-serif;color:#156082;mso-fareast-language:EN-US">The immediate solution for you: manually delete the file
</span>/var/cqpweb/registry/1984_1<span style="font-size:10.0pt;font-family:'Verdana',sans-serif;color:#156082;mso-fareast-language:EN-US"><o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:'Verdana',sans-serif;color:#156082;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:'Verdana',sans-serif;color:#156082;mso-fareast-language:EN-US">Then reindex with a corpus ID that starts with a letter.
<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:'Verdana',sans-serif;color:#156082;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:'Verdana',sans-serif;color:#156082;mso-fareast-language:EN-US">Long-term, this is a bug, because (a) neither the CQPweb UI nor the backend checks for IDs that start with a digit before feeding
them to cwb-encode; and (b) cwb-encode doesn’t check the registry filename, which determines the corpus ID, to make sure it starts with a letter. I’ll see about fixing those.
<o:p></o:p></span></p>
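<p class="MsoNormal"><span style="font-size:10.0pt;font-family:'Verdana',sans-serif;color:#156082;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:'Verdana',sans-serif;color:#156082;mso-fareast-language:EN-US">As a rough illustration of the kind of check missing in (a) and (b), here is a minimal Python sketch. It is not code from CQPweb or CWB: the function name corpus_id_starts_with_letter is hypothetical, and it assumes the only rule being tested is the one described above, namely that the ID must begin with an ASCII letter. The real tools may well restrict corpus IDs further.<o:p></o:p></span></p>

```python
import string


def corpus_id_starts_with_letter(corpus_id):
    """Return True if the ID is non-empty and begins with an ASCII letter.

    Assumption: this mirrors only the "must start with a letter" rule
    discussed in this thread, not the full set of rules CWB may apply.
    """
    return bool(corpus_id) and corpus_id[0] in string.ascii_letters


if __name__ == "__main__":
    # "1984_1" is the ID from the thread; "orwell_1984_1" is a made-up alternative.
    for candidate in ("1984_1", "orwell_1984_1"):
        if corpus_id_starts_with_letter(candidate):
            print(candidate, "- accepted")
        else:
            print(candidate, "- rejected: does not start with a letter")
```

<p class="MsoNormal"><span style="font-size:10.0pt;font-family:'Verdana',sans-serif;color:#156082;mso-fareast-language:EN-US">With a check along these lines run before indexing, 1984_1 would be rejected up front, while an ID such as orwell_1984_1 (an invented example) would pass.<o:p></o:p></span></p>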
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:'Verdana',sans-serif;color:#156082;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:'Verdana',sans-serif;color:#156082;mso-fareast-language:EN-US">Best<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:'Verdana',sans-serif;color:#156082;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:'Verdana',sans-serif;color:#156082;mso-fareast-language:EN-US">Andrew.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:'Verdana',sans-serif;color:#156082;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:'Verdana',sans-serif;color:#156082;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm;font-size:pt">
<p class="MsoNormal"><b><span style="font-family:'Calibri',sans-serif">From:</span></b><span style="font-family:'Calibri',sans-serif"> cwb-bounces@sslmit.unibo.it &lt;cwb-bounces@sslmit.unibo.it&gt;
<b>On Behalf Of </b>Graham Ranger -- UAPV<br>
<b>Sent:</b> 14 May 2025 20:09<br>
<b>To:</b> Open source development of the Corpus WorkBench &lt;cwb@sslmit.unibo.it&gt;<br>
<b>Subject:</b> [CWB] Installation of a corpus and subsequent problems...<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<p class="MsoNormal">Hello to everyone,<br>
The title says it all, or almost.<br>
<br>
1) I attempted to install a corpus with a set of XML tags, etc., but ran into an error;<br>
2) I then attempted to install a mini-corpus, in an effort at debugging, and ran into the same error (something about extra material after XML tags, repeated for every line of the corpus) -- I can't be more precise for reasons which will soon become clear;<br>
3) I then attempted to delete the corpus which, although not created, was occupying a registry entry, and now have another error message: "**** CQP ERROR **** cl_new_corpus: &lt;1984_1&gt; is not a valid corpus name REGISTRY ERROR (/var/cqpweb/registry/1984_1): syntax
error REGISTRY ERROR (/var/cqpweb/registry/1984_1): Error parsing the main Registry structure. CQPweb encountered an error and could not continue."<br>
4) I am now unable to execute any queries or do anything much with CQPweb... On executing a query, for example, I get this error message:<o:p></o:p></p>
<p class="errormessage">CQP reports an error! The CQP program sent back these error messages:<o:p></o:p></p>
<p class="errormessage">**** CQP ERROR ****<o:p></o:p></p>
<p class="errormessage">CQP Error:<o:p></o:p></p>
<p class="errormessage">No corpus activated<o:p></o:p></p>
<p class="errormessage">CQP Error:<o:p></o:p></p>
<p class="errormessage">CQP Syntax Error: syntax error<o:p></o:p></p>
<p class="errormessage">[r] Registry &lt;--<o:p></o:p></p>
<p class="errormessage">Ignoring subsequent input until next ';'...<o:p></o:p></p>
<p class="errormessage">I'm going to ask for the server to be restored to a previous state, which should provide a fix, but won't get me any further with installing the corpus I wished to set up. If there's a simpler way to repair the registry entries, I'd
be interested.<o:p></o:p></p>
<p class="errormessage">The "toy corpus" which managed to break the server was as follows:<o:p></o:p></p>
<p class="errormessage">&lt;text id="1984_1"&gt;<br>
&lt;title&gt;Nineteen eighty-four&lt;/title&gt;<br>
&lt;div1 type="part" n="1"&gt;<br>
&lt;head&gt;PART 1&lt;/head&gt;<br>
&lt;div2 type="chapter" n="1"&gt;<br>
&lt;head&gt;1&lt;/head&gt;<br>
&lt;p&gt;It was a bright cold day in April, and the clocks were striking thirteen. Winston Smith, his chin nuzzled into his breast in an effort to escape the vile wind, slipped quickly through the glass doors of Victory Mansions, though not quickly enough to prevent
a swirl of gritty dust from entering along with him.&lt;/p&gt;<br>
&lt;/div2&gt;<br>
&lt;/div1&gt;<br>
&lt;/text&gt;<o:p></o:p></p>
<p class="errormessage">Many thanks in advance for any help with this!<br>
Best regards,<br>
Graham.<o:p></o:p></p>
</div>
</div>
</body>
</html>