<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:Verdana;
        panose-1:2 11 6 4 3 5 4 4 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0cm;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;
        mso-ligatures:standardcontextual;
        mso-fareast-language:EN-US;}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:#0563C1;
        text-decoration:underline;}
span.EmailStyle20
        {mso-style-type:personal-reply;
        font-family:"Verdana",sans-serif;
        color:#1F497D;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}
@page WordSection1
        {size:612.0pt 792.0pt;
        margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
        {page:WordSection1;}
--></style>
</head>
<body lang="EN-GB" link="#0563C1" vlink="#954F72" style="word-wrap:break-word">
<div class="WordSection1">
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:'Verdana',sans-serif;color:#1F497D">Hi Mike,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:'Verdana',sans-serif;color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:'Verdana',sans-serif;color:#1F497D">The job queue exists to stop too many user-installed corpora from tying up the system all at once. Admin-installed corpora are assumed to be allowed to take up a full
CPU, a ton of RAM, etc. immediately, so they don’t go through that queue.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:'Verdana',sans-serif;color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:'Verdana',sans-serif;color:#1F497D">I think your firewall situation does, however, suggest that it would be good to move away from keeping the server/browser connection open throughout indexing
(basically, to have the process disconnect from the browser once indexing starts). I’ll add this as a feature request. However, it’s a big re-engineering of the UI, so it won’t happen soon.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:'Verdana',sans-serif;color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:'Verdana',sans-serif;color:#1F497D">best<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:'Verdana',sans-serif;color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:'Verdana',sans-serif;color:#1F497D">Andrew.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:'Verdana',sans-serif;color:#1F497D"><o:p> </o:p></span></p>
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span lang="EN-US" style="mso-ligatures:none;mso-fareast-language:EN-GB">From:</span></b><span lang="EN-US" style="mso-ligatures:none;mso-fareast-language:EN-GB"> cwb-bounces@sslmit.unibo.it &lt;cwb-bounces@sslmit.unibo.it&gt;
<b>On Behalf Of </b>Michael Lynch<br>
<b>Sent:</b> Friday, April 21, 2023 5:14 AM<br>
<b>To:</b> cwb@sslmit.unibo.it<br>
<b>Subject:</b> [CWB] Corpus installation - admin v user<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<p class="MsoNormal"><span lang="EN-AU">Hi all,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-AU"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-AU">I’ve been looking at ways around a problem with installing large corpora on our installation of CQPweb.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-AU"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-AU">Our university’s firewall cuts off web connections after around 20 seconds, which means that indexing and installing corpora over a certain size via the admin interface gets interrupted.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-AU"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-AU">The workaround for this so far has been to index and add metadata using command-line tools on the server, but I’d like to get installation working via the web interface.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-AU"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-AU">I’ve been looking through the CQPweb source, and have noticed that the process for user-installed corpora is managed using a job queue, which means that in theory it wouldn’t get interrupted by the firewall timeout.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-AU"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-AU">Is there any plan to rework the admin corpus installation code so that it uses the same queuing system?<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-AU"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-AU">Alternatively, are there any differences between user- and admin-installed corpora, in terms of the functionality available once they’re installed?<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-AU"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-AU">At the moment, we don’t allow users to upload corpora, but if we could grant installation privileges to our admin users so that they can install corpora using the user-installed system, it could be a way around the firewall
problem.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-AU"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-AU">Regards,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-AU">Mike<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-AU"><o:p> </o:p></span></p>
<div>
<div>
<p class="MsoNormal"><b><span lang="EN-AU" style="mso-ligatures:none;mso-fareast-language:EN-GB">Mike Lynch</span></b><span lang="EN-AU" style="mso-ligatures:none;mso-fareast-language:EN-GB"> (he/him) | Research Engineer Group Lead<o:p></o:p></span></p>
<p class="MsoNormal"><b><span lang="EN-AU" style="mso-ligatures:none;mso-fareast-language:EN-GB">The University of Sydney<o:p></o:p></span></b></p>
<p class="MsoNormal"><span lang="EN-AU" style="mso-ligatures:none;mso-fareast-language:EN-GB">Sydney Informatics Hub | Core Research Facilities<o:p></o:p></span></p>
<p class="MsoNormal"><b><span lang="EN-AU" style="mso-ligatures:none;mso-fareast-language:EN-GB">M</span></b><span lang="EN-AU" style="mso-ligatures:none;mso-fareast-language:EN-GB"> +61 478 872 039 |
<b>E </b><a href="mailto:m.lynch@sydney.edu.au">m.lynch@sydney.edu.au</a><b> </b><o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><span lang="EN-AU"><o:p> </o:p></span></p>
</div>
</div>
</body>
</html>