<html><head><meta http-equiv="content-type" content="text/html; charset=ISO-8859-1"><style>body { line-height: 1.5; }blockquote { margin-top: 0px; margin-bottom: 0px; margin-left: 0.5em; }body { font-size: 14px; font-family: Tahoma; color: rgb(0, 0, 0); line-height: 1.5; }</style></head><body>
<div><span></span>Thank you, Hardie! I understand that overwriting an existing index will break cached queries and other things. However, from an administrator's or user's point of view, it is quite normal to append new VRT files to an existing corpus and to keep using the same URL for the updated corpus. Having the URL change seems very weird.</div>
<div><br></div><hr style="width: 210px; height: 1px;" color="#b5c4df" size="1" align="left">
<div><span style="font-size: 10.6667px;"><div style="margin: 10px;"><font face="Verdana">Vincent Zhang</font></div></span></div>
<blockquote style="margin-Top: 0px; margin-Bottom: 0px; margin-Left: 0.5em; margin-Right: inherit"><div> </div><div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0cm 0cm 0cm"><div style="PADDING-RIGHT: 8px; PADDING-LEFT: 8px; FONT-SIZE: 12px;FONT-FAMILY:tahoma;COLOR:#000000; BACKGROUND: #efefef; PADDING-BOTTOM: 8px; PADDING-TOP: 8px"><div><b>From:</b> <a href="mailto:cwb-request@sslmit.unibo.it">cwb-request</a></div><div><b>Date:</b> 2023-10-16 20:32</div><div><b>To:</b> <a href="mailto:cwb@sslmit.unibo.it">cwb</a></div><div><b>Subject:</b> CWB Digest, Vol 199, Issue 7</div></div></div><div><div>Send CWB mailing list submissions to</div>
<div>        cwb@sslmit.unibo.it</div>
<div> </div>
<div>To subscribe or unsubscribe via the World Wide Web, visit</div>
<div>        http://liste.sslmit.unibo.it/mailman/listinfo/cwb</div>
<div>or, via email, send a message with subject or body 'help' to</div>
<div>        cwb-request@sslmit.unibo.it</div>
<div> </div>
<div>You can reach the person managing the list at</div>
<div>        cwb-owner@sslmit.unibo.it</div>
<div> </div>
<div>When replying, please edit your Subject line so it is more specific</div>
<div>than "Re: Contents of CWB digest..."</div>
<div> </div>
<div> </div>
<div>Today's Topics:</div>
<div> </div>
<div> 1. Re: How to append corpus data into an existing corpora?</div>
<div> (Hardie, Andrew)</div>
<div> </div>
<div> </div>
<div>----------------------------------------------------------------------</div>
<div> </div>
<div>Message: 1</div>
<div>Date: Mon, 16 Oct 2023 12:31:51 +0000</div>
<div>From: "Hardie, Andrew" &lt;a.hardie@lancaster.ac.uk&gt;</div>
<div>To: Open source development of the Corpus WorkBench</div>
<div>        &lt;cwb@sslmit.unibo.it&gt;</div>
<div>Subject: Re: [CWB] How to append corpus data into an existing corpora?</div>
<div>Message-ID:</div>
<div>        &lt;LO4P265MB3485B676F09D9441E65BC6AFCBD7A@LO4P265MB3485.GBRP265.PROD.OUTLOOK.COM&gt;</div>
<div>        </div>
<div>Content-Type: text/plain; charset="utf-8"</div>
<div> </div>
<div>I mean it cannot be done at all. You need to start over. As you indicate, because this:</div>
<div> </div>
<div>>> we can instead only run cwb-encode command to re-index and overwrite the existing corpora index</div>
<div> </div>
<div>= starting over. So it's starting over whether you do it via the web UI or the CLI.</div>
<div> </div>
<div>But overwriting the existing index is a bad idea, because any saved queries that referenced the index will still point there, but now they are no longer pointing at the same data.</div>
<div> </div>
<div>Better to have parallel names with a changeable suffix:</div>
<div> </div>
<div>mycorpus-01</div>
<div>mycorpus-02</div>
<div>...</div>
<div> </div>
<div>or</div>
<div> </div>
<div>mycorpus-20231015</div>
<div>mycorpus-20231016</div>
<div>...</div>
<div> </div>
<div>That way there will be no confusion about which corpus any given saved query is associated with (whether or not you opt to delete older indexes).</div>
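<div>A minimal sketch of such a naming scheme in Python, assuming the re-encoding is scripted. The helper names, file names and paths are hypothetical. The -d (data directory), -R (registry file) and -f (input file) options are assumed from the cwb-encode usage message, so check cwb-encode -h before relying on them. The command is only built here, not run.</div>

```python
from datetime import date
from pathlib import PurePosixPath


def versioned_name(base, when):
    """Return a corpus name with a date suffix, e.g. mycorpus-20231016."""
    return f"{base}-{when:%Y%m%d}"


def encode_command(base, when, vrt_files, data_root, registry_dir):
    """Build (but do not run) a cwb-encode call for a fresh, dated index.

    Every VRT file, old and new, is passed again: the index is rebuilt
    from scratch under a new name rather than appended to.
    """
    name = versioned_name(base, when)
    cmd = [
        "cwb-encode",
        "-d", str(PurePosixPath(data_root) / name),     # data directory for this version
        "-R", str(PurePosixPath(registry_dir) / name),  # registry file for this version
    ]
    for vrt in vrt_files:
        cmd += ["-f", str(vrt)]  # assumption: one -f option per input file
    return name, cmd


name, cmd = encode_command(
    "mycorpus",
    date(2023, 10, 16),
    ["old.vrt", "new.vrt"],   # hypothetical VRT files: the originals plus the new one
    "/corpora/data",          # hypothetical data root
    "/corpora/registry",      # hypothetical registry directory
)
print(name)
print(" ".join(cmd))
```

<div>Because each run writes to its own data directory and registry file, the older index stays untouched and saved queries that reference it keep pointing at the data they were run on.</div>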
<div> </div>
<div>best</div>
<div> </div>
<div>Andrew.</div>
<div> </div>
<div>From: cwb-bounces@sslmit.unibo.it <cwb-bounces@sslmit.unibo.it> On Behalf Of ???</div>
<div>Sent: Monday, October 16, 2023 12:46 PM</div>
<div>To: cwb@sslmit.unibo.it</div>
<div>Subject: Re: [CWB] CWB Digest, Vol 199, Issue 5</div>
<div> </div>
<div>Thank you, Andrew! Do you mean we cannot do it on the admin UI web page, and that we can instead only run cwb-encode command to re-index and overwrite the existing corpora index? If so, it really sucks. It cannot be done by adding more files via the web UI.</div>
<div> </div>
<div> </div>
<div> </div>
<div> </div>
<div>Vincent Zhang</div>
<div> </div>
<div>From: cwb-request@sslmit.unibo.it</div>
<div>Date: 2023-10-16 18:00:01</div>
<div>To: cwb@sslmit.unibo.it</div>
<div> </div>
<div>Subject: CWB Digest, Vol 199, Issue 5</div>
<div> </div>
<div>>Message: 1</div>
<div>>Date: Mon, 16 Oct 2023 13:59:39 +0800</div>
<div>>From: "wzzhang@shisu.edu.cn" &lt;wzzhang@shisu.edu.cn&gt;</div>
<div>>To: cwb &lt;cwb@sslmit.unibo.it&gt;</div>
<div>>Subject: [CWB] How to append corpus data into an existing corpora?</div>
<div>>Message-ID: &lt;202310161358581732745@shisu.edu.cn&gt;</div>
<div>>Content-Type: text/plain; charset="gb2312"</div>
<div>></div>
<div>>Hello everyone,</div>
<div>>I could not find any way to append a new VRT file to an existing corpus. If this feature is missing, how can a corpus be improved sustainably?</div>
<div>></div>
<div>>Vincent Zhang</div>
<div>>Institute of Corpus Studies and Applications, Shanghai International Studies University</div>
<div> </div>
<div> </div>
<div>>Message: 2</div>
<div>>Date: Mon, 16 Oct 2023 06:19:46 +0000</div>
<div>>From: "Hardie, Andrew" &lt;a.hardie@lancaster.ac.uk&gt;</div>
<div>>To: Open source development of the Corpus WorkBench &lt;cwb@sslmit.unibo.it&gt;</div>
<div>>Subject: Re: [CWB] How to append corpus data into an existing corpora?</div>
<div>>Message-ID: &lt;LO4P265MB3485AD0D1262A6549EBA62EECBD7A@LO4P265MB3485.GBRP265.PROD.OUTLOOK.COM&gt;</div>
<div>>Content-Type: text/plain; charset="us-ascii"</div>
<div>></div>
<div>>That's because you can't do it.</div>
<div>></div>
<div>>You have to create a new corpus index from your original files with your new files appended to them.</div>
<div>></div>
<div>>Each CWB index then corresponds to the state of your corpus at some particular moment in time. (This is actually desirable from the point of view of replicability of results.)</div>
<div>></div>
<div>>best</div>
<div>></div>
<div>>Andrew.</div>
<div> </div>
<div> </div>
<div> </div>
<div>_______________________________________________</div>
<div>CWB mailing list</div>
<div>CWB@sslmit.unibo.it</div>
<div>http://liste.sslmit.unibo.it/mailman/listinfo/cwb</div>
<div> </div>
<div> </div>
<div>End of CWB Digest, Vol 199, Issue 7</div>
<div>***********************************</div>
<div> </div>
</div></blockquote>
</body></html>