[CWB] Install corpus: is there a way to "select all files"?
Jiayue Wang
arthur0421 at gmail.com
Sun Nov 20 14:51:25 CET 2016
Oh I see, I never realized that a single file could contain more than
one <text>...</text> span. Thanks very much Andrew :)
Best,
Jiayue
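
A rough sketch of the single-file layout described in this thread: each input file is wrapped in its own <text id="..."> element and everything is appended to one output file. The function name wrap_texts, taking the ID from the file name, and the output name are assumptions made for illustration, not anything CQPweb prescribes. It also assumes the files are already in whatever format your indexing step expects.

```shell
#!/bin/sh
# Hypothetical helper (not part of CWB or CQPweb): wrap every *.txt file
# in a directory in its own <text id="..."> element and collect all of
# them in one output file.
# Assumption: the text ID is the file name without ".txt". Check which
# characters your setup accepts in text IDs before relying on this.
wrap_texts() {
    indir=$1
    outfile=$2                          # keep this outside the *.txt glob
    : > "$outfile"                      # start with an empty output file
    for f in "$indir"/*.txt; do
        [ -e "$f" ] || continue         # no match: the glob stays literal
        id=$(basename "$f" .txt)
        {
            printf '<text id="%s">\n' "$id"
            cat "$f"
            # Add a newline only if the file does not already end with one,
            # so that </text> always starts on a line of its own.
            if [ -n "$(tail -c 1 "$f")" ]; then printf '\n'; fi
            printf '</text>\n'
        } >> "$outfile"
    done
}
```

Called as, say, `wrap_texts ./essays MyBigInputFile`, this leaves a single file to tick in the install form. File names containing quotes or ampersands would produce broken tags, so they would need renaming first.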
On 20/11/16 13:33, Hardie, Andrew wrote:
> CQPweb (like CWB generally) doesn't care whether the data comes in
> one file or many. It cares about the <text> elements. It doesn't
> matter whether you have many input files each with a <text id="XXX">
> ... </text> covering the whole file, or a single input file with many
> <text id="XXX"> ... </text> spans lined up one after another. The
> outcome is 100% the same.
>
> If you have metadata relating to the texts (i.e. your Name, Age,
> Major fields) then you can import it into the system two ways. (1)
> put the metadata in a tab-delimited text file with text IDs in the
> first column; upload this file; use it to set up text metadata. OR,
> (2) have the metadata as additional attributes on the <text> element;
> declare these when indexing; then import the text metadata from the
> resulting s-attributes.
>
> So, in short, it's not impossible at all!
>
> best
>
> Andrew.
>
> -----Original Message-----
> From: cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it] On Behalf Of Jiayue Wang
> Sent: 20 November 2016 13:27
> To: Open source development of the Corpus WorkBench
> Subject: Re: [CWB] Install corpus: is there a way to "select all files"?
>
> Thanks Andrew, that's what I did in the end. In fact the 1000+ files
> are student essays, each of which comes with info such as Name, Age,
> Major and so on, so I had hoped to keep them as separate files. With
> just one or a few (concatenated) files, I guess annotating such
> individual properties would be impossible? (The few corpus files I
> finally used contain none of those properties.)
>
> Best,
> Jiayue
>
> On 20/11/16 11:02, Hardie, Andrew wrote:
>>
>>
>> -----Original Message-----
>> From: Hardie, Andrew
>> Sent: 18 November 2016 12:55
>> To: Open source development of the Corpus WorkBench
>> Subject: RE: [CWB] Install corpus: is there a way to "select all files"?
>>
>> Easiest solution: concatenate the files together (on the command
>> line using cat). Then you only have to tick one little checkbox.
>>
>> e.g.
>>
>> cat *.txt > MyBigInputFile
>>
>> or whatever.
>>
>> best
>>
>> Andrew.
>>
>>
>> -----Original Message-----
>> From: cwb-bounces at sslmit.unibo.it [mailto:cwb-bounces at sslmit.unibo.it] On Behalf Of Jiayue Wang
>> Sent: 18 November 2016 10:00
>> To: Open source development of the Corpus WorkBench
>> Subject: Re: [CWB] Install corpus: is there a way to "select all files"?
>>
>> Sorry I forgot to mention that I was working on CQPweb.
>>
>> After selecting all the files and clicking Install, CQPweb told me
>> that my request exceeded the max URL length. So what should I do if
>> I want to install such corpora?
>>
>> Any help will be much appreciated.
>>
>> Jiayue
>>
>> On 18/11/16 09:37, Jiayue Wang wrote:
>>> I'm trying to install a corpus of more than a thousand files. Is
>>> there a way to select all the listed files at once, without having
>>> to click the little checkboxes one by one?
>>>
>>> Jiayue
>> _______________________________________________ CWB mailing list
>> CWB at sslmit.unibo.it
>> http://liste.sslmit.unibo.it/mailman/listinfo/cwb
>>
>
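
For option (1) in Andrew's reply (a tab-delimited metadata file with text IDs in the first column), it can help to check before uploading that every <text id="..."> in the corpus file has a matching metadata row. The following is only a sketch: the function name missing_metadata and both file layouts are assumptions made for illustration, and it expects each opening <text> tag to start a line with id as its first attribute.

```shell
#!/bin/sh
# Hypothetical check (not part of CWB or CQPweb): print every text ID
# that occurs in the corpus file but has no row in the metadata table.
missing_metadata() {
    corpus=$1      # file containing <text id="..."> ... </text> spans
    meta=$2        # tab-delimited table, text ID in the first column
    ids=$(mktemp)
    rows=$(mktemp)
    # Pull the id value out of every opening <text id="..."> tag.
    sed -n 's/^<text id="\([^"]*\)".*/\1/p' "$corpus" | sort -u > "$ids"
    # Take the first column of the metadata table.
    cut -f 1 "$meta" | sort -u > "$rows"
    # comm -23 keeps the lines that appear only in the first file.
    comm -23 "$ids" "$rows"
    rm -f "$ids" "$rows"
}
```

An empty result from `missing_metadata MyBigInputFile metadata.txt` (both names are placeholders) means every text in the corpus file has a metadata row.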