[CWB] CWB Digest, Vol 95, Issue 6

Serge Heiden slh at ens-lyon.fr
Wed Dec 17 18:04:35 CET 2014


Hi Ingrid,

Another way to get your corpus into CQP is to import it first into the 
TXM* software (which includes a library version of CQP). The built-in 
'XML/w+CSV' import module can do the job directly.

I attach to this email the resulting TXM binary version of your excerpt 
corpus (produced by exporting it from TXM once imported).
This file is a ZIP archive in which you will find in particular:
- /excerpt/wtc/excerpt.wtc: a .vrt version of your corpus (that you can 
use directly for CQP)
- /excerpt/txm/EXCERPT/excerpt_xmlcorpora.xml: a hybrid 'source XML' / 
XML-TEI TXM-encoded version of your original source corpus that you may 
use in TEI-friendly tools (like TXM)
- /excerpt/data/EXCERPT: all the CQP index files (word.lexicon.srt, 
word.lexicon.idx, etc.)
- /excerpt/registry/excerpt: the corresponding CQP registry file 
(generated by 'cwb-encode')
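
Since the .txm file is an ordinary ZIP archive, its parts can also be 
pulled out without TXM. The following is only a minimal sketch using 
Python's standard zipfile module; the function names list_members and 
extract_matching are illustrative and not part of TXM or CWB, and the 
archive layout is assumed to be the one listed above.

```python
import sys
import zipfile


def list_members(archive_path):
    """Return the names of all entries in a ZIP archive (e.g. a .txm file)."""
    with zipfile.ZipFile(archive_path) as zf:
        return zf.namelist()


def extract_matching(archive_path, suffix, dest_dir):
    """Extract only the entries whose name ends with `suffix`.

    Returns the list of extracted entry names, in archive order.
    """
    with zipfile.ZipFile(archive_path) as zf:
        names = [n for n in zf.namelist() if n.endswith(suffix)]
        zf.extractall(dest_dir, members=names)
        return names


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python list_txm.py excerpt.txm
    for name in list_members(sys.argv[1]):
        print(name)
```

For example, extract_matching("excerpt.txm", ".wtc", "out") should copy 
only the .vrt-style file into out/. If you reuse the registry file with 
your own CQP installation, the data path recorded in it will probably 
need to be adjusted to wherever you unpack the index files.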

Best,
Serge
__________
Note:
*    http://sourceforge.net/projects/txm

On 17/12/2014 14:00, Ingrid Sör wrote:
> Thanks for your reply Ruprecht.
> I am sending you a short excerpt of the beginning of one corpus, as I 
> can't find information regarding if they are TEI or not and can't tell 
> myself. If you can see that it is TEI, I would be very happy to try 
> your XSLT script - very kind of you to share your code.
>
> Best, Ingrid
>
>
> On 17 December 2014 at 12:21, Ruprecht von Waldenfels 
> <waldenfels at issl.unibe.ch> wrote:
>
>     Hi,
>     if this is TEI, I can send you my XSLT script.
>     Best,
>     Ruprecht
>     On 17.12.2014 at 12:00, cwb-request at sslmit.unibo.it wrote:
>>
>>     Today's Topics:
>>
>>         1. Bug report-CQPweb 3.1.11 (Umut Demirhan)
>>         2. Re: Bug report-CQPweb 3.1.11 (Hardie, Andrew)
>>         3. xml files (Ingrid Sör)
>>
>>
>>     _______________________________________________
>>     CWB mailing list
>>     CWB at sslmit.unibo.it  <mailto:CWB at sslmit.unibo.it>
>>     http://devel.sslmit.unibo.it/mailman/listinfo/cwb

-- 
Dr. Serge Heiden, slh at ens-lyon.fr, http://textometrie.ens-lyon.fr
ENS de Lyon/CNRS - ICAR UMR5191, Institut de Linguistique Française
15, parvis René Descartes 69342 Lyon BP7000 Cedex, tél. +33(0)622003883

-------------- next part --------------
A non-text attachment was scrubbed...
Name: excerpt.txm
Type: application/octet-stream
Size: 47700 bytes
Desc: not available
URL: <http://devel.sslmit.unibo.it/pipermail/cwb/attachments/20141217/66631b3a/attachment-0001.obj>

