[CWB] Parallel corpora in CQPweb

Ruprecht von Waldenfels ruprecht.waldenfels at gmx.net
Mon Apr 11 18:44:51 CEST 2016


Hi,
the ParaVoz2 interface won't help you with aligning the corpus, but 
once you have an aligned corpus it should be trivial to set up a 
working interface on a LAMP server, and probably also on a Windows 
server: it just needs CWB, PHP and JavaScript.
There are two versions: one is for the case where each document is a 
separate corpus (useful if you have many languages or multiple 
translations of the same texts); the other assumes that all the texts 
of each language are in a single file.
Here are the URLs: https://bitbucket.org/rvwfels/paravoz and 
https://bitbucket.org/rvwfels/paravoz2
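
A minimal sketch, not part of ParaVoz, for checking the server-side
prerequisites named above before installing. It assumes that the CWB
query processor is installed as `cqp` and PHP as `php`, and that both
are on the PATH; adjust the names if your installation differs.
JavaScript runs in the browser, so it is not checked here.

```python
import shutil

# Assumed executable names: `cqp` (CWB) and `php`. These are assumptions
# about a typical installation, not something ParaVoz prescribes.
REQUIRED_TOOLS = ("cqp", "php")


def missing_tools(names=REQUIRED_TOOLS, which=shutil.which):
    """Return the names from `names` that cannot be found on the PATH."""
    return [name for name in names if which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All assumed prerequisites were found.")
```

The `which` parameter exists only so that the lookup can be replaced
when testing on a machine without CWB.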
Please write to me in case of any questions,
best,
Ruprecht
On 11.04.2016 at 07:34, Philippe Baudrion wrote:
> Yes, that's right, just tested with a dummy corpus and the output 
> looks the same.
> I guess I will give the ParaSol project a try too.
> Thank you for all your great help, best. Philippe
>
> On 04/11/2016 04:18 PM, Hardie, Andrew wrote:
>>
>> Hannah is correct on both counts.
>>
>> The “free translation” is primarily designed to support the third 
>> line of interlinear-annotated linguistic data (as per 
>> http://www.eva.mpg.de/lingua/resources/glossing-rules.php ).
>>
>> It doesn’t actually use any of CWB/CQP’s inbuilt support for aligned 
>> parallel corpora.
>>
>> But it does allow you to get a translation to display in a 
>> rough-and-ready kind of way.
>>
>> This same trick is used by the team at BFSU (some of whom are on this 
>> list!) for their installation at http://111.200.194.212/cqp/ if I 
>> recall correctly. Here’s the earlier thread: 
>> http://devel.sslmit.unibo.it/pipermail/cwb/2014-February/001563.html
>>
>> best
>>
>> Andrew.
>>
>> From: cwb-bounces at sslmit.unibo.it 
>> [mailto:cwb-bounces at sslmit.unibo.it] On Behalf Of Hannah Kermes
>> Sent: 11 April 2016 14:39
>> To: Philippe.Baudrion at unige.ch; Open source development of the 
>> Corpus WorkBench
>> Subject: Re: [CWB] Parallel corpora in CQPweb
>>
>> In principle yes, but with the "Free translation" you can show only 
>> one XML-attribute at a time (at least as far as I know), and (also as 
>> far as I know) only the Administrator can turn the visualization off 
>> and on.
>>
>> Best
>> Hannah
>>
>> PS: @Andrew: As Stefan told me the other day, it might help to say 
>> something often enough: it would be nice if users could turn such a 
>> free translation on and off.
>>
>> On 11.04.2016 at 15:30, Philippe Baudrion wrote:
>>
>>     and it would also be possible to add a second attribute to store
>>     the italian version :-) for example?
>>     Thank you, Philippe
>>
>>     On 04/11/2016 03:25 PM, Hannah Kermes wrote:
>>
>>         Another possibility - also proposed on the list a while ago,
>>         I forgot who (sorry), but I am still grateful for that info -
>>         is to annotate the aligned sentence or chunk as an XML-attribute.
>>         An example from the GeCCO Corpus (English-German):
>>         <sentence aligned="Die Nationale Energiepolitik von Präsident Bush">
>>         President
>>         Bush
>>         ' s
>>         National
>>         Energy
>>         Policy
>>         </sentence>
>>         You can then display the xml attribute in CQPweb - You just
>>         have to select the respective XML-attribute under "Manage
>>         visualisation-(2)Free translation" with "Concordance only":
>>
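
The XML-attribute approach described above can be sketched as a small
generator for the one-token-per-line input. This is only an
illustration under stated assumptions: the function name is
hypothetical, the element and attribute names are taken from the GeCCO
example, and the XML escaping of the attribute value assumes that the
corpus is later encoded in an XML-aware way. A second attribute (for
instance a hypothetical `aligned_it` for an Italian version) is simply
another entry in the dictionary.

```python
from xml.sax.saxutils import escape


def sentence_to_vrt(tokens, translations, element="sentence"):
    """Render one sentence, one token per line, inside an XML element.

    tokens: list of token strings, written unchanged, one per line.
    translations: dict mapping an attribute name to the aligned text.
    """
    attrs = "".join(
        # Collapse whitespace so the value stays on one line, then escape
        # &, <, > and double quotes (escaping is an assumption, see above).
        ' {}="{}"'.format(name, escape(" ".join(text.split()), {'"': "&quot;"}))
        for name, text in translations.items()
    )
    lines = ["<{}{}>".format(element, attrs)]
    lines.extend(tokens)
    lines.append("</{}>".format(element))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(sentence_to_vrt(
        ["President", "Bush", "'s", "National", "Energy", "Policy"],
        {"aligned": "Die Nationale Energiepolitik von Präsident Bush"},
    ), end="")
```

The output for each sentence pair has the same shape as the GeCCO
example: an opening tag carrying the translation, the tokens, and the
closing tag.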
>>
>>         Best
>>         Hannah
>>
>>         On 11.04.2016 at 15:02, Andres Chandia wrote:
>>
>>             Sorry for the intrusion; I reproduce here an old mail that may help you, Philippe:
>>
>>             Dear List,
>>
>>             I am happy to say we have finally succeeded in publishing the
>>             parallel corpus interface to CWB we have used in ParaSol
>>             (http://parasol.unibe.ch) as an open source project here:
>>             https://bitbucket.org/rvwfels/paravoz .
>>
>>             The package ParaVoz provides a simple, yet effective interface
>>             for a parallel corpus using OpenCWB (http://cwb.sourceforge.net).
>>             It should work on any Linux machine with only minimal changes in
>>             the INI files to reflect paths, and possibly adjustments
>>             concerning language codes.
>>
>>             See the movie on the ParaSol website for ParaVoz in motion
>>             (http://parasol.unibe.ch/ParaSol_demo.mp4) and the bitbucket
>>             site for more information on installation.
>>
>>             Best,
>>             Ruprecht
>>
>>             NB: We have christened the corpus ParaVoz, which means
>>             Locomotive in many Slavic languages; at the same time the root
>>             -voz means 'bring' in Slavic.
>>
>>             No. CQPweb has no support for parallel corpora. It's on the
>>             TODO list...
>>
>>             Andrew.
>>
>>             -----Original Message-----
>>             From: cwb-bounces at sslmit.unibo.it
>>             [mailto:cwb-bounces at sslmit.unibo.it] On Behalf Of Philippe Baudrion
>>             Sent: 11 April 2016 13:58
>>             To: cwb at sslmit.unibo.it
>>             Subject: [CWB] Parallel corpora in CQPweb
>>
>>             Dear all,
>>
>>             following the last post [CWB] sentence-Aligned parallel corpus
>>             in CWB by José Manuel Martínez Martínez chozelinek on Fri Feb 19
>>             18:17:56 CET 2016, we have tried to use the script using
>>             "cwb-align-import". Now, is there a way to install the corpus
>>             and registry files to make them available in CQPweb?
>>
>>             Thank you for your help,
>>             Philippe
>>
>>             --
>>             Baudrion Philippe
>>             Correspondant Informatique
>>             UNIVERSITE DE GENEVE
>>             Faculté de traduction et d'interprétation
>>             40, bd. du Pont d'Arve
>>             1211 GENEVE 4
>>             Tél +41 22 379 94 95
>>
>>             _______________________________________________
>>             CWB mailing list
>>             CWB at sslmit.unibo.it
>>             http://devel.sslmit.unibo.it/mailman/listinfo/cwb
>>
>>
>>
>>
>>             _______________________
>>                         andrés chandía
>>             chandia.net
>>             <http://www.chandia.net><https://twitter.com/andreschandia>
>>             administrador de:
>>             parles.upf <http://parles.upf.edu> | delingua
>>             <http://www.delingua.es> | amind terapia
>>             <http://amindterapia.com> | mapuche koyaktu
>>             <http://koyaktumapuche.net> | mail ong mapuche koyaktu
>>             <http://mail.corporacionkoyaktu.net> | mail psicoaching
>>             <http://mail.psicoaching.net> |
>>             P No imprima innecesariamente. ¡Cuide el medio ambiente!
>>
>>
>>     -- 
>>     Baudrion Philippe
>>     Correspondant Informatique
>>     UNIVERSITE DE GENEVE
>>     Faculté de traduction et d'interprétation
>>     40, bd. du Pont d'Arve
>>     1211 GENEVE 4
>>     Tél +41 22 379 94 95
>
> -- 
> Baudrion Philippe
> Correspondant Informatique
>
> UNIVERSITE DE GENEVE
> Faculté de traduction et d'interprétation
> 40, bd. du Pont d'Arve
> 1211 GENEVE 4
>
> Tél +41 22 379 94 95

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: image/png
Size: 89688 bytes
Desc: not available
URL: <http://devel.sslmit.unibo.it/pipermail/cwb/attachments/20160411/7a76c8e7/attachment-0001.png>

