[CWB] Dramatic texts in cqpweb

Blätte, Andreas andreas.blaette at uni-due.de
Mon Nov 6 12:07:58 CET 2017

Dear Graham,

As discussed in our bilateral communication, I have now put the GermaParl corpus of German parliamentary debates in a GitHub repository that you’ll find here:

A CWB-indexed version of the corpus is available, though wrapped in an R data package. Please note that the linguistically annotated and indexed data is not yet exactly the same as the data in the repo with the TEI files.

To deal with the challenges you describe, I found it useful to develop an R wrapper for CQP, a package called polmineR (see github.com/PolMine/polmineR). To cut out interjections, or to explore interjections specifically, it is possible to create partitions/subcorpora using the structural annotation of the corpus.
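A minimal sketch of what this could look like, under several assumptions that should be checked against the installed data: that the data package registers a corpus named "GERMAPARL", that it carries a structural attribute called "interjection", and that this attribute takes the values "TRUE" and "FALSE". The function names follow more recent polmineR releases and may differ in older versions.

```r
library(polmineR)

# Assumption: the GermaParl data package is installed and its corpus
# is registered under the ID "GERMAPARL".
use("GermaParl")

# Inspect the structural annotation first; the attribute name
# "interjection" and its values are assumptions to verify here.
s_attributes("GERMAPARL")
s_attributes("GERMAPARL", "interjection")

# Subcorpus without interjections (speech proper only).
speeches <- partition("GERMAPARL", interjection = "FALSE")

# Subcorpus consisting of interjections only.
interjections <- partition("GERMAPARL", interjection = "TRUE")

# Compare the sizes (token counts) of the two partitions.
size(speeches)
size(interjections)
```

The same pattern should carry over to dramatic texts, provided stage directions or speaker turns are encoded as structural attributes of the corpus.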

I hope the package documentation has reached a stage that addresses most standard questions, but I would be happy to learn about the information that is missing.

Kind regards

On 06.11.17, 11:11, "cwb-bounces at sslmit.unibo.it on behalf of Graham Ranger -- UAPV" <cwb-bounces at sslmit.unibo.it on behalf of graham.ranger at univ-avignon.fr> wrote:

