<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">
Hi, before reinventing the wheel, I wanted to ask the CWB list: has anyone already written an encoder script for the XML annotations used in the CLiC group’s <a href="http://clic.ub.edu/corpus" class="">Spanish corpora</a>? This annotation system is also
used in the <a href="https://catalog.ldc.upenn.edu/LDC2018T01" class="">DEFT Spanish treebank</a> and is documented fairly exhaustively in this English-language publication:
<div class="">
<pre style="font-variant-ligatures: normal; orphans: 2; widows: 2; overflow-wrap: break-word; white-space: pre-wrap;" class=""><font face="Helvetica" class="">Soriano, B., O. Borrega, M. Taulé and M.A. Martí (2008) Guidelines,
3LB-WP-02-03, Universitat de Barcelona.
(<a href="http://clic.ub.edu/corpus/webfm_send/17" class="">http://clic.ub.edu/corpus/webfm_send/17</a>) </font></pre>
<div class="">It’s straightforward enough to thresh out the word (“wd”) attributes and morphology as positional attributes,</div>
<div class="">but my ambition is to encode the syntactic annotations as s-attributes as well, along the lines suggested in <a href="http://cwb.sourceforge.net/files/CWB_Encoding_Tutorial/node7.html" class="">the CWB manual</a>.</div>
<div class=""><br class="">
</div>
<div class=""><br class="">
</div>
<div class="">with grateful for any tips you might have,</div>
<div class="">-john</div>
<div class=""><br class="">
</div>
<div class="">
<div><font color="#ffffff" class="">SSN transport rule bypass code: 810-23-2567-984-015</font></div>
</div>
<br class="">
</div>
</body>
</html>