[CWB] Highlighting in alignment

Georg Jaehnig georg at jaehnig.org
Mon Jul 26 18:20:57 CEST 2010


Hi,

On Sat, Jul 17, 2010 at 02:40, Stefan Evert <stefanML at collocations.de> wrote:
>> I'm using the EUROPARL Web interface installed here:
>> http://bramaputra.ling.uni-potsdam.de/~jaehnig/CQP/Europarl/frames-cqp.html
>>
>> It allows me to do a search like this:
>>
>> "Diskussion" : EXAMPLE-NL "discussie"
>>
>> Now, is there a way to highlight "discussie" in the results - as
>> "Diskussion" is highlighted?
>
> Unfortunately, the answer is no.
>
> The implementation of alignment queries in CQP is rather half-hearted, so there is no way to identify and highlight the matching token in the target language (which would arguably be difficult for negative alignment constraints :).
>
> In principle it would be possible (though quite expensive) to simulate this in the Web GUI -- execute the query, identify the aligned regions in the target corpus, and then run the aligned part of the query on these aligned regions -- but this functionality isn't offered by the current version of the Europarl Web interface and would require quite some work to implement.

I am thinking about implementing this in the Web GUI.

So, still assuming my search is

    "Diskussion" : EXAMPLE-NL "discussie"

which gave me several sentences in German and Dutch containing
"Diskussion" and "discussie".

Now, for every sentence, this

    my ( $t_s, $t_e ) = My::KWIC::Translate( $s, $e, $lang );

gives me the IDs of the first and last word of the aligned Dutch
sentence (in $t_s and $t_e).

How can I now run a query for "discussie" that only looks within
this single Dutch sentence, i.e. between $t_s and $t_e?

As far as I could find out, the function that runs a query is

    $cache->query(-corpus => $corpus, -query => $query,
                  [-subquery => $subquery,] [-sort => $sort_clause,]
                  [-keyword => $keyword_command,] [-cut => $max_matches]);

where I don't see a way to restrict it to a certain range of word IDs.
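
The only fallback I can think of is to run the "discussie" query over
the whole Dutch corpus and throw away every hit outside $t_s..$t_e
afterwards. A rough sketch of that filtering step follows. Note that
matches_in_range is a hypothetical helper of my own, not part of the
existing interface, and I am assuming each hit is available as a
[start, end] pair of word IDs; the numbers are made up.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the Web GUI code): keep only those
# matches that lie entirely inside the range $t_s .. $t_e.
# Assumption: each match is an array ref [ $start, $end ] of word IDs.
sub matches_in_range {
    my ( $t_s, $t_e, @matches ) = @_;
    return grep { $_->[0] >= $t_s && $_->[1] <= $t_e } @matches;
}

# Made-up word IDs, for illustration only.
my @hits = ( [ 95, 95 ], [ 120, 120 ], [ 131, 132 ], [ 150, 150 ] );

# Suppose the aligned Dutch sentence spans word IDs 118 to 140.
my @in_sentence = matches_in_range( 118, 140, @hits );

print join( ", ", map { "$_->[0]-$_->[1]" } @in_sentence ), "\n";
```

This would obviously be wasteful, since the query runs over the whole
corpus once per search only to discard most hits, so I would prefer a
way to restrict the query itself.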

-- 
Georg | http://serchilo.net - command the web

