[CWB] Key specification
Konrad Gołuchowski
kodie at mimuw.edu.pl
Sun May 13 16:06:43 CEST 2012
I tried this (in fact I have a <p> tag instead of <s>), but I got the following error:
SHELL CMD '/usr/local/bin/cwb-s-decode -n JRC-EN -S p_file_id' FAILED:
>> Non-zero exit value 1.
>> /usr/local/bin/cwb-s-decode: Can't access s-attribute <JRC-EN.p_file_id>
at /opt/local/bin/cwb-align-import line 229
The attribute name "p_file_id" looks wrong; it should probably be just "file_id".
The first two lines of the alignment file:
jrc-en jrc-pl p id:{file_id}:{num}
id:jrc21970A0720_01-en:2 id:jrc21970A0720_01-pl:2 id:jrc21970A0720_01-pl:3
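To make the intended relation between the header and the data line explicit, here is a minimal Python sketch of how a key specification such as id:{file_id}:{num} would expand into the keys seen in the second line. It only illustrates the template format; expand_key is a hypothetical helper, not part of CWB, and it says nothing about how cwb-align-import resolves the names in braces to s-attributes.

```python
import re

def expand_key(template, values):
    """Replace each {name} placeholder in template with values[name]."""
    return re.sub(r"\{([^{}]+)\}", lambda m: str(values[m.group(1)]), template)

# Attribute values of the first aligned <p> region in the source corpus
# (taken from the sample line of the alignment file).
key = expand_key("id:{file_id}:{num}",
                 {"file_id": "jrc21970A0720_01-en", "num": 2})
print(key)  # id:jrc21970A0720_01-en:2
```

The same template applied to the target side would yield id:jrc21970A0720_01-pl:2 and id:jrc21970A0720_01-pl:3, the two target keys on that line.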
cwb-corpus-info:
============================================================
Corpus: jrc-en
============================================================
description:
registry file: /usr/local/share/cwb/registry/jrc-en
home directory: /Users/kodie/Projects/JRC-CWB/corpus-en/
info file: /Users/kodie/Projects/JRC-CWB/corpus-en/.info
size (tokens): 54258935
1 positional attributes:
word
4 structural attributes:
file file_id p p_num
0 alignment attributes:
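A quick way to compare the key specification with the attribute list above is sketched below in Python. It checks each placeholder both as written and with the grid attribute prepended. The prefixing rule is only an assumption inferred from the error message (which asks for p_file_id); it is not taken from the cwb-align-import source, and check_key_spec is a hypothetical helper.

```python
import re

# Structural attributes reported for jrc-en in the listing above.
S_ATTRIBUTES = {"file", "file_id", "p", "p_num"}

def check_key_spec(key_spec, grid, s_attributes):
    """For each {name} in key_spec, report whether it exists as written
    and whether it exists with the grid attribute prefixed (ASSUMED rule)."""
    report = {}
    for name in re.findall(r"\{([^{}]+)\}", key_spec):
        report[name] = {
            "as_written": name in s_attributes,
            "with_grid_prefix": f"{grid}_{name}" in s_attributes,
        }
    return report

report = check_key_spec("id:{file_id}:{num}", "p", S_ATTRIBUTES)
for name, found in report.items():
    print(name, found)
```

Under that assumption, {num} resolves to the existing attribute p_num, while {file_id} would be looked up as p_file_id, which the corpus does not have. This is consistent with the failing cwb-s-decode call shown earlier.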
Best
Konrad
On Sun, May 13, 2012 at 3:45 PM, Stefan Evert <stefanML at collocations.de> wrote:
>> I have a problem with key specification for cwb-align-import tool. I
>> have corpus with two structural tags. Sample file might look like
>> this:
>> <file id="filename">
>> <s num="1">
>> Hello
>> World
>> !
>> </s>
>> </file>
>>
>> I'd like to use a key consisting of both the file id and the s num.
>
> If your alignment file uses keys e.g. of the form "filename_1" (for the sentence above), then the key specification in the first line of the alignment file should read
>
> SOURCE_CORPUS TARGET_CORPUS s {file_id}_{num}
>
> A very similar example can be found in the online help ("perldoc cwb-align-import"), at least if you have installed a recent version of the tool from the SVN repository.
>
> Cheers,
> Stefan