[CWB] Structural attributes as feature sets?
Jose Manuel Martinez Martinez
jmmtra at gmail.com
Mon Apr 22 17:41:09 CEST 2013
Hi, Stefan,
thank you for the answer and the tips!
Just one more question: if I have subjects who don't know any foreign
language, should I write
foreign_languages="||"
I assume, by analogy, that this would be the way to do it.
Or would it be enough to write
foreign_languages=""
Best,
jmm
On 21/04/13 13:39, Stefan Evert wrote:
>> I'm reading the corpus encoding tutorial, and in section 5 I've found interesting material about feature sets for positional attributes. I am wondering whether it would be possible to use this feature with structural attributes as well.
> Yes.
>
>> Say that in my corpus I've collected information about the speakers, and some of them can speak more than one foreign language. I would like to have a structural attribute like
>>
>> foreign_languages="ES|PT|IT"
>>
>> for each text produced by that particular speaker.
> Simply encode them in feature set format as you would for positional attributes. In your case, you need to add leading and trailing "|" separators as specified in the tutorial, e.g.
>
> <speaker foreign_languages="|ES|PT|IT|">
> ...
> </speaker>
>
> and declare the foreign_languages XML attribute to be set valued (cf. "cwb-encode -h"):
>
> cwb-encode .... -S speaker:0+foreign_languages/
>
> (the trailing slash marks foreign_languages as a set-valued attribute). cwb-encode will validate the set notation of attribute values and re-order the set elements alphabetically (keep in mind that sets are unordered, so you cannot specify first, second and third foreign language in this way).
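
The set notation described above can be sketched in a few lines of Python, for example to generate the attribute values from speaker metadata before encoding. This is only an illustration and not part of CWB: the helper name to_feature_set is hypothetical, Python's default string sort stands in for the alphabetical re-ordering, and writing the empty set as a single "|" is an assumption that should be checked against the tutorial and against what cwb-encode accepts.

```python
def to_feature_set(values, empty="|"):
    """Format an iterable of strings in feature set notation, e.g. |ES|IT|PT|.

    `empty` is the string returned for an empty set; the default "|" is an
    assumption, not something confirmed in this thread.
    """
    # Drop duplicates and sort, mirroring the alphabetical re-ordering
    # mentioned above (sets are unordered, so no information is lost).
    items = sorted(set(values))
    for item in items:
        # An element cannot be empty or contain the separator itself.
        if not item or "|" in item:
            raise ValueError(f"invalid set element: {item!r}")
    if not items:
        return empty
    # Leading and trailing "|" are part of the notation.
    return "|" + "|".join(items) + "|"


print(to_feature_set(["ES", "PT", "IT"]))  # |ES|IT|PT|
```

The result would then go into the XML attribute, e.g. foreign_languages="|ES|IT|PT|".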
>
> You will then be able to restrict searches to speakers who know Portuguese e.g. with a global constraint such as
>
> ... query ... :: match.speaker_foreign_languages contains "PT";
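
The effect of such a "contains" constraint can be pictured as a membership test on the stored set. The Python sketch below is a simplification under stated assumptions: the helper names parse_feature_set and contains are hypothetical, and elements are compared as exact strings, whereas CQP matches the quoted string as a pattern.

```python
def parse_feature_set(value):
    """Turn a feature set string such as |ES|IT|PT| into a Python set."""
    # Remove the leading and trailing separators, then split on the inner ones.
    inner = value.strip("|")
    return set(inner.split("|")) if inner else set()


def contains(value, element):
    """Return True if `element` is a member of the feature set `value`."""
    return element in parse_feature_set(value)


print(contains("|ES|IT|PT|", "PT"))  # True
print(contains("|ES|IT|", "PT"))     # False
```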
>
> Hope this helps,
> Stefan
>
>
> _______________________________________________
> CWB mailing list
> CWB at sslmit.unibo.it
> http://devel.sslmit.unibo.it/mailman/listinfo/cwb