[CWB] Structural attributes as feature sets?

Jose Manuel Martinez Martinez jmmtra at gmail.com
Mon Apr 22 17:41:09 CEST 2013


Hi, Stefan,

thank you for the answer and the tips!

Just one more question: if I have subjects who don't know any foreign 
language, should I write

     foreign_languages="||"

I assume, by analogy, that this would be the way to do it.

Or would it be enough to write

     foreign_languages=""

instead?

Best,

jmm

On 21/04/13 13:39, Stefan Evert wrote:
>> I'm reading the corpus encoding tutorial, and in section 5 I've found interesting material about feature sets for positional attributes. I am wondering whether it would be possible to use this feature with structural attributes as well.
> Yes.
>
>> Say that in my corpus I've collected information about the speakers, and some of them can speak more than one foreign language. I would like to have a structural attribute like
>>
>> foreign_languages="ES|PT|IT"
>>
>> for each text produced by that particular speaker.
> Simply encode them in feature set format as you would for positional attributes.  In your case, you need to add leading and trailing "|" separators as specified in the tutorial, e.g.
>
> 	<speaker foreign_languages="|ES|PT|IT|">
> 	...
> 	</speaker>
>
> and declare the foreign_languages XML attribute to be set valued (cf. "cwb-encode -h"):
>
> 	cwb-encode .... -S speaker:0+foreign_languages/
>
> (the trailing slash marks foreign_languages as a set-valued attribute).  cwb-encode will validate the set notation of attribute values and re-order the set elements alphabetically (keep in mind that sets are unordered, so you cannot specify first, second and third foreign language in this way).
>
> You will then be able to restrict searches to speakers who know Portuguese e.g. with a global constraint such as
>
> 	... query ... :: match.speaker_foreign_languages contains "PT";
>
> Hope this helps,
> Stefan
>
>
> _______________________________________________
> CWB mailing list
> CWB at sslmit.unibo.it
> http://devel.sslmit.unibo.it/mailman/listinfo/cwb
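
For reference, the set notation described in the quoted reply (leading and trailing "|", elements in alphabetical order) could be generated with something like the following. This is only a minimal sketch: the helper name to_feature_set is hypothetical and not part of CWB, Python's default string ordering is assumed to be close enough to the ordering cwb-encode applies, and the empty set is deliberately rejected because its notation is the open question above.

```python
def to_feature_set(values):
    """Format an iterable of strings in feature-set notation, e.g. |ES|IT|PT|.

    Hypothetical helper for preparing attribute values before encoding;
    it is not part of the CWB tools.
    """
    # Drop blanks and duplicates, then sort (sets are unordered anyway).
    items = sorted({v.strip() for v in values if v.strip()})
    if not items:
        # How to write the empty set is not settled in this thread.
        raise ValueError("empty set: notation not covered by this sketch")
    for item in items:
        if "|" in item:
            raise ValueError("set element must not contain '|': %r" % item)
    # Leading and trailing separators, as required by the quoted reply.
    return "|" + "|".join(items) + "|"


print(to_feature_set(["ES", "PT", "IT"]))  # |ES|IT|PT|
```

The result would then go into the XML attribute, e.g. foreign_languages="|ES|IT|PT|".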
