[CWB] Counting tokens, types and segments
Stefan Evert
stefanML at collocations.de
Sun Dec 12 23:00:43 CET 2010
Also, use "cwb-describe-corpus -s" to get type/token counts for all attributes, or "cwb-lexdecode -S" if you only want to know the token size of a corpus.
You can easily get the information from CWB::CL, of course.
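As a rough sketch, the two command-line calls above could be wrapped in a small shell function. The function name cwb_counts and the corpus ID DEMO are made up for illustration; this assumes the CWB tools are on your PATH and the corpus is listed in the default registry.

```shell
#!/bin/sh
# Hypothetical helper (not part of CWB): print size statistics for one corpus.
# Assumes the CWB command-line tools are on PATH and the corpus is
# listed in the default registry.
cwb_counts() {
    corpus=$1
    # size statistics (tokens and types) for all attributes of the corpus
    cwb-describe-corpus -s "$corpus"
    # only the lexicon size of the "word" attribute
    cwb-lexdecode -S -P word "$corpus"
}

# Example call (DEMO is a placeholder corpus ID):
# cwb_counts DEMO
```

Corpus IDs are normally written in upper case on the command line; replace DEMO with your own.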
Best,
Stefan
On 12 Dec 2010, at 22:42, Alberto Simões wrote:
> On 12/12/2010 21:39, Alberto Simões wrote:
>> Hello.
>>
>> I am trying to count, using CWB, the number of tokens, types and
>> segments (annotations of "tu" type).
>>
>> For the first, I am using the size of A = [];
>>
>> For the second, I can run: group A matchend word
>> but that doesn't show me the total number of types.
>>
>> For the last, no idea how to do it... yet.
>
> This one is easy as well:
> A = <tu> [];
> size A;
>
> Now, missing one :D
>
> Thanks
> Alberto