[CWB] Efficient way to count frequencies on large data
Sébastien Jacquot
sebastien.jacquot at univ-fcomte.fr
Fri Dec 18 14:37:08 CET 2015
Hi,
I'm looking for an efficient way to get the frequencies of repeated
token sequences on large corpora.
At the moment I use:
R = ([][][][]);
count R by word cut 20;
Is there a faster way to do this in terms of performance? (I mean, for
example, by grouping and counting the matches directly rather than
first collecting all the results and then counting them.)
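To illustrate what I mean by grouping and counting directly, here is a
minimal Python sketch, not CQP. It assumes the corpus is available as a
plain stream of tokens; the names count_ngrams and frequent and the sample
tokens are hypothetical and only serve the illustration. Each 4-token
sequence is counted in a single pass, without first storing the full list
of matches, and a minimum-frequency threshold is applied at the end (I
assume this is roughly the role of "cut 20" above).

```python
from collections import Counter, deque


def count_ngrams(tokens, n=4):
    """Count every run of n consecutive tokens in a single pass."""
    window = deque(maxlen=n)  # sliding window over the last n tokens
    counts = Counter()
    for tok in tokens:
        window.append(tok)
        if len(window) == n:
            # Group and count immediately; no list of matches is kept.
            counts[tuple(window)] += 1
    return counts


def frequent(counts, cut):
    """Keep sequences seen at least `cut` times, most frequent first."""
    return [(seq, f) for seq, f in counts.most_common() if f >= cut]


# Hypothetical sample data standing in for a real token stream.
tokens = "a b c d a b c d a b c d".split()
counts = count_ngrams(tokens, n=4)
top = frequent(counts, cut=3)
```

Memory use here grows with the number of distinct sequences rather than
with the number of matches, which is the behaviour I am hoping to get.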
Thanks in advance.
Sebastian