[Sigwac] Call for discussion: The SIGWAC crisis (instead of an announcement of WAC-XI)

Eros Zanchetta eros at sslmit.unibo.it
Tue Aug 1 13:34:46 CEST 2017


Hi there,

I'm currently the main (well, the only) developer of BootCaT and I've 
been experimenting with Yacy.

On 01/08/2017 10:26, Miloš Jakubíček wrote:
> As for Yacy -- technologically I like that, but I'm a bit afraid at the
> moment its index might be too small and therefore too biased --

Yes, the index is definitely very small, crawling takes a long time, and 
queries take a bit longer than on commercial search engines. But you do 
get some results, so I think it's worth exploring the option.

> this might
> be a builtin issue: if a big corporation starts indexing their data with
> Yacy, will it be able to skew the results?

That could happen, but I'm not sure that's a problem (at least as far as 
BootCaT is concerned), for two reasons:

1) I'm not sure that big corporations would be interested in doing that 
(Yacy AFAIK has basically no users, who would want to invest time to 
game the results?)

2) BootCaT's approach of using tuples of specific terms restricts the 
results so much that most of the time you want *all* the results you can 
get, so ranking becomes somewhat less relevant
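
To illustrate point 2, here is a minimal sketch of the tuple idea. It is 
not BootCaT's actual code: the function names, the seed terms and the 
parameter defaults are all made up for illustration. It draws a few 
random combinations of seed terms and turns each one into a query string. 
Assuming the search engine requires every term in a query to match, each 
query can only match pages that contain all of its terms, which is why 
the result sets tend to be small.

```python
import itertools
import random


def build_tuples(seeds, tuple_length=3, n_tuples=10, rng_seed=0):
    """Return up to n_tuples distinct random combinations of seed terms.

    Hypothetical helper for illustration only.
    """
    # Enumerate every unordered combination of the de-duplicated seeds.
    combos = list(itertools.combinations(sorted(set(seeds)), tuple_length))
    # Shuffle with a fixed seed so the selection is reproducible.
    rng = random.Random(rng_seed)
    rng.shuffle(combos)
    return combos[:n_tuples]


def to_query(term_tuple):
    """Join the terms into one query string, quoting multi-word terms."""
    return " ".join(f'"{t}"' if " " in t else t for t in term_tuple)


# Made-up seed terms; a real run would use terms from the target domain.
seeds = ["corpus", "crawler", "tokenizer", "web as corpus", "lemma"]
tuples = build_tuples(seeds, tuple_length=3, n_tuples=4)
queries = [to_query(t) for t in tuples]

for q in queries:
    print(q)
```

Each printed line is one query to send to the search engine. With five 
seeds and tuples of length three there are only ten possible 
combinations, so asking for more than ten would simply return all of them.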

Best,
Eros

