[Sigwac] Call for discussion: The SIGWAC crisis (instead of an announcement of WAC-XI)

Miloš Jakubíček milos.jakubicek at sketchengine.co.uk
Wed Aug 2 09:45:30 CEST 2017


Hi Roland,

ah - thanks for explaining, sorry I did not get it at first.
It makes total sense now. And yes, we probably all agree we need more than
an index.
In fact, even Google provides more than an index ("Google Archive").

Best
Milos

Milos Jakubicek

CEO, Lexical Computing
Brno, CZ | Brighton UK
http://www.lexicalcomputing.com
http://www.sketchengine.co.uk

On 1 August 2017 at 13:02, Roland Schäfer <roland.schaefer at fu-berlin.de>
wrote:

> Hi Miloš,
>
> On 01.08.17 12:22, Miloš Jakubíček wrote:
> > Hi Roland,
> >
> > Sorry I do not follow - what do you mean by index here, can you please
> > explain?
> >
>
> oh, sorry for being imprecise. In my understanding, a web search engine
> (or "web index") does not provide the full text data, but just
> provides... well, an index. Maybe I did not check thoroughly enough what
> the Yacy initiative was about, but a "linguist's search engine" would
> not provide the actual textual data according to my definition, but it
> would just provide links to websites and maybe some aggregated data. In
> other words, a search engine is not a copy of the data PLUS an index
> (which is what you have in CWB or NoSkE) but just the index. This
> creates a problem because the data themselves are not curated.
>
> >> 3. More importantly, indices do not lead to reproducible results (which
> >> was AFAIR one of Adam's main points in his seminal paper). Under the
> >> current guidelines of the German Research Council (DFG, the main
> >> third-party funding agency in DE) on textual resources, for example,
> >> mentioning the planned use of results obtained from web data using an
> >> index in a grant application should theoretically stand in the way of
> >> approving the grant.
> >>
> >
> > I think I still don't get what the indices stand for here, probably not
> > an index as in computer/database terms?
> > (At least I don't understand why that would stand in the way of any
> > funding...)
>
> Is it better now? Maybe the problem here is more the definition of the
> term "linguist's search engine", which basically triggered my doubts. I
> think I have a solid understanding of how DB indices work, at least for
> a mere linguist.
>
> Best,
> Roland
> _______________________________________________
> Sigwac mailing list
> Sigwac at sslmit.unibo.it
> http://liste.sslmit.unibo.it/mailman/listinfo/sigwac
>

