[Sigwac] Gold standard data set corrupted - missing 103.txt

Tom Morris tfmorris at gmail.com
Thu Mar 3 19:47:53 CET 2016


Does anyone have the original 103.txt that is supposed to be in the Gold
Standard data set (http://cleaneval.sigwac.org.uk/GoldStandard.tar.gz)?

The current 103.txt is, despite its name, actually a tar file made up of
all the other files. My guess is that someone typed:

    $ tar cvf *.txt

and the shell expanded that to

    $ tar cvf 103.txt 104.txt 105.txt ...

overwriting the original contents of the file with the tar containing all
the other files.
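
The following is a minimal sketch of how this could happen, not a record of
what was actually run. It uses three placeholder files in a scratch
directory; the file contents and the temporary directory are assumptions for
illustration and are not taken from the real data set. The `v` flag is left
out only to keep the output short.

```shell
# Work in a throwaway directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir"

# Create three placeholder files standing in for the data set.
for n in 103 104 105; do
    printf 'contents of %s\n' "$n" > "$n.txt"
done

# The archive name after -f is missing, so the shell expands the glob and
# tar receives: tar cf 103.txt 104.txt 105.txt
# The first match, 103.txt, is taken as the archive to write.
tar cf *.txt

# 103.txt is now a tar archive; list its members to confirm.
members=$(tar tf 103.txt)
printf '%s\n' "$members"
```

Naming the archive explicitly, for example `tar cvf GoldStandard.tar *.txt`,
avoids the problem, because the first glob match is then treated as an input
file rather than as the output archive.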

If a corrected version of GoldStandard.tar.gz could be made available, that
would be great.

Best regards,
Tom Morris

