[CWB] RE: Badly-formatted text ID codes
Eros Zanchetta
eros.zanchetta2 at unibo.it
Mon Jan 16 14:28:13 CET 2012
On Jan 16, 2012, at 2:21 PM, Hardie, Andrew wrote:
> No, because url-encoding allows non-word characters (like % and +); if you rolled your own recoding that avoided those characters, many of the resulting values would still be too long (and truncation might lead to duplicates). You need to add a proper ID attribute.
>
> I can send you the script I wrote to do this in itWaC if you'd like.
Yes, that would be very helpful, thank you!
E
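
For readers of the archive, the point made in the quoted message can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the itWaC script mentioned above: the example URLs, the 20-character limit, the `text_` prefix, and the helper names `is_word_only` and `assign_ids` are all hypothetical. It shows that URL-encoded values still contain non-word characters, that truncating them can produce duplicates, and that assigning a separate sequential ID avoids both problems.

```python
import re
from urllib.parse import quote_plus


def is_word_only(value):
    """Return True if value consists solely of ASCII letters, digits and underscores."""
    return re.fullmatch(r"\w+", value, flags=re.ASCII) is not None


def assign_ids(urls, prefix="text"):
    """Map each URL to a short sequential ID (assumes the URLs are unique).

    The prefix and zero-padded numbering scheme are illustrative assumptions.
    """
    width = len(str(len(urls)))
    return {url: f"{prefix}_{i:0{width}d}" for i, url in enumerate(urls, start=1)}


# Hypothetical source URLs that differ only near the end.
urls = [
    "http://example.org/archive/2011/page one",
    "http://example.org/archive/2011/page two",
]

# URL-encoding keeps characters such as '%' and '+', so the result is not word-only.
encoded = [quote_plus(u) for u in urls]
print([is_word_only(e) for e in encoded])

# Truncating to a fixed length (20 is an arbitrary assumption) can collapse distinct values.
truncated = {e[:20] for e in encoded}
print(len(truncated), "distinct value(s) after truncation")

# A dedicated ID attribute sidesteps both issues.
ids = assign_ids(urls)
print(ids)
```

The original URL can then be kept in a separate attribute, while the short ID serves as the identifier.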