[CWB] Line returns, etc.
Graham Ranger -- UAPV
graham.ranger at univ-avignon.fr
Mon Sep 25 15:35:23 CEST 2017
Hello to all,
When formatting a plain-text corpus for viewing in CQPweb, I like to
remove unwanted line breaks that result from the use of OCR software.
This can be slightly awkward to implement and I was wondering whether it
is strictly necessary. Will search results differ depending on whether
line breaks are present, or is removing them a waste of energy? I
imagine it is probably a question of whether the processing is line- or
stream-based... but I'd appreciate some expert opinion!
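
For reference, here is a minimal sketch in Python of the kind of clean-up I mean. It is only an illustration under some assumptions: that blank lines separate paragraphs, and that a hyphen at the end of a line followed by a lowercase letter marks a word split by the page layout (this will occasionally join a genuinely hyphenated compound). The function name unwrap_ocr_text is made up for the example.

```python
import re


def unwrap_ocr_text(text: str) -> str:
    """Join hard-wrapped lines inside each paragraph into a single line."""
    # Normalise Windows and old Mac line endings to "\n" first.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Assumption: one or more blank lines separate paragraphs.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    unwrapped = []
    for para in paragraphs:
        # Assumption: "-" right before a line break, followed by a
        # lowercase letter, is a word split across two lines.
        para = re.sub(r"(\w)-\n\s*([a-z])", r"\1\2", para)
        # Every remaining line break inside the paragraph becomes one space.
        para = re.sub(r"\s*\n\s*", " ", para)
        unwrapped.append(para.strip())
    # Keep paragraph boundaries as a single blank line.
    return "\n\n".join(unwrapped)


if __name__ == "__main__":
    sample = "The quick brown\nfox jumps over the la-\nzy dog.\n\nSecond para-\ngraph here.\n"
    print(unwrap_ocr_text(sample))
```

Paragraph breaks are preserved, so only the line breaks introduced by the OCR layout are removed.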
Thanks in advance.
Best,
Graham.