[Corpora-List] Q: Hyphenation removal
Angus Grieve-Smith
grvsmth at panix.com
Fri Aug 17 15:39:28 UTC 2012
On 8/16/2012 7:37 AM, Roland Schäfer wrote:
> are there any tools to remove hard-coded "hyphe- nation" from texts (or
> papers describing principled solutions to the problem).
I'm sure that there's something out there and that someone on the
list will know where to find it.
I don't know about German, but in English there is significant
ambiguity. There are many instances where a hyphen is optional.
Fortunately for your purpose, I believe that the differences in meaning
are small enough that in those cases you could probably remove all the
hyphens. Some are even typographically motivated, such as
"antiinflammatory," which exists but is used less often than
"anti-inflammatory" because people seem to be uncomfortable writing two
"i"s in the middle of a word in English.
Maybe someone with more experience in this area can elaborate.
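In the meantime, here is a rough sketch of how the "optional hyphen"
problem could be handled with a word list. It is only an illustration,
not a tested tool: the names BROKEN and dehyphenate are made up for this
message, and "lexicon" is assumed to be a set of lowercase word forms you
already have for your language. A broken word is rejoined only when the
joined form is in the lexicon; if only the hyphenated form is known, the
hyphen is kept; anything else is left untouched.

```python
import re

# A run of letters, a hyphen, whitespace, then another run of letters,
# e.g. "hyphe- nation". [^\W\d_] matches any Unicode letter, so umlauts
# are covered as well.
BROKEN = re.compile(r"\b([^\W\d_]+)-\s+([^\W\d_]+)\b")


def dehyphenate(text, lexicon):
    """Repair hard-coded hyphenation, guided by a set of known word forms.

    `lexicon` is an assumed input: a set of lowercase forms, which may
    include hyphenated entries such as "anti-inflammatory".
    """
    def fix(match):
        left, right = match.group(1), match.group(2)
        joined = left + right
        hyphenated = left + "-" + right
        if joined.lower() in lexicon:
            return joined        # the hyphen was only a line-break artifact
        if hyphenated.lower() in lexicon:
            return hyphenated    # the hyphen belongs to the word; drop the gap
        return match.group(0)    # unknown form: leave the text as it was

    return BROKEN.sub(fix, text)
```

With a lexicon containing "hyphenation" and "anti-inflammatory", this
turns "hyphe- nation" into "hyphenation" and "anti- inflammatory" into
"anti-inflammatory", and it leaves a suspended hyphen such as "pre- and
post-war" alone because neither "preand" nor "pre-and" is a known form.
When both the joined and the hyphenated form are in the lexicon, the
sketch prefers the joined one, which matches my guess above that dropping
the optional hyphens is usually harmless.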
--
-Angus B. Grieve-Smith
grvsmth at panix.com