[Corpora-List] Q: Hyphenation removal
    Angus Grieve-Smith 
    grvsmth at panix.com
       
    Fri Aug 17 15:39:28 UTC 2012
    
    
  
On 8/16/2012 7:37 AM, Roland Schäfer wrote:
> are there any tools to remove hard-coded "hyphe- nation" from texts (or
> papers describing principled solutions to the problem)?
     I'm sure that there's something out there and that someone on the 
list will know where to find it.
     I don't know about German, but in English there is significant 
ambiguity.  There are many instances where a hyphen is optional. 
Fortunately for your purpose, I believe that the differences in meaning 
are small enough that in those cases you could probably remove all the 
hyphens.  Some hyphens are even typographically motivated: 
"antiinflammatory" exists but is used less often than 
"anti-inflammatory," because people seem to be uncomfortable writing two 
"i"s in a row in the middle of an English word.
     Maybe someone with more experience in this area can elaborate.
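     In the meantime, here is a rough sketch of the simplest approach I 
can think of, not a tested tool.  It joins fragments that are split by a 
hyphen followed by whitespace, as in "hyphe- nation."  The names 
dehyphenate, BROKEN and vocabulary are my own inventions for illustration, 
and the optional word list is an assumption: if you pass one, a joined form 
that is not in the list keeps its hyphen and only loses the whitespace.

```python
import re

# A word fragment, a hyphen, some whitespace (including a line break), and a
# continuation starting with a lowercase letter, e.g. "hyphe- nation".
# Ordinary hyphenated words such as "hard-coded" have no whitespace after the
# hyphen, so they are left alone.
BROKEN = re.compile(r"(\w+)-\s+([a-zäöüß]\w*)")


def dehyphenate(text, vocabulary=None):
    """Join fragments split by a hard-coded hyphen plus whitespace.

    vocabulary is an optional set of lowercase known words (hypothetical;
    supply your own).  Without it, every match is joined with no hyphen.
    """
    def join(match):
        left, right = match.group(1), match.group(2)
        joined = left + right
        if vocabulary is None or joined.lower() in vocabulary:
            return joined
        # Unknown joined form: keep the hyphen, drop only the whitespace.
        return left + "-" + right

    return BROKEN.sub(join, text)


if __name__ == "__main__":
    print(dehyphenate('hard-coded "hyphe- nation" from texts'))
    print(dehyphenate("a well- known hyphe- nation", {"hyphenation"}))
```

     This does nothing clever about the ambiguous cases I mentioned, and 
it will wrongly join a hyphen that is meant to be followed by a space, as 
in "pre- and post-war," so treat it as a starting point only.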
-- 
				-Angus B. Grieve-Smith
				grvsmth at panix.com
    
    