[Lexicog] Spellchecking Unicode in MS-Office

Hayim Sheynin hayim.sheynin at GMAIL.COM
Fri Nov 14 16:56:49 UTC 2008


Dear Jan,

Would you consider changing the grapheme to one of the following?

Ĥ  Latin capital letter H with circumflex (Unicode Latin Extended-A, U+0124)
ĥ  Latin small letter h with circumflex (Unicode Latin Extended-A, U+0125)

or

Ħ  Latin capital letter H with stroke (Unicode Latin Extended-A, U+0126)
ħ  Latin small letter h with stroke (Unicode Latin Extended-A, U+0127)

If you haven't used these characters yet, I think it is worth trying them.
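
To try this on a word list before touching any documents, a small script could do the substitution. What follows is only a sketch in Python using the standard unicodedata module; the helper names (describe, replace_h_caron) and the test string are made up for illustration, and it says nothing about how MS-Word itself will treat the replacement letters, which would still have to be tested in Word.

```python
import unicodedata

# Candidate replacement pairs (upper, lower) for H/h with caron (U+021E / U+021F).
CANDIDATES = {
    "circumflex": ("\u0124", "\u0125"),  # Ĥ ĥ
    "stroke": ("\u0126", "\u0127"),      # Ħ ħ
}


def describe(ch):
    """Return (code point, Unicode name, general category) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))


def replace_h_caron(text, variant="circumflex"):
    """Swap Ȟ/ȟ for the chosen candidate pair (hypothetical helper)."""
    upper, lower = CANDIDATES[variant]
    # NFC first, so that "h" + combining caron (U+030C) becomes U+021F.
    text = unicodedata.normalize("NFC", text)
    return text.translate({0x021E: upper, 0x021F: lower})


if __name__ == "__main__":
    # Unicode's own data for the three lowercase letters.
    for ch in ("\u021f", "\u0125", "\u0127"):
        print(describe(ch))
    # "\u021fa" is an arbitrary test string, not taken from the real word list.
    print(replace_h_caron("\u021fa"))
    print(replace_h_caron("\u021fa", variant="stroke"))
```

Unicode's general category for each of these letters, h-caron included, is "Ll" (lowercase letter). That suggests the word-boundary behaviour comes from the application rather than from Unicode, though the script cannot prove it.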

Best of luck,

Hayim Sheynin

On Fri, Nov 14, 2008 at 6:05 AM, Jan F. Ullrich <jfu at centrum.cz> wrote:
>
>
> Dear lexicographers,
>
>
>
> I wonder if someone here could advise us on the following problem. Years
> ago, before the full development of Unicode and Unicode fonts we created our
> own fonts for the Lakota language. In those fonts the needed characters were
> set to codes of characters not present in the language. For instance we used
> umlaut vowels. This solution was imperfect, but among other things it
> allowed us to take advantage of Microsoft Word's spellchecking function. We
> simply created a list of words and word forms and inserted them into a custom
> dictionary for MS-Word. Of course, this was a simplistic spellchecker, one
> that could not cover all the word forms of the language, but at that time it
> actually represented a helpful tool for our students and language teachers.
>
>
>
> A few years ago we transferred all of our textual materials into Unicode and
> we also programmed a powerful morphoparser and lemmatizer that help us
> create a quite comprehensive list of word forms. But we are having a problem
> using this list for spell-checking in MS-Office, because one of the Unicode
> characters that we use is not recognized by MS-Word. It is the Latin letter
> h with caron (U+021F)
> (http://www.fileformat.info/info/unicode/char/021f/index.htm). MS-Word won't
> consider this letter a part of the word no matter what we do and this
> disables using the spellchecking functions in that editor. It causes
> problems in other ways too, for instance in searching for words within a
> document, in various formatting operations etc.
>
> We found out that the character is recognized when we associate the text
> with a different language for spellchecking, for instance French, but then
> other characters are not recognized. If we keep the text assigned to English
> spellchecking (which is desired) then it is only h-caron that is not
> recognized.
>
>
>
> I do not know enough about Unicode to figure out how to solve this. For
> instance I don't know if the character recognition within MS-Word is a feature
> of MS-Word or of Unicode. A couple of years back we contacted Microsoft about
> this but we received no response.
>
> I am aware that there are other options for spell-checking a text, but since
> MS-Word is such a mainstream editor used in most schools and colleges
> where the Lakota language is taught and used, it would be really nice if we
> could make the spellchecking function work in it.
>
>
>
> We would really appreciate any advice on how to solve this.
>
>
>
>
>
> Jan
>
>
>
>
>
> Jan F. Ullrich, Linguistic Director
>
> Lakota Language Consortium
>
> www.lakhota.org
>
> e-mail: jfu at lakhota.org
>
> Skype: janfull
>
>
>
>
>
> 
