UNICODE UPDATING

Constantine Chmielnicki wablenica at mail.ru
Tue Sep 13 04:54:05 UTC 2005


Hello Jimm,

Was the problem an incompatible storage medium (diskette, etc.) or an unusual character encoding?
If the former, I'm afraid the problem is hard to fix: you would have to find a Tandy system that supports both the proprietary and the IBM PC compatible formats.
If the latter, the problem can be solved with character conversion, provided the coding can be deciphered, for example if the texts are a mixture of ASCII codes (plain English ABC letters) and some nonstandard codes for letters with diacritics.
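
To decipher such a coding, a first step could be to list which non-ASCII byte values occur in a sample file and where. The following is only a minimal Python sketch, not part of CC; the function name non_ascii_report is made up for illustration, and nothing is assumed about the actual Tandy files.

```python
from collections import Counter

def non_ascii_report(data: bytes, context: int = 8):
    """Count bytes outside plain ASCII and keep one example of context for each.

    Returns (counts, examples): counts maps byte value -> number of occurrences,
    examples maps byte value -> a short surrounding snippet, with every
    non-ASCII byte shown as U+FFFD.
    """
    counts = Counter(b for b in data if b >= 0x80)
    examples = {}
    for i, b in enumerate(data):
        if b >= 0x80 and b not in examples:
            chunk = data[max(0, i - context): i + context + 1]
            examples[b] = chunk.decode("ascii", errors="replace")
    return counts, examples
```

Reading the sample with open(path, "rb").read() and looking at the most frequent byte values next to the words they occur in usually shows which letter each nonstandard code stands for.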

I checked Google and found that "The Tandy 1000 was a line of more or less IBM PC compatible home computer systems produced by the Tandy Corporation for sale in its Radio Shack chain of stores".

In that case your problems are not serious, I hope. Bulk character conversion can be done, for example, with CC, the Consistent Changes program, available at www.sil.org. You can send a sample file to me or to somebody else acquainted with CC (or similar software), and I'll make the special file, a conversion table, for it.

The latest version of CC is quite user-friendly. You launch the program, select the conversion table file (say, IOM.cct), select the source and destination files, and within a second the conversion is done.
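
For those without CC, the same idea of a conversion table can be sketched in Python. This is not CC's table syntax, and the byte values in LEGACY_TABLE are pure assumptions for illustration; the real table has to be worked out from a sample file.

```python
# Hypothetical table: legacy byte value -> Unicode character.
# These three entries are invented examples, not the actual Tandy codes.
LEGACY_TABLE = {
    0x86: "\u0161",  # assumed: s with caron
    0x87: "\u014B",  # assumed: eng
    0x88: "\u00E1",  # assumed: a with acute accent
}

def convert_legacy(data: bytes, table=LEGACY_TABLE) -> str:
    """Convert legacy-coded bytes to a Unicode string using the table."""
    out = []
    for b in data:
        if b < 0x80:
            out.append(chr(b))        # plain ASCII passes through unchanged
        elif b in table:
            out.append(table[b])      # known nonstandard code
        else:
            out.append("\uFFFD")      # unknown code: flag it for review
    return "".join(out)

def convert_file(src: str, dst: str, table=LEGACY_TABLE) -> None:
    """Read the source file as raw bytes and write the result as UTF-8."""
    with open(src, "rb") as f:
        data = f.read()
    with open(dst, "w", encoding="utf-8", newline="") as f:
        f.write(convert_legacy(data, table))
```

Unknown codes come out as U+FFFD rather than being dropped, so anything missing from the table is easy to find afterwards.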

In CC you can convert the text into the UTF-8 Unicode format, which is understood by modern Windows and MS Word. (Although I prefer to work with plain ABC substitutes for Unicode characters, such as s^ for "esh", converting to Unicode at the end.)
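
The final step from ABC substitutes to Unicode could look roughly like this in Python. Only s^ comes from this message; the other entries, and the choice of U+0161 (s with caron) as the target for "esh", are assumptions for illustration and should be replaced by whatever the orthography requires.

```python
import re

# ASCII substitute -> Unicode character.
# "s^" is the example from the message; the rest are hypothetical.
SUBSTITUTES = {
    "s^": "\u0161",  # assumed target: s with caron
    "S^": "\u0160",  # hypothetical: capital S with caron
    "z^": "\u017E",  # hypothetical: z with caron
}

# Try longer substitutes first so that a short one never matches inside a longer one.
_PATTERN = re.compile(
    "|".join(re.escape(k) for k in sorted(SUBSTITUTES, key=len, reverse=True))
)

def to_unicode(text: str) -> str:
    """Replace every ASCII substitute in the text with its Unicode character."""
    return _PATTERN.sub(lambda m: SUBSTITUTES[m.group(0)], text)
```

Writing the result with open(path, "w", encoding="utf-8") then gives a UTF-8 file.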

An additional problem may arise if the texts are not in plain text format but in some proprietary format with formatting information (font faces, font sizes, bold / italics, etc.) added. Then additional work must be done to strip the text of this formatting. If it is similar to Bushotter (+TA*KU T'HEPYA*PI KEYA*PI'? , NA'4 U'4* PAHA* KI'4 HE* ), or even has some weird (but consistently added) characters, then it is OK.

Toksha akhe
Constantine Chmielnicki
wablenica at mail.ru
  
======= At 2005-09-13, 05:20:16 you wrote: =======

>I was looking over government grants, and it seems that there was an E-MELD 
>conference for the purpose of standardizing the documentation of languages, 
>especially endangered languages.
>Many people are already well into the composing of their particular 
>language dictionary.  The E-MELD conference proposes a number of standards, 
>called "best practices", which include writing all dictionaries, and other 
>language work, using Unicode fonts.
>The thought is a good one, that one would no longer have the problem of 
>corruption in the transfer of fonts/characters from one PC system to 
>another.  Whatever fonts, diacritics, accents etc. one writes in 
>using Unicode (latest version 4.0.0), the same will be received 
>and viewed upon the receiving PC exactly as it was written at the source of 
>origin.  Of course, that will happen now when any PC shares the 
>same fonts as the sender.
>Some of us encountered this problem as we upgraded systems.  My initial 
>Ioway ~ Otoe-Missouria Dictionary, a Siouan Language, was written with a 
>Tandy from Radio Shack, Inc, which is now an antique system.  Those 
>records composed on the Tandy can no longer be read by my present PC. 
>Fortunately, I had already converted them to a higher Windows version.  Yet, 
>in some cases, accents and several special fonts were mutated regardless.
>What are the thoughts of those who are well into their dictionary work and 
>may be confronted with the task of redoing it all over again in the Unicode 
>fonts?  Is it not unlike the large nations imposing their national language 
>on the minority languages, Tagalog, English, Japanese, et al., on the 
>individual Filipino, the Native American and Spanish/Chinese Americans or 
>the Ainu?  The plan for a standard is well meant, but devaluation sets the 
>course for the minority community language to become an endangered language, 
>and with that, a whole culture, world view and way of thinking.  Perhaps it 
>is not the same thing.  What are the thoughts of others, especially those 
>who have had to already go back into their documents and reedit the whole 
>work?
>Jimm 
>
>
>
