Digital race to save languages

Harold F. Schiffman haroldfs at ccat.sas.upenn.edu
Tue Jan 6 19:21:11 UTC 2004


Story from BBC NEWS:

     Digital race to save languages

     By Andy Webster in Melbourne

     Researchers are fighting against time to save decades of data on the
world's endangered languages from ending up on the digital scrap heap.
Computer scientist and linguist Professor Steven Bird of Melbourne
University says most computer files, documents and original digital
recordings created more than 10 years ago are now virtually irretrievable.

     Linguists are worried because they have been enthusiastic digital
pioneers.  Attracted by ever smaller, lighter equipment and vastly
improved storage capacity, field researchers have graduated from
handwritten notes and wire recordings to laptops, mini-discs, DAT tape and
MP3.  "We are sitting between the onset of the digital era and the mass
extinction of the world's languages," said Prof Bird.

     "The window of opportunity is small and shutting fast."

     Languages disappearing

     "The problem is we are unable to ensure the digital storage lasts for
more than five to 10 years because of problems with new media formats, new
binary data formats used by software applications and the possibility that
magnetic storage just simply degrades over time," said Professor Bird.


     There are a number of initiatives across the world to ensure that
endangered languages are saved for future generations.  "Linguists
estimate that if we don't do anything, half of the world's languages will
disappear in the next 100 years," said Professor Peter Austin of the
School of Oriental and African Studies at the University of London.
"There are currently about 6,500 languages in the world, so that's 3,000
languages completely going, lost forever," he told the BBC programme Go
Digital.

     Professor Bird is involved in the Open Language Archives Community
(OLAC), an attempt to create an international network of internet-based
digital archives, using tailor-made software designed to be future-proof.
"We're devising ways of storing linguistic information using XML or
Extensible Markup Language, which is basically a language for representing
data on the web," said Prof Bird.  "XML is an open format that we can be
sure will be accessible indefinitely into the future."
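
     As a rough illustration of the idea Prof Bird describes, a single
dictionary entry could be written out as plain XML text and read back
later by any XML parser.  The sketch below uses Python's standard library.
The element names (entry, headword, gloss) and the helper functions are
hypothetical and are not taken from OLAC's actual schema, and the values
are placeholders rather than real language data.

```python
import xml.etree.ElementTree as ET


def build_entry(headword, gloss, language_code):
    """Build one dictionary entry as an XML element.

    The element and attribute names are illustrative assumptions,
    not an OLAC or other published schema.
    """
    entry = ET.Element("entry", attrib={"lang": language_code})
    ET.SubElement(entry, "headword").text = headword
    ET.SubElement(entry, "gloss").text = gloss
    return entry


def to_text(entry):
    """Serialise the element to a plain Unicode string."""
    return ET.tostring(entry, encoding="unicode")


def from_text(xml_text):
    """Parse the string back into a simple dictionary."""
    entry = ET.fromstring(xml_text)
    return {
        "lang": entry.get("lang"),
        "headword": entry.findtext("headword"),
        "gloss": entry.findtext("gloss"),
    }


if __name__ == "__main__":
    # "und" is the ISO 639 code for an undetermined language.
    record = build_entry("example-word", "example gloss", "und")
    print(to_text(record))
```

     Because the stored form is ordinary tagged text, the record does not
depend on the application that created it.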

     Cultural sensitivities

     Researchers across the world see the potential of XML, but are aware
of the burden this places on them.  "When you record material in MP3
format now, what will happen in five years' time when a new format comes
along?" asked Prof Austin.  "The real challenge for us as archivists is to
constantly upgrade the video, audio and image files that we have so that
they can be integrated with these new XML documents," he said.

     There are problems, however, with using the internet as a storage
medium.  Many indigenous communities fear it could lead to unrestricted
access to culturally sensitive material, such as sacred stories, which
could be abused or exploited, perhaps for commercial gain.  Professor Bird
says linguists recognise it is not a good idea to put sensitive material
onto the internet without any safeguards.

     "We are [looking at] the technologies used in internet banking for
secure transfer and control - right at the point this material is first
captured."  In theory, a field researcher would enter information about
future restrictions as the material is recorded or written down and those
safeguards would accompany the recording right through the data chain.
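
     One way to picture restrictions travelling with the material is a
record whose access rules are set at capture time and copied unchanged
whenever the file is converted to a new format.  The sketch below is only
an illustration of that idea: the class, field and function names are
hypothetical, and it does not implement the encryption or secure transfer
used in internet banking.

```python
from dataclasses import dataclass, replace
from typing import FrozenSet


@dataclass(frozen=True)
class Recording:
    """A captured item plus the restrictions entered when it was recorded.

    All names here are illustrative assumptions, not an existing system.
    """
    title: str
    media_format: str
    # Groups allowed to access the item; an empty set means that no
    # restriction was entered at capture time.
    restricted_to: FrozenSet[str]


def convert(recording, new_format):
    """Return a copy in a new format; the restrictions are carried along."""
    return replace(recording, media_format=new_format)


def may_access(recording, group):
    """Allow access if the item is unrestricted or the group is listed."""
    return not recording.restricted_to or group in recording.restricted_to


if __name__ == "__main__":
    original = Recording("sacred story", "mp3", frozenset({"community-elders"}))
    migrated = convert(original, "wav")
    print(may_access(migrated, "community-elders"))
    print(may_access(migrated, "general-public"))
```

     The record is immutable, so a format migration cannot silently drop
the restrictions that were entered when the material was first captured.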

     http://news.bbc.co.uk/go/pr/fr/-/2/hi/technology/2857041.stm

     Published: 2003/03/20 09:02:40 GMT

     © BBC MMIV


