SL Recognition system
mike.golebiewski at VERIZON.NET
Thu Jan 23 05:39:22 UTC 2003
Here is another good link on the topic:
There are a few interesting things to consider as one starts to look into this area. I did some initial research and pulled back a while ago.
First off, cameras are bound to be somewhat unreliable. Making microphones work required significant research into noise reduction, and as far as I am aware that work has not yet been done for visual recognition systems. Focusing on only the signer, and not things in the background, is therefore likely to be a problem. If possible, VR devices such as gloves, which can track movements more precisely and without "noise", are likely to be more reliable in the near term.
Second, AI approaches are likely to yield better results. Recognition of "phonemes" (elements of a sign), with a search based on that alone, is not likely to work well. Most of the more successful speech recognition systems are based on neural networks for this reason.
I am more than willing to talk offline about what I have learned from my initial research. It is quite limited in the specific area of sign recognition, but I have experience in very similar areas and have touched on this one.
I am even willing to volunteer some time to help out if you decide to kick off such a project.
----- Original Message -----
From: Wayne Smith
To: SW-L at ADMIN.HUMBERC.ON.CA
Sent: Monday, January 20, 2003 9:37 PM
Subject: SL Recognition system
.....but wouldn't it be wonderful if we could sign something on a camera attached to the computer and it would turn it into written SignWriting? Is there "Sign Language Recognition Software" developed yet? I know people are working on it...if so then later we could try to coordinate it with SignWriting.
Well, I know of one project something like that in Taiwan. Here's an abstract of the dissertation of one Liang Rung-huei (whom I don't know) who appears to be doing just that.
A Real-time Continuous Gesture Recognition System for
Taiwanese Sign Language
Student: 梁容輝 (Liang Rung-huei)   Advisor: 歐陽明
In this dissertation, a sign language interpreter is built for Taiwanese Sign Language (TWL). The system is based on the fundamental vocabulary and training sentences in the sign language textbook used by the first grade of elementary schools in Taiwan. An instrumented glove, VPL's DataGlove, is used to capture hand configurations for real-time recognition with a statistical approach. The major contributions of the proposed system are: (1) it solves the important end-point detection problem in a stream of hand motion and thus enables real-time continuous gesture recognition; (2) it is the first system to take a full sign language into consideration, instead of focusing on a small set or a self-defined set of gestures; (3) it is the first system that aims at automatic recognition of Taiwanese Sign Language (TWL).

To meet the requirements of a large sign language vocabulary and to overcome the limitations of current gesture recognition technologies, three concepts from statistical language learning are applied: segmentation, hidden Markov models, and a grammar model. Segmentation is done by monitoring time-varying parameters. Hidden Markov models are built for each sub-gesture model, and a bigram scheme is applied to adjacent gestures. Each gesture is decomposed into four sub-gesture models: posture, position, orientation, and motion. In TWL, there are 51 fundamental postures, 22 basic positions, 6 typical orientations, and about 5 motion types. The system uses the posture sequence as the stem of input gestures, and the sub-gesture models are then recognized simultaneously. We have implemented a system that includes a lexicon of 250 vocabulary items and 196 training sentences in Taiwanese Sign Language (TWL). The system requires a posture-training phase for each user. Hidden Markov models (HMMs) for 51 fundamental postures, 6 orientations, and 12 motion primitives are implemented.
The recognition rates are 95%, 90.1%, and 87.5% for the posture, orientation, and motion models respectively, and the recognition rate for an isolated gesture is 82.8%, rising to 94.8% if the decision is within the top three candidates. A sentence of gestures based on these vocabularies can be continuously recognized in real time, and the average recognition rates on inside tests are 75.1% for phrases (2.66 gestures per sentence on average) and 82.5% for sentences (4.67 gestures per sentence on average). However, if the top three candidates are taken into account, the recognition rates described above become 82.5% and 86.4%, for an average recognition rate of 84.6%.
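The decoding scheme the abstract describes, independent sub-gesture model scores combined per segment, with a bigram grammar linking adjacent gestures, can be sketched roughly as follows. This is a toy illustration, not the dissertation's implementation: the three-word lexicon, the log-likelihood scores, and the bigram probabilities are all invented for the example, and full HMM emission scoring is reduced to precomputed per-segment scores.

```python
import math

# Hypothetical toy lexicon (the real system has 250 vocabulary items).
LEXICON = ["HELLO", "THANK", "YOU"]

# Invented bigram log-probabilities P(next | prev); "<s>" marks sentence start.
BIGRAM = {
    ("<s>", "HELLO"): math.log(0.6), ("<s>", "THANK"): math.log(0.3),
    ("<s>", "YOU"): math.log(0.1),
    ("HELLO", "HELLO"): math.log(0.2), ("HELLO", "THANK"): math.log(0.5),
    ("HELLO", "YOU"): math.log(0.3),
    ("THANK", "HELLO"): math.log(0.1), ("THANK", "THANK"): math.log(0.1),
    ("THANK", "YOU"): math.log(0.8),
    ("YOU", "HELLO"): math.log(0.4), ("YOU", "THANK"): math.log(0.4),
    ("YOU", "YOU"): math.log(0.2),
}

def combine(sub_scores):
    # Sub-gesture models (posture, position, orientation, motion) are
    # treated as independent, so their log-likelihoods simply add.
    return {g: sum(m[g] for m in sub_scores) for g in LEXICON}

def decode(segments):
    """Viterbi search over a segmented gesture stream with a bigram model.

    `segments` is a list of dicts {model_name: {gesture: log_likelihood}},
    one dict per already-segmented gesture.
    """
    # Map: last gesture -> (best log-score, best path ending in it).
    prev = {"<s>": (0.0, [])}
    for seg in segments:
        emit = combine(list(seg.values()))
        cur = {}
        for g in LEXICON:
            cur[g] = max(
                (score + BIGRAM[(p, g)] + emit[g], path + [g])
                for p, (score, path) in prev.items()
            )
        prev = cur
    return max(prev.values())[1]

# Two segments whose (made-up) sub-gesture scores favor THANK, then YOU.
seg1 = {"posture": {"HELLO": -2.0, "THANK": -1.0, "YOU": -3.0},
        "motion":  {"HELLO": -2.0, "THANK": -1.0, "YOU": -3.0}}
seg2 = {"posture": {"HELLO": -3.0, "THANK": -2.0, "YOU": -1.0},
        "motion":  {"HELLO": -3.0, "THANK": -2.0, "YOU": -1.0}}
print(decode([seg1, seg2]))  # -> ['THANK', 'YOU']
```

Note that the bigram term is what makes this "continuous" recognition rather than isolated-gesture classification: a weak per-segment score can be overridden by a strong transition from the previous gesture, which is presumably why the sentence-level rates in the abstract differ from the isolated-gesture rate.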