<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML xmlns:o = "urn:schemas-microsoft-com:office:office"><HEAD>
<META http-equiv=Content-Type content="text/html; charset=utf-8">
<META content="MSHTML 6.00.2722.900" name=GENERATOR>
<STYLE></STYLE>
</HEAD>
<BODY bgColor=#ffffff>
<DIV><FONT face=Arial size=2>Here is another good link on the
topic:</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2><A
href="http://www.cse.unsw.edu.au/~waleed/gsl-rec/">http://www.cse.unsw.edu.au/~waleed/gsl-rec/</A></FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>There are a few interesting things to consider as
one begins to look into this area. I started some initial research and pulled
back a while ago. </FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>First off, cameras are bound to be a little
unreliable. To make microphones work, there was significant research in noise
reduction. As far as I am aware, this work has not been done for visual
recognition systems. Therefore, focusing on only the signer and not things in
the background is likely to be a problem. If possible, VR devices like gloves,
which can track movements more precisely without "noise", are likely to be more
reliable in the near term. </FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>Second, AI approaches are likely to have better
results. Recognition of "phonemes" (elements of a sign) and search based on that
alone are not as likely to give good results. Most speech recognition
systems that have had greater success are based on neural networks for this
reason. </FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>I am more than willing to talk more offline about
what I have learned from my initial research. It is quite limited in the
specific area of sign recognition. However, I have experience in very similar
areas and have touched on this one. </FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>I am even willing to volunteer some time to help
out if you seek to kick off such a product. </FONT></DIV>
<BLOCKQUOTE dir=ltr
style="PADDING-RIGHT: 0px; PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #000000 2px solid; MARGIN-RIGHT: 0px">
<DIV style="FONT: 10pt arial">----- Original Message ----- </DIV>
<DIV
style="BACKGROUND: #e4e4e4; FONT: 10pt arial; font-color: black"><B>From:</B>
<A title=wayne@MRLANGUAGE.COM href="mailto:wayne@MRLANGUAGE.COM">Wayne
Smith</A> </DIV>
<DIV style="FONT: 10pt arial"><B>To:</B> <A title=SW-L@ADMIN.HUMBERC.ON.CA
href="mailto:SW-L@ADMIN.HUMBERC.ON.CA">SW-L@ADMIN.HUMBERC.ON.CA</A> </DIV>
<DIV style="FONT: 10pt arial"><B>Sent:</B> Monday, January 20, 2003 9:37
PM</DIV>
<DIV style="FONT: 10pt arial"><B>Subject:</B> SL Recognition system</DIV>
<DIV><BR></DIV>
<DIV><FONT face=Arial size=2>Valerie wrote:</FONT></DIV>
<DIV><FONT face=Arial size=2>.....</FONT>but wouldn't it be wonderful if we
could sign something on a camera attached to the computer and it would turn it
into written SignWriting? Is there "Sign Language Recognition Software"
developed yet? I know people are working on it...if so then later we could try
to coordinate it with SignWriting.</DIV>
<DIV> </DIV>
<DIV><FONT face=Arial size=2>Well, I know of one project something like that
in Taiwan. Here's an abstract of the dissertation of one Liang Rung-huei
(whom I don't know) who appears to be doing just that.</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2> -
Wayne</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial
size=2>-----------------------------------------------</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV>
<P class=MsoNormal style="MARGIN: 0in 0in 0pt; TEXT-ALIGN: center"
align=center><B style="mso-bidi-font-weight: normal">A Real-time Continuous
Gesture Recognition System for <BR>Taiwanese Sign Language<o:p></o:p></B></P>
<P class=MsoNormal style="MARGIN: 0in 0in 0pt; TEXT-ALIGN: center"
align=center>Student: <SPAN lang=ZH-TW
style="FONT-FAMILY: PMingLiU; mso-ascii-font-family: 'Times New Roman'; mso-hansi-font-family: 'Times New Roman'">梁容輝</SPAN>
Advisor: <SPAN lang=ZH-TW
style="FONT-FAMILY: PMingLiU; mso-ascii-font-family: 'Times New Roman'; mso-hansi-font-family: 'Times New Roman'">歐陽明</SPAN></P>
<P class=MsoNormal style="MARGIN: 0in 0in 0pt; TEXT-ALIGN: center"
align=center><SPAN lang=ZH-TW
style="FONT-FAMILY: PMingLiU; mso-ascii-font-family: 'Times New Roman'; mso-hansi-font-family: 'Times New Roman'">國立臺灣大學資訊工程學研究所</SPAN></P>
<P class=MsoNormal style="MARGIN: 0in 0in 0pt; TEXT-ALIGN: center"
align=center>Abstract</P>
<P class=MsoNormal style="MARGIN: 0in 0in 0pt"> <o:p></o:p></P><SPAN
style="FONT-SIZE: 12pt; FONT-FAMILY: 'Times New Roman'; mso-bidi-font-size: 10.0pt; mso-fareast-font-family: PMingLiU; mso-font-kerning: 1.0pt; mso-fareast-language: ZH-TW; mso-ansi-language: EN-US; mso-bidi-language: AR-SA">In
this dissertation, a sign language interpreter is built for Taiwanese Sign
Language (TWL). This system is based on the fundamental vocabularies and
training sentences in the text book of sign language used by the first grade
of elementary schools in Taiwan. An instrumented glove, VPL’s DataGlove, is
used in the system to capture hand configurations for real-time recognition
with statistical approach. The major contributions of the proposed system are:
(1) it solves the important end-point detection problem in a stream of hand
motion and thus enables real-time continuous gesture recognition; (2) this
system is the first one to take a full set of sign language into
consideration, instead of focusing on a small set or a self-defined set of
gestures; (3) this is the first system that aims at automatic recognition of
Taiwanese Sign Language (TWL). To meet the requirements of large set of
vocabularies in sign language and to overcome the limitations of current
technologies in gesture recognition, three concepts in statistical language
learning are proposed: segmentation, hidden Markov model, and grammar model.
Segmentation is done by a strategy of monitoring time-varying parameters.
Hidden Markov models are built for each sub-gesture model, and a bigram scheme
is applied to adjacent gestures. Each gesture is decomposed into four
sub-gesture models: posture, position, orientation, and motion. In TWL, there
are 51 fundamental postures, 22 basic positions, 6 typical orientations, and
about 5 motion types. The system uses posture sequence as a stem of input
gestures and then sub-gesture models are recognized simultaneously. We have
implemented a system that includes a lexicon of 250 vocabularies, and 196
training sentences in Taiwanese Sign Language (TWL). This system requires a
training phase of postures for each user. Hidden Markov models (HMMs) for 51
fundamental postures, 6 orientations, and 12 motion primitives are
implemented. The recognition rates are 95%, 90.1%, and 87.5%, for posture,
orientation, and motion models respectively, and the recognition rate of an
isolated gesture is 82.8% and becomes 94.8% if the decision is within top
three candidates. A sentence of gestures based on these vocabularies can be
continuously recognized in real-time and the average recognition rates of
inside tests are 75.1% for phrases (in average 2.66 gestures per sentence) and
82.5% for sentences (in average 4.67 gestures per sentence). However, if top
three candidates are taken into account, the recognition rates described above
are 82.5% and 86.4%, and the average recognition rate is
84.6%.</SPAN></DIV></BLOCKQUOTE></BODY></HTML>