Conf: INTERSPEECH 2014, Keynote speakers, September 14-18, 2014, Singapore

Thierry Hamon hamon at LIMSI.FR
Wed Jun 25 08:36:19 UTC 2014

Date: Mon, 23 Jun 2014 14:08:03 +0800
From: "Organization @ Interspeech 2014" <organization at>
Message-ID: <53A7C443.8060302 at>

-            INTERSPEECH 2014 - SINGAPORE           -
-               September 14-18, 2014               -

ISCA, COLIPS and the Organizing Committee of INTERSPEECH 2014 are proud
to announce that INTERSPEECH 2014 will feature five plenary talks by
internationally renowned experts.

- keynote speech
   by the ISCA Medallist 2014

- "Decision Learning in Data Science:
   Where John Nash Meets Social Media"
   by Professor K. J. Ray Liu

- "Language Diversity: Speech Processing In A Multi-Lingual Context"
   by Dr. Lori Lamel

- "Sound Patterns In Language"
   by Professor William Shi-Yuan WANG 王士元

- "Achievements and Challenges of Deep Learning
   From Speech Analysis And Recognition To Language
   And Multimodal Processing"
   by Dr. Li DENG

Details of the keynote speeches and biographies of the presenters are
given below.

Looking forward to welcoming you in Singapore,
The Organizing Committee

* On Monday, 15th of September                                         *

The ISCA Medallist 2014 will give a keynote speech.
The name of the Medallist and subject of the talk will be
disclosed on the first day of INTERSPEECH 2014.

* On Tuesday morning, 16th of September                                *

Professor K. J. Ray Liu
Department of Electrical and Computer Engineering
University of Maryland, College Park

will give a presentation on:

"Decision Learning in Data Science: Where John Nash Meets Social Media"


     With the increasing ubiquity and power of mobile devices, as well
     as the prevalence of social media, more and more activities in our
     daily life are being recorded, tracked, and shared, creating the
     notion of “social media”.  Such abundant and still growing real
     life data, known as “big data”, provide a tremendous research
     opportunity in many fields.
     To analyze, learn and understand such user-generated big data,
     machine learning has been an important tool and various
     machine learning algorithms have been developed.
     However, user-generated big data is the outcome of users’
     decisions, actions and socio-economic interactions, which are
     highly dynamic. Without considering users’ local behaviours and
     interests, existing learning approaches tend to focus on optimizing
     a global objective function at the macro-economic level, while
     totally ignoring users’ local decisions at the micro-economic
     level. As such, there is a growing need to bridge machine/social
     learning with strategic decision making, which are two
     traditionally distinct research disciplines, to be able to jointly
     consider both global phenomena and local effects to
     understand/model/analyze better the newly arising issues in the
     emerging social media. In this talk, we present the notion of
     “decision learning” that can involve users’ behaviours and
     interactions by combining learning with strategic decision making.
     We will discuss some examples from social media with real data to
     show how decision learning can be used to better analyze users’
     optimal decisions from a user’s perspective as well as to design a
     mechanism from the system designer’s perspective to achieve a
     desirable outcome.

Biography of the speaker

     Dr. K. J. Ray Liu was named a Distinguished Scholar-Teacher of the
     University of Maryland in 2007, where he is the Christine Kim
     Eminent Professor of Information Technology.
     He leads the Maryland Signals and Information Group conducting
     research encompassing broad areas of signal processing and
     communications with recent focus on cooperative communications,
     cognitive networking, social learning and decision making, and
     information forensics and security. Dr. Liu has received numerous
     honours and awards including IEEE Signal Processing Society 2009
     Technical Achievement Award and various best paper awards from IEEE
     Signal Processing, Communications, and Vehicular Technology
     Societies, and EURASIP. A Fellow of the IEEE and AAAS, he is
     recognized by Thomson Reuters as an ISI Highly Cited Researcher.
     Dr. Liu was the President of IEEE Signal Processing Society, the
     Editor-in-Chief of IEEE Signal Processing Magazine and the founding
     Editor-in-Chief of EURASIP Journal on Advances in Signal
     Processing. Dr. Liu also received various research and teaching
     recognitions from the University of Maryland, including Poole and
     Kent Senior Faculty Teaching Award, Outstanding Faculty Research
     Award, and Outstanding Faculty Service Award, all from A. James
     Clark School of Engineering; and Invention of the Year Award (three
     times) from Office of Technology Commercialization.

* On Tuesday afternoon, 16th of September                              *

Dr. Lori Lamel
Senior Research Scientist (DR1), LIMSI-CNRS

will give a presentation on

"Language Diversity: Speech Processing In A Multi-Lingual Context"


     Speech processing encompasses a variety of technologies
     that automatically process speech for some downstream processing.
     These technologies include identifying the language or dialect
     spoken, the person speaking, what is said and how it is said.  The
     downstream processing may be limited to a transcription or to a
     transcription enhanced with additional meta-data, or may be used to
     carry out an action or interpreted within a spoken dialogue system
     or more generally for analytics.  With the availability of large
     spoken multimedia or multimodal data there is growing interest in
     using such technologies to provide structure and random access to
     particular segments. Automatic tools can also serve to annotate
     large corpora for exploitation in linguistic studies of spoken
     language, such as acoustic-phonetics, pronunciation variation and
     diachronic evolution, permitting the validation of hypotheses and
     In this talk I will present some of my experience with speech
     processing in multiple languages, drawing upon progress in the
     context of several research projects, most recently the Quaero
     program and the IARPA Babel program, both of which address the
     development of technologies in a variety of languages, with the aim
     of highlighting some recent research directions and challenges.

Biography of the speaker

     I am a senior research scientist (DR1) at the CNRS, which I joined
     as a permanent researcher at LIMSI in October 1991.
     I received my Ph.D. degree in Electrical Engineering and Computer
     Science in May 1988 from the Massachusetts Institute of Technology.
     My research activities focus on large vocabulary speaker-
     independent, continuous speech recognition in multiple languages
     with a recent focus on low-resourced languages; lightly supervised
     and unsupervised acoustic model training methods; studies in
     acoustic-phonetics; and lexical and pronunciation modelling. I
     contributed to the design and realization of large speech corpora
     (TIMIT, BREF, TED). I have been actively involved in research
     projects, most recently leading the activities on speech
     processing in the OSEO
     Quaero program, and I am currently co-principal investigator for
     LIMSI as part of the IARPA Babel Babelon team led by BBN.
     I served on the Steering Committee for Interspeech 2013 as
     co-technical program chair along with Pascal Perrier, and I am now
     serving on the Technical Program Committee of Interspeech 2014.

* On Wednesday, 17th of September                                      *

Professor William Shi-Yuan WANG 王士元
Centre for Language and Human Complexity,
Chinese University of Hong Kong
Professor Emeritus, University of California at Berkeley
Honorary Professor, Peking University
Academician, Academia Sinica

will give a presentation about

"Sound Patterns In Language"


     In contrast to other species, humans are unique in having developed
     thousands of diverse languages which are not mutually
     intelligible. However, any infant can learn any language with ease,
     because all languages are based upon common biological
     infrastructures of sensori-motor, memorial, and cognitive
     faculties.  While languages may differ significantly in the sounds
     they use, the overall organization is largely the same.
     It is divided into a discrete segmental system for building words
     and a continuous prosodic system for expressing phrasing,
     attitudes, and emotions. Within this organization, I will discuss a
     class of languages called 'tone languages', which make special use
     of F0 to build words.  Although the best known of these is Chinese,
     tone languages are found in many parts of the world, and operate on
     different principles. I will also comment on relations between
     sound patterns in language and sound patterns in music, the two
     worlds of sound universal to our species.

Biography of the speaker

     William S-Y. Wang received his early schooling in China, and his
     PhD from the University of Michigan.  He was appointed Professor of
     Linguistics at the University of California at Berkeley in 1965,
     and taught there for 30 years.
     Currently he is in the Department of Electronic Engineering and in
     the Department of Linguistics and Modern Languages of the Chinese
     University of Hong Kong, and Director of the newly established
     Joint Research Centre for Language and Human Complexity. His
     primary interest is the evolution of language from a multi-
     disciplinary perspective.

* On Thursday, 18th of September                                      *

Dr. Li DENG
Principal Researcher and Research Manager
Deep Learning Technology Centre,
Microsoft Research, Redmond, USA

will give a presentation on

"Achievements and Challenges of Deep Learning
 From Speech Analysis And Recognition To Language And Multimodal
 Processing"


     Artificial neural networks have been around for over half a century
     and their applications to speech processing have been almost as
     long, yet it was not until 2010 that their real impact was
     made by a deep form of such networks, built upon part of the
     earlier work on (shallow) neural nets and (deep) graphical models
     developed by both speech and machine learning communities. This
     keynote will first reflect on the path to this transformative
     success, sparked by speech analysis using deep learning methods on
     spectrogram-like raw features and then progressing rapidly to
     speech recognition with increasingly larger vocabularies and scale.
     The role of well-timed academic-industrial collaboration will be
     highlighted, as will the advances in big data, big compute, and
     the seamless integration between the application-domain knowledge
     of speech and general principles of deep learning. Then, an
     overview will be given on sweeping achievements of deep learning in
     speech recognition since its initial success in 2010 (as well as in
     image recognition and computer vision since 2012). Such
     achievements have resulted in across-the-board, industry-wide
     deployment of deep learning. The final part of the talk will look
     ahead towards stimulating new challenges of deep learning ---
     making intelligent machines capable of not only hearing (speech)
     and seeing (vision), but also of thinking with a “mind”; i.e.
     reasoning and inference over complex, hierarchical relationships
     and knowledge sources that comprise a vast number of entities and
     semantic concepts in the real world based in part on multi-sensory
     data from the user.  To this end, language and multimodal
     processing --- joint exploitation and learning from text,
     speech/audio, and image/video --- is evolving into a new frontier
     of deep learning, beginning to be embraced by a mixture of research
     communities including speech and spoken language processing,
     natural language processing, computer vision, machine learning,
     information retrieval, cognitive science, artificial intelligence,
     and data/knowledge management. A review of recently published studies
     will be provided on deep learning applied to selected language and
     multimodal processing tasks, with a trace back to the relevant
     early connectionist modelling and neural network literature and
     with future directions in this new exciting deep learning frontier
     discussed and analyzed.

Biography of the speaker

     Li Deng received his Ph.D. from the University of Wisconsin-Madison.
     He was a tenured professor (1989-1999) at the University of
     Waterloo, Ontario, Canada, and then joined Microsoft Research,
     Redmond, where he is currently a Principal Research Manager of its
     Deep Learning Technology Centre.
     Since 2000, he has also been an affiliate full professor at the
     University of Washington, Seattle, teaching computer speech
     processing. He has been granted over 60 US or international
     patents, and has received numerous awards and honours bestowed by
     IEEE, ISCA, ASA, and Microsoft including the latest IEEE SPS Best
     Paper Award (2013) on deep neural nets for speech recognition. He
     authored or co-authored 4 books including the latest one on Deep
     Learning: Methods and Applications. He is a Fellow of the
     Acoustical Society of America, a Fellow of the IEEE, and a Fellow
     of the ISCA. He served as the Editor-in-Chief for IEEE Signal
     Processing Magazine (2009-2011), and currently as Editor-in-Chief
     for IEEE Transactions on Audio, Speech and Language Processing. His
     recent research interests and activities have been focused on deep
     learning and machine intelligence applied to large-scale text
     analysis and to speech/language/image multimodal processing,
     advancing his earlier work with collaborators on speech analysis
     and recognition using deep neural networks since 2009.

Message distributed by the Langage Naturel list <LN at>

The LN list is sponsored by ATALA (Association pour le Traitement
Automatique des Langues)

ATALA accepts no responsibility for the content of
messages posted to the LN list
