<html dir="ltr">

<head>

<meta http-equiv="Content-Type" content="text/html; charset=Windows-1252">

<style id="owaParaStyle" type="text/css">

<!--

p

        {margin-top:0;

        margin-bottom:0}

-->

P {margin-top:0;margin-bottom:0;}</style>

</head>

<body ocsi="0" fpstyle="1">

<div style="direction: ltr;font-family: Tahoma;color: #000000;font-size: 10pt;">Workshop On Vision And Language 2014 (VL'14), Dublin, 23rd August 2014

<br>

<div style="font-family: Times New Roman; color: #000000; font-size: 16px">

<div>

<div style="direction:ltr; font-family:Tahoma; color:#000000; font-size:10pt">

<div style="font-family:Times New Roman; color:#000000; font-size:16px">

<div>

<div style="direction:ltr; font-family:Tahoma; color:#000000; font-size:10pt">

<div style="font-family:Times New Roman; color:#000000; font-size:16px">

<div>

<div style="direction:ltr; font-family:Tahoma; color:#000000; font-size:10pt">

<div style="font-family:Times New Roman; color:#000000; font-size:16px">

<div>

<div style="direction:ltr; font-family:Tahoma; color:#000000; font-size:10pt">

<div style="font-family:Times New Roman; color:#000000; font-size:16px">

<div>

<div style="direction:ltr; font-family:Tahoma; color:#000000; font-size:10pt">

<div style="font-family:Times New Roman; color:#000000; font-size:16px">

<div>

<div>

<div style="direction:ltr; font-family:Tahoma; color:#000000; font-size:10pt">

<div style="font-family:Times New Roman; color:#000000; font-size:16px">

<div>

<div style="direction:ltr; font-family:Tahoma; color:#000000; font-size:10pt">

<div style="font-family:Times New Roman; color:#000000; font-size:16px">

<div>

<div style="direction:ltr; font-family:Tahoma; color:#000000; font-size:10pt"><br>

The 3rd Annual Meeting Of The EPSRC Network On Vision & Language and The 1st Technical Meeting of the European Network on Integrating Vision and Language<br>

<br>

A Workshop of the 25th International Conference on Computational Linguistics (COLING 2014)

<br>

<br>

<br>

Final Call for Participation<br>

<br>

<br>

KEYNOTE SPEAKER: ALEX JAIMES, YAHOO INC.<br>

<br>

<br>

Workshop Programme:<br>

<br>

*** 09.00 - 09.15 Introduction and Welcome to Workshop<br>

<br>

*** 09.15 - 10.30 Oral Papers Session 1: Interaction<br>

<br>

The Effect of Sensor Errors in Situated Human-Computer Dialogue<br>

Niels Schütte, John Kelleher and Brian Mac Namee<br>

<br>

Joint Navigation in Commander/Robot Teams: Dialog & Task Performance When Vision is Bandwidth-Limited<br>

Douglas Summers-Stay, Taylor Cassidy and Clare Voss<br>

<br>

TUHOI: Trento Universal Human Object Interaction Dataset<br>

Dieu-Thu Le, Jasper Uijlings and Raffaella Bernardi<br>

<br>

*** 10.30 - 11.00 Morning Coffee<br>

<br>

*** 11.00 - 11.40 Invited Keynote Talk - Alex Jaimes, Yahoo! Inc.<br>

<br>

*** 11.40 - 12.30 Oral Papers Session 2: Language Descriptors<br>

<br>

Concept-oriented labelling of patent images based on Random Forests and proximity-driven generation of synthetic data<br>

Dimitris Liparas, Anastasia Moumtzidou, Stefanos Vrochidis and Ioannis Kompatsiaris<br>

<br>

Exploration of functional semantics of prepositions from corpora of descriptions of visual scenes<br>

Simon Dobnik and John Kelleher<br>

<br>

*** 12.30 - 13.30 Lunch<br>

<br>

*** 13.30 - 14.20 Oral Papers Session 3: Visual Indexing<br>

<br>

A Poodle or a Dog? Evaluating Automatic Image Annotation Using Human Descriptions at Different Levels of Granularity<br>

Josiah Wang, Fei Yan, Ahmet Aker and Robert Gaizauskas<br>

<br>

Key Event Detection in Video using ASR and Visual Data<br>

Niraj Shrestha, Aparna N. Venkitasubramanian and Marie-Francine Moens<br>

<br>

*** 14.20 - 15.00 Poster Boasters<br>

<br>

*** 15.30 - 17.00 Long Poster Papers (Parallel session)<br>

<br>

Twitter User Gender Inference Using Combined Analysis of Text and Image Processing<br>

Shigeyuki Sakaki, Yasuhide Miura, Xiaojun Ma, Keigo Hattori and Tomoko Ohkuma<br>

<br>

Semantic and geometric enrichment of 3D geo-spatial models with captioned photos and labelled illustrations<br>

Chris Jones, Paul Rosin and Jonathan Slade<br>

<br>

Weakly supervised construction of a repository of iconic images<br>

Lydia Weiland, Wolfgang Effelsberg and Simone Paolo Ponzetto<br>

<br>

Cross-media Cross-genre Information Ranking based on Multi-media Information Networks<br>

Tongtao Zhang, Haibo Li, Hongzhao Huang, Heng Ji, Min-Hsuan Tsai, Shen-Fu Tsai and Thomas Huang<br>

<br>

Speech-accompanying gestures in Russian: functions and verbal context<br>

Yulia Nikolaeva<br>

<br>

DALES: Automated Tool for Detection, Annotation, Labelling, and Segmentation of Multiple Objects in Multi-Camera Video Streams<br>

Mohammad Bhat and Joanna Isabelle Olszewska<br>

<br>

A Hybrid Segmentation of Web Pages for Vibro-Tactile Access on Touch-Screen Devices<br>

Waseem SAFI, Fabrice Maurel, Jean-Marc Routoure, Pierre Beust and Gaël Dias<br>

<br>

*** 15.30 - 17.00 Short Poster Papers (Parallel session)<br>

<br>

Expression Recognition by Using Facial and Vocal Expressions<br>

Gholamreza Anbarjafari and Alvo Aabloo<br>

<br>

Formulating Queries for Collecting Training Examples in Visual Concept Classification<br>

Kevin McGuinness, Feiyan Hu, Rami Albatal and Alan Smeaton<br>

<br>

Towards Succinct and Relevant Image Descriptions<br>

Desmond Elliott<br>

<br>

Coloring Objects: Adjective-Noun Visual Semantic Compositionality<br>

Dat Tien Nguyen, Angeliki Lazaridou and Raffaella Bernardi<br>

<br>

Multi-layered Image Representation for Image Interpretation<br>

Marina Ivasic-Kos, Miran Pobar and Ivo Ipsic<br>

<br>

The Last 10 Metres: Using Visual Analysis and Verbal Communication in Guiding Visually Impaired Smartphone Users to Entrances<br>

Anja Belz and Anil Bharath<br>

<br>

Keyphrase Extraction using Textual and Visual Features<br>

Yaakov HaCohen-Kerner, Stefanos Vrochidis, Dimitris Liparas, Anastasia Moumtzidou and Ioannis Kompatsiaris<br>

<br>

Towards automatic annotation of communicative gesturing<br>

Kristiina Jokinen and Graham Wilcock<br>

<br>

<br>

Background<br>

<br>

Fragments of natural language, in the form of tags, captions, subtitles, surrounding text or audio, can aid the interpretation of image and video data by adding context or disambiguating visual appearance. In addition, labelled images are essential for training

 object or activity classifiers. On the other hand, visual data can help resolve challenges in language processing such as word sense disambiguation. Studying language and vision together can also provide new insight into cognition and universal representations

 of knowledge and meaning. Meanwhile, sign language and gestures are languages that require visual interpretation.

<br>

<br>

We welcome papers describing original research combining language and vision. To encourage the sharing of novel and emerging ideas we also welcome papers describing new datasets, grand challenges, open problems, benchmarks and work in progress as well as survey

 papers. <br>

<br>

Topics of interest include (but are not limited to): <br>

<br>

 * Image and video labelling and annotation <br>

 * Computational modelling of human vision and language <br>

 * Multimodal human-computer communication <br>

 * Language-driven animation <br>

 * Assistive methodologies <br>

 * Image and video description <br>

 * Image and video search and retrieval <br>

 * Automatic text illustration <br>

 * Facial animation from speech <br>

 * Text-to-image generation <br>

<br>

<br>

Contact<br>

<br>

Email: vl-net@brighton.ac.uk<br>

Website: https://vision.cs.bath.ac.uk/VL_2014/<br>

<br>

<br>

Organisers <br>

<br>

Anja Belz, University of Brighton <br>

Kalina Bontcheva, University of Sheffield <br>

Darren Cosker, University of Bath <br>

Frank Keller, University of Edinburgh <br>

Sien Moens, University of Leuven<br>

Alan Smeaton, Dublin City University<br>

William Smith, University of York <br>

<br>

<br>

Programme Committee<br>

<br>

Yannis Aloimonos, University of Maryland, US<br>

Dimitrios Makris, Kingston University, UK<br>

Desmond Elliott, University of Edinburgh, UK<br>

Tamara Berg, Stony Brook, US<br>

Claire Gardent, CNRS/LORIA, France<br>

Lewis Griffin, UCL, UK<br>

Brian Mac Namee, Dublin Institute of Technology, Ireland<br>

Margaret Mitchell, University of Aberdeen, UK<br>

Ray Mooney, University of Texas at Austin, US<br>

Chris Town, University of Cambridge, UK<br>

David Windridge, University of Surrey, UK<br>

Lucia Specia, University of Sheffield, UK<br>

John Kelleher, Dublin Institute of Technology, Ireland<br>

Sergio Escalera, Autonomous University of Barcelona, Spain<br>

Erkut Erdem, Hacettepe University, Turkey<br>

Isabel Trancoso, INESC-ID, Portugal<br>

<br>

<br>

The EPSRC Network On Vision And Language (V&L Net) - http://www.vl-net.org.uk/<br>

<br>

The EPSRC Network on Vision and Language (V&L Net) is a forum for researchers from the fields of Computer Vision and Language Processing to meet, exchange ideas, expertise and technology, and form new partnerships. Our aim is to create a lasting interdisciplinary

 research community situated at the language-vision interface, jointly working towards solutions for some of today's toughest computational challenges, including image and video search, description of visual content and text-to-image generation.<br>

<br>

<br>

The European Network on Integrating Vision and Language (iV&L Net) - http://www.cost.eu/domains_actions/ict/Actions/IC1307<br>

<br>

The explosive growth of visual and textual data (both on the World Wide Web and held in private repositories by diverse institutions and companies) has led to urgent requirements in terms of search, processing and management of digital content. Solutions for

 providing access to or mining such data depend on the semantic gap between vision and language being bridged, which in turn calls for expertise from two so far unconnected fields: Computer Vision (CV) and Natural Language Processing (NLP). The central goal

 of iV&L Net is to build a European CV/NLP research community, targeting four focus themes: (i) Integrated Modelling of Vision and Language for CV and NLP Tasks; (ii) Applications of Integrated Models; (iii) Automatic Generation of Image & Video Descriptions;

 and (iv) Semantic Image & Video Search. iV&L Net will organise annual conferences, technical meetings, partner visits, data/task benchmarking, and industry/end-user liaison. Europe has many of the world’s leading CV and NLP researchers. Tapping into this expertise,

 and bringing the collaboration, networking and community building enabled by COST Actions to bear, iV&L Net will have substantial impact, in terms of advances in both theory/methodology and real-world technologies.<br>

<br>

</div>

</div>

</div>

</div>

</div>

</div>

</div>

</div>

</div>

</div>

</div>

</div>

</div>

</div>

</div>

</div>

</div>

</div>

</div>

</div>

</div>

</div>

</div>

</div>

</div>

</div>

<br clear="both">


</body>

</html>