The 5th Workshop on Vision and Language (VL'16) will be held on August 12 and hosted by the 54th Annual Meeting of the Association for Computational Linguistics (ACL), in Berlin, Germany. The workshop is being organised by COST Action IC1307 The European Network on Integrating Vision and Language (iV&L Net).
Research involving both language and vision computing spans a variety of disciplines and applications, and goes back a number of decades. More recently, the big data era has given rise to a multitude of tasks in which vision and language are inherently linked. The explosive growth of visual and textual data, both online and in private repositories held by diverse institutions and companies, has created urgent requirements for the search, processing and management of digital content. Solutions for effectively accessing and mining such data depend on making the connection between visual and textual content interpretable, and hence on bridging the 'semantic gap' between vision and language.
One perspective has been the integrated modelling of language and vision, with approaches located at different points on a spectrum between structured, cognitive modelling at one end and unsupervised machine learning at the other; state-of-the-art results in many areas are currently being produced at the latter end, in particular by deep learning approaches.
Another perspective is exploring how knowledge about language can help with predominantly visual tasks, and vice versa. Visual interpretation can be aided by text associated with images/videos and knowledge about the world learned from language. On the NLP side, images can help ground language in the physical world, allowing us to develop models for semantics. Words and pictures are often naturally linked online and in the real world, and each modality can provide reinforcing information to aid the other.
The 5th Workshop on Vision and Language (VL'16) aims to address all the above, with a particular focus on the integrated modelling of vision and language. We welcome papers describing original research combining language and vision. To encourage the sharing of novel and emerging ideas we also welcome papers describing new data sets, grand challenges, open problems, benchmarks and work in progress, as well as survey papers.
Topics of interest include (in alphabetical order), but are not limited to:
13 January 2016: First Call for Workshop Papers
8 May 2016: Workshop Paper Due Date
5 June 2016: Notification of Acceptance
22 June 2016: Camera-ready papers due
12 August 2016: Workshop Date
09:00 - 10:30
Anja Belz: Opening
Mohamed Elhoseiny, Scott Cohen, Walter Chang, Brian Price and Ahmed Elgammal: Automatic Annotation of Structured Facts in Images
Manuela Hürlimann and Johan Bos: Combining Lexical and Spatial Knowledge to Predict Spatial Relations between Objects in Images
10:30 - 11:00
11:00 - 12:30
Invited talk: Yejin Choi: Language and Vision: Learning Knowledge about the World
Micah Hodosh and Julia Hockenmaier: Focused Evaluation for Image Description
12:30 - 14:00
14:00 - 15:30
Mert Kilickaya, Nazli Ikizler-Cinbis, Erkut Erdem and Aykut Erdem: Leveraging Captions in the Wild to Improve Object Detection
Quick-fire presentations for posters (5 minutes each)
15:30 - 16:00
16:00 - 17:30
Nouf Alharbi and Yoshihiko Gotoh: Natural Language Descriptions of Human Activities Scenes: Corpus Generation and Analysis (LP)
Yanchao Yu, Arash Eshghi and Oliver Lemon: Interactively learning visually grounded word meanings from a human tutor
Emiel van Miltenburg, Roser Morante and Desmond Elliott: Pragmatic factors in image description: the case of negations
Sandro Pezzelle, Ravi Shekhar and Raffaella Bernardi: A Bagpipe with a Bag and a Pipe: Exploring Conceptual Combination in Vision
Anja Belz, Adrian Muscat and Brandon Birmingham: Exploring Different Preposition Sets, Models and Feature Sets in Automatic Generation of Spatial Image Descriptions
Desmond Elliott, Stella Frank, Khalil Sima'an and Lucia Specia: Multi30K: Multilingual English-German Image Descriptions
Ionut Sorodoc, Angeliki Lazaridou, Gemma Boleda, Aurélie Herbelot, Sandro Pezzelle and Raffaella Bernardi: "Look, some green circles!": Learning to quantify from images
Alexander Mehler, Tolga Uslu and Wahed Hemati: text2voronoi: An Image-driven Approach to Differential Diagnosis
Olivia Winn, Madhavan Kavanur Kidambi and Smaranda Muresan: Detecting Visually Relevant Sentences for Fine-Grained Classification
Early Registration: Through 11:59PM EDT July 15, 2016
Late Registration: July 16, 2016 to 11:59PM EDT July 31, 2016
Onsite Registration: Begins August 7, 2016
ACL 2016 registration fees:
1-day (with main conference):
1-day (without main conference):