2,029 research outputs found

Knowledge-rich Image Gist Understanding Beyond Literal Meaning

Author: Dietz Laura
Effelsberg Wolfgang
Hulpus Ioana
Ponzetto Simone Paolo
Weiland Lydia
Publication venue: 'Elsevier BV'
Publication date: 01/01/2018

We investigate the problem of understanding the message (gist) conveyed by images and their captions as found, for instance, on websites or in news articles. To this end, we propose a methodology to capture the meaning of image-caption pairs on the basis of large amounts of machine-readable knowledge, which has previously been shown to be highly effective for text understanding. Where most approaches to image understanding focus on the denotation of objects, i.e., their literal meaning, our work addresses the identification of connotations, i.e., the iconic meanings of objects, to understand the message of images. We view image understanding as the task of representing an image-caption pair on the basis of a wide-coverage vocabulary of concepts such as the one provided by Wikipedia, and cast gist detection as a concept-ranking problem with image-caption pairs as queries. To enable a thorough investigation of the problem of gist understanding, we produce a gold standard of over 300 image-caption pairs and over 8,000 gist annotations covering a wide variety of topics at different levels of abstraction. We use this dataset to experimentally benchmark the contribution of signals from heterogeneous sources, namely image and text. The best result, a Mean Average Precision (MAP) of 0.69, indicates that by combining both dimensions we are able to better understand the meaning of our image-caption pairs than when using language or vision information alone. We test the robustness of our gist detection approach when receiving automatically generated input, i.e., automatically generated image tags or generated captions, and demonstrate the feasibility of an end-to-end automated process.
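
The abstract reports its headline result as Mean Average Precision over a concept-ranking task, with each image-caption pair acting as a query. As a rough illustration of how that metric is usually computed, here is a minimal sketch. The function names, pair identifiers, and concept lists are hypothetical toy values, not the authors' code or data.

```python
from typing import Dict, List, Set


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked concept list against its gold gists."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, concept in enumerate(ranked, start=1):
        if concept in relevant:
            hits += 1
            # Precision at the rank where a relevant concept is found.
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(
    rankings: Dict[str, List[str]], gold: Dict[str, Set[str]]
) -> float:
    """Mean of the per-query average precision values."""
    scores = [
        average_precision(ranked, gold.get(query, set()))
        for query, ranked in rankings.items()
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy data: two image-caption pairs used as queries.
rankings = {
    "pair_1": ["Climate change", "Polar bear", "Ice"],
    "pair_2": ["Bird", "Freedom"],
}
gold = {
    "pair_1": {"Climate change", "Ice"},
    "pair_2": {"Freedom"},
}

map_score = mean_average_precision(rankings, gold)
print(round(map_score, 4))
```

In this sketch a query with no gold gists contributes an average precision of 0; evaluation setups differ on that convention, so treat it as an assumption.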

arXiv.org e-Print Archive

MAnnheim DOCument Server

Towards Understanding User Preferences from User Tagging Behavior for Personalization

Author: Chen Tshuan
Nwana Amandianeze O.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 20/11/2015

Personalizing image tags is a relatively new and growing area of research, and in order to advance this research community, we must review and challenge the de facto standard of defining tag importance. We believe that for greater progress to be made, we must go beyond tags that merely describe objects that are visually represented in the image, towards more user-centric and subjective notions such as emotion, sentiment, and preferences. We focus on the notion of user preferences and show that the order in which users list tags on images is correlated with the order of preference over the tags that they provided for the image. While this observation is not completely surprising, to our knowledge we are the first to explore this aspect of user tagging behavior systematically and to report empirical results that support it. We argue that this observation can be exploited to help advance the image tagging (and related) communities. Our contributions include: (1) conducting a user study demonstrating this observation, and (2) collecting a dataset with user tag preferences explicitly collected. Comment: 6 pages
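
The abstract's central claim is that the order in which a user lists tags correlates with that user's preference order over the same tags. The abstract does not say which statistic the authors used, so the sketch below swaps in Spearman's rank correlation as one plausible way to quantify that agreement. The function name and the example tags are hypothetical.

```python
from typing import List


def spearman_rho(listing_order: List[str], preference_order: List[str]) -> float:
    """Spearman rank correlation between two orderings of the same distinct tags."""
    n = len(listing_order)
    if n < 2:
        raise ValueError("need at least two tags")
    if len(set(listing_order)) != n or set(listing_order) != set(preference_order):
        raise ValueError("both orderings must contain the same distinct tags")
    # Rank of each tag in the stated preference order (0 = most preferred).
    pref_rank = {tag: rank for rank, tag in enumerate(preference_order)}
    # Sum of squared rank differences between the two orderings.
    d_squared = sum(
        (rank - pref_rank[tag]) ** 2 for rank, tag in enumerate(listing_order)
    )
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical example: tags in the order a user typed them,
# and the same tags in the order the user says they prefer them.
listed = ["sunset", "beach", "family"]
preferred = ["sunset", "family", "beach"]
print(spearman_rho(listed, preferred))
```

A value near 1 means the listing order closely follows the preference order, and a value near -1 means it is reversed. This closed form assumes no ties, which holds here because each tag appears exactly once in each ordering.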

arXiv.org e-Print Archive

Crossref

Developing a Formal Model for Mind Maps

Author: Papatheodorou Christos
Siochos Vasilis
Publication date: 01/03/2011

A mind map is a graphical technique used to represent words, concepts, tasks, or other items connected to or arranged around a central topic or idea. Mind maps are widely used, and many software programs exist to create or edit them, yet there is no format for representing the model, nor a standard format. This paper presents an effort to propose a formal mind map model that aims to describe the structure, content, semantics, and social connections. The structure describes the basic mind map graph, which consists of a node set, an edge set, a cloud set, and a graphical connections set. The content includes the set of texts and objects linked to the nodes. The social connections are the mind maps of other users, which form the neighborhood of the mind map owner in a social networking system. Finally, the mind map semantics is any true logical connection between the textual parts of a mind map and a concept. Each of these elements is formally described to build the proposed mind map model. Establishing the model will support the application of algorithms and methods for extracting information from mind maps.
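
The four components named in the abstract (structure, content, semantics, social connections) map naturally onto a small data structure. The sketch below is one possible reading of that description, not the paper's formal definition. The class name, field names, and the `add_node` helper are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional, Set, Tuple


@dataclass
class MindMap:
    """Illustrative container for the four parts of the model (names assumed)."""

    owner: str
    # Structure: node set, edge set, cloud set, graphical connections set.
    nodes: Set[str] = field(default_factory=set)
    edges: Set[Tuple[str, str]] = field(default_factory=set)
    clouds: List[Set[str]] = field(default_factory=list)
    graphical_connections: Set[Tuple[str, str]] = field(default_factory=set)
    # Content: texts and objects linked to each node.
    content: Dict[str, List[str]] = field(default_factory=dict)
    # Semantics: textual part -> concept it is connected to.
    semantics: Dict[str, str] = field(default_factory=dict)
    # Social connections: mind maps of other users (the owner's neighborhood).
    neighborhood: List["MindMap"] = field(default_factory=list)

    def add_node(
        self, node_id: str, parent: Optional[str] = None, texts: Iterable[str] = ()
    ) -> None:
        """Add a node, optionally attached to an existing parent node."""
        if parent is not None and parent not in self.nodes:
            raise KeyError(f"unknown parent node: {parent}")
        self.nodes.add(node_id)
        if parent is not None:
            self.edges.add((parent, node_id))
        self.content.setdefault(node_id, []).extend(texts)


# Hypothetical usage: a two-node map with one cloud and one neighbor.
alice_map = MindMap(owner="alice")
alice_map.add_node("root", texts=["Project"])
alice_map.add_node("tasks", parent="root", texts=["Tasks"])
alice_map.clouds.append({"tasks"})
alice_map.semantics["Tasks"] = "Task (project management)"
alice_map.neighborhood.append(MindMap(owner="bob"))
```

Keeping the tree edges separate from the graphical connections mirrors the abstract's distinction between the edge set and the graphical connections set. How the paper actually formalizes clouds and semantics is not stated in the abstract, so those fields are guesses at a reasonable shape.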

E-LIS

Learning Contextualized Semantics from Co-occurring Terms via a Siamese Architecture

Author: Chen Ke
Sandouk Ubai
Publication date: 17/06/2015

One of the biggest challenges in multimedia information retrieval and understanding is to bridge the semantic gap by properly modeling concept semantics in context. The presence of out-of-vocabulary (OOV) concepts exacerbates this difficulty. To address the semantic gap, we formulate the problem of learning contextualized semantics from descriptive terms and propose a novel Siamese architecture to model them. By means of pattern aggregation and probabilistic topic models, our Siamese architecture captures contextualized semantics from the co-occurring descriptive terms via unsupervised learning, which leads to a concept embedding space of the terms in context. Furthermore, co-occurring OOV concepts can be easily represented in the learnt concept embedding space. The main properties of the concept embedding space are demonstrated via visualization. Using various settings in semantic priming, we have carried out a thorough evaluation by comparing our approach to a number of state-of-the-art methods on six annotation corpora in different domains, i.e., MagTag5K, CAL500 and Million Song Dataset in the music domain as well as Corel5K, LabelMe and SUNDatabase in the image domain. Experimental results on semantic priming suggest that our approach outperforms those state-of-the-art methods considerably in various aspects.

arXiv.org e-Print Archive

Crossref

The University of Manchester - Institutional Repository