Automatic Prediction of Building Age from Photographs
We present a first method for the automated age estimation of buildings from unconstrained photographs. To this end, we propose a two-stage approach that first learns characteristic visual patterns for different building epochs at patch level and then globally aggregates the patch-level age estimates over the building. We compile evaluation datasets from different sources and perform a detailed evaluation of our approach, its sensitivity to parameters, and the capabilities of the employed deep networks to learn characteristic visual age-related patterns. Results show that our approach estimates building age at a surprisingly high level, even outperforming human evaluators, and thereby sets a new performance baseline. This work represents a first step towards the automated assessment of building parameters for automated price prediction.
Comment: Preprint of paper to appear in ACM International Conference on Multimedia Retrieval (ICMR) 2018
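The two-stage idea, classify patches and then aggregate over the building, can be sketched as follows. This is a minimal illustration assuming per-patch softmax outputs and mean aggregation; the paper's actual networks and aggregation scheme may differ.

```python
import numpy as np

def aggregate_patch_estimates(patch_probs: np.ndarray) -> int:
    """Combine per-patch epoch probabilities into a building-level estimate.

    patch_probs: (n_patches, n_epochs) array of softmax outputs from a
    patch-level classifier (stage one, assumed here).
    Averaging probabilities is one plausible global aggregation (stage two).
    """
    building_probs = patch_probs.mean(axis=0)  # aggregate over patches
    return int(np.argmax(building_probs))      # predicted epoch index

# Example: three facade patches voting over four hypothetical epochs
patches = np.array([
    [0.1, 0.6, 0.2, 0.1],
    [0.2, 0.5, 0.2, 0.1],
    [0.3, 0.2, 0.4, 0.1],
])
print(aggregate_patch_estimates(patches))  # → 1
```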
Considering documents in lifelog information retrieval
Lifelogging is a research topic that is receiving increasing attention, and although lifelog research has progressed in recent years, the concept of what represents a document in lifelog retrieval has not yet been sufficiently explored. Hence, the generation of multimodal lifelog documents is a fundamental concept that must be addressed. In this paper, I introduce my general perspective on generating documents in lifelogging and reflect on lessons learned from collecting multimodal lifelog data from a number of participants in a study on lifelog data organization. In addition, the main motivation behind document generation is proposed, and the challenges faced while collecting data and generating documents are discussed in detail. Finally, a process for organizing the documents in lifelog data retrieval is proposed, which I intend to follow in my PhD research.
Annotating, Understanding, and Predicting Long-term Video Memorability
Memorability can be regarded as a useful metric of video importance to help make a choice between competing videos. Research on computational understanding of video memorability is, however, in its early stages. There is no available dataset for modelling purposes, and the few previous attempts provided protocols to collect video memorability data that would be difficult to generalize. Furthermore, the computational features needed to build a robust memorability predictor remain largely undiscovered. In this article, we propose a new protocol to collect long-term video memorability annotations. We measure the memory performance of 104 participants from weeks to years after memorization to build a dataset of 660 videos for video memorability prediction. This dataset is made available for the research community. We then analyze the collected data in order to better understand video memorability, in particular the effects of response time, duration of memory retention, and repetition of visualization on video memorability. We finally investigate the use of various types of audio and visual features and build a computational model for video memorability prediction. We conclude that high-level visual semantics help better predict the memorability of videos.
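One simple way to turn recognition annotations of the kind described above into a per-video score is the fraction of participants who correctly recognized the video. This is a hypothetical simplification; the paper's protocol additionally accounts for response time, retention interval, and repetition.

```python
def memorability_score(responses):
    """Raw long-term memorability of one video.

    responses: list of (recognized: bool, retention_days: int) tuples,
    one per participant. The retention interval is carried along but
    ignored in this simplified score.
    """
    if not responses:
        return 0.0
    hits = sum(1 for recognized, _ in responses if recognized)
    return hits / len(responses)

# Four hypothetical participants tested at different retention intervals
annotations = [(True, 7), (True, 30), (False, 365), (True, 90)]
print(memorability_score(annotations))  # → 0.75
```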
Exquisitor at the Lifelog Search Challenge 2020
We present an enhanced version of Exquisitor, our interactive and scalable media exploration system. At its core, Exquisitor is an interactive learning system that uses relevance feedback on media items to build a model of the user's information need. Relying on efficient media representation and indexing, it facilitates real-time user interaction. The new features for the Lifelog Search Challenge 2020 include support for timeline browsing, search functionality for finding positive examples, and significant interface improvements. Participation in the Lifelog Search Challenge allows us to compare our paradigm, which relies predominantly on interactive learning, with more traditional search-based multimedia retrieval systems.
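The relevance-feedback loop at the heart of such a system can be illustrated with a classic Rocchio-style sketch: build a query vector from user-marked positives and negatives, then score items by cosine similarity. Exquisitor itself trains an interactive classifier over indexed media features; the centroid approach below is only a stand-in to show the feedback mechanism.

```python
import numpy as np

def relevance_feedback_scores(items, positives, negatives, alpha=1.0, beta=0.5):
    """Score items against a Rocchio-style query vector.

    items, positives, negatives: 2-D arrays of feature vectors.
    alpha/beta weight the positive and negative centroids (hypothetical values).
    Returns cosine similarity of each item to the query vector.
    """
    query = alpha * positives.mean(axis=0) - beta * negatives.mean(axis=0)
    query /= np.linalg.norm(query) + 1e-12
    norms = np.linalg.norm(items, axis=1) + 1e-12
    return (items @ query) / norms

# Toy 2-D features: item 0 marked relevant, item 2 marked non-relevant
items = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
scores = relevance_feedback_scores(items, items[:1], items[2:3])
print(int(np.argmax(scores)))  # → 0 (best matches the feedback)
```

In an interactive session this scoring step would run after each round of user judgments, re-ranking the collection in real time.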
Ranking News-Quality Multimedia
News editors need to find the photos that best illustrate a news piece and fulfill news-media quality standards, while also being pressed to find the most recent photos of live events. It has recently become common to use social-media content in the context of news media for its unique value in terms of immediacy and quality. Consequently, the number of images to be considered and filtered through is now too large to be handled by a person. To aid the news editor in this process, we propose a framework designed to deliver high-quality, news-press-type photos to the user. The framework is composed of two parts: a ranking algorithm tuned to rank professional media highly, and a visual SPAM detection module designed to filter out low-quality media. The core ranking algorithm leverages aesthetic, social, and deep-learning semantic features. Evaluation showed that the proposed framework is effective at finding high-quality photos (true-positive rate), achieving a retrieval MAP of 64.5% and a classification precision of 70%.
Comment: To appear in ICMR'1
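The two-part pipeline described above, filter likely spam and then rank the remainder by combined feature scores, can be sketched as follows. Field names, weights, and the spam threshold are hypothetical; the paper's ranker is learned from data, not hand-weighted.

```python
def rank_photos(photos, w_aesthetic=0.4, w_social=0.2,
                w_semantic=0.4, spam_threshold=0.5):
    """Filter out likely-spam images, then rank the rest by a weighted
    combination of aesthetic, social, and semantic scores (all assumed
    to be precomputed per photo and normalized to [0, 1])."""
    kept = [p for p in photos if p["spam_prob"] < spam_threshold]

    def score(p):
        return (w_aesthetic * p["aesthetic"]
                + w_social * p["social"]
                + w_semantic * p["semantic"])

    return sorted(kept, key=score, reverse=True)

# Toy candidate pool: 'c' scores well but is flagged as probable spam
photos = [
    {"id": "a", "aesthetic": 0.9, "social": 0.5, "semantic": 0.8, "spam_prob": 0.1},
    {"id": "b", "aesthetic": 0.4, "social": 0.9, "semantic": 0.3, "spam_prob": 0.2},
    {"id": "c", "aesthetic": 0.95, "social": 0.9, "semantic": 0.9, "spam_prob": 0.9},
]
print([p["id"] for p in rank_photos(photos)])  # → ['a', 'b']
```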
VRLE: Lifelog Interaction Prototype in Virtual Reality: Lifelog Search Challenge at ACM ICMR 2020
The Lifelog Search Challenge (LSC) invites researchers to share
their prototypes for interactive lifelog retrieval and encourages
competition to develop and evaluate effective methodologies to
achieve this. With this paper we present a novel approach to visual
lifelog exploration based on our research to date utilising virtual
reality as a medium for interactive information retrieval. The VRLE
prototype presented is an iteration on a previous system which
won the first LSC competition at ACM ICMR 2018.