Smartphone picture organization: a hierarchical approach
We live in a society where the large majority of the population has a camera-equipped smartphone. In addition, hard drives and cloud storage are getting cheaper and cheaper, leading to a tremendous growth in stored personal photos. Unlike photo collections captured by a digital camera, which are typically pre-processed by the user who organizes them into event-related folders, smartphone pictures are automatically stored in the cloud. As a consequence, photo collections captured by a smartphone are highly unstructured and, because smartphones are ubiquitous, they present a larger variability than pictures captured by a digital camera. To address the need to organize large smartphone photo collections automatically, we propose a new methodology for hierarchical photo organization into topics and topic-related categories. Our approach estimates latent topics in the pictures by applying probabilistic Latent Semantic Analysis, and automatically assigns a name to each topic by relying on a lexical database. Topic-related categories are then estimated by using a set of topic-specific Convolutional Neural Networks. To validate our approach, we assemble and make public a large dataset of more than 8,000 smartphone pictures from 40 persons. Experimental results demonstrate greater user satisfaction than state-of-the-art solutions in terms of organization.
Peer Reviewed. Preprint
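As a rough illustration of the topic-estimation step mentioned in the abstract above, the following is a minimal sketch of probabilistic Latent Semantic Analysis fitted with EM. It is not the authors' implementation: the function name `plsa`, the toy count matrix, and the idea of treating each picture as a "document" of visual-word counts are assumptions made here for illustration only.

```python
import numpy as np


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit pLSA with EM on a (n_docs, n_words) count matrix.

    Returns P(z|d) with shape (n_docs, n_topics) and
    P(w|z) with shape (n_topics, n_words).
    """
    counts = np.asarray(counts, dtype=float)
    n_docs, n_words = counts.shape
    rng = np.random.default_rng(seed)
    eps = 1e-12  # guards against division by zero

    # Random, row-normalised initialisation of both distributions.
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]  # (docs, topics, words)
        post = joint / (joint.sum(axis=1, keepdims=True) + eps)

        # M-step: re-estimate both distributions from expected counts.
        weighted = counts[:, None, :] * post
        p_w_z = weighted.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
        p_z_d = weighted.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + eps

    return p_z_d, p_w_z


if __name__ == "__main__":
    # Hypothetical toy data: 4 "pictures", 4 "visual words", two clear groups.
    toy = [[5, 5, 0, 0], [5, 5, 0, 0], [0, 0, 5, 5], [0, 0, 5, 5]]
    doc_topics, topic_words = plsa(toy, n_topics=2, n_iter=100)
    print(doc_topics.round(3))
```

Each row of the first returned matrix gives a picture's mixture over latent topics; the most probable topic could then serve as the top level of a hierarchy, with the naming and per-topic classifiers described in the abstract handled separately.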
Cultural Event Recognition with Visual ConvNets and Temporal Models
This paper presents our contribution to the ChaLearn Challenge 2015 on
Cultural Event Classification. The challenge in this task is to automatically
classify images from 50 different cultural events. Our solution is based on the
combination of visual features extracted from convolutional neural networks
with temporal information using a hierarchical classifier scheme. We extract
visual features from the last three fully connected layers of both CaffeNet
(pretrained with ImageNet) and our fine-tuned version for the ChaLearn
challenge. We propose a late fusion strategy that trains a separate low-level
SVM on each of the extracted neural codes. The class predictions of the
low-level SVMs form the input to a higher level SVM, which gives the final
event scores. We achieve our best result by adding a temporal refinement step
into our classification scheme, which is applied directly to the output of each
low-level SVM. Our approach penalizes high classification scores based on
visual features when their timestamp does not closely match an event-specific
temporal distribution learned from the training and validation data. Our system
achieved the second best result in the ChaLearn Challenge 2015 on Cultural
Event Classification with a mean average precision of 0.767 on the test set.
Comment: Initial version of the paper accepted at the CVPR Workshop ChaLearn
Looking at People 201
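The temporal refinement idea in the abstract above (penalizing visual scores whose timestamp is unlikely for an event) can be sketched as follows. This is only an assumption-laden illustration, not the paper's method: the abstract does not specify the form of the temporal distribution, so a per-event Gaussian over a numeric timestamp is assumed here, and the function name `temporal_refinement` and all parameter names are hypothetical.

```python
import numpy as np


def temporal_refinement(visual_scores, timestamps, event_means, event_stds):
    """Down-weight visual scores whose timestamp is unlikely for the event.

    visual_scores: (n_images, n_events) non-negative classifier scores
                   (e.g. probabilities); negative scores are not handled.
    timestamps:    (n_images,) capture time as a number, e.g. day of year.
    event_means, event_stds: (n_events,) parameters of an ASSUMED Gaussian
                   temporal model per event (the paper's model may differ).
    """
    scores = np.asarray(visual_scores, dtype=float)
    t = np.asarray(timestamps, dtype=float)[:, None]     # (n_images, 1)
    mu = np.asarray(event_means, dtype=float)[None, :]   # (1, n_events)
    sigma = np.asarray(event_stds, dtype=float)[None, :]

    # Unnormalised Gaussian weight in (0, 1]; equals 1 at the event's mean.
    # Wrap-around at the end of the year is ignored in this sketch.
    weight = np.exp(-0.5 * ((t - mu) / sigma) ** 2)
    return scores * weight


if __name__ == "__main__":
    # One image taken on day 100; event 0 peaks at day 100, event 1 at day 200.
    refined = temporal_refinement([[0.9, 0.8]], [100], [100, 200], [10, 10])
    print(refined)
```

In the toy call, the score for the event whose temporal peak matches the capture date is left unchanged, while the score for the mismatched event is driven towards zero.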
Family memories in the home: contrasting physical and digital mementos
We carried out fieldwork to characterise and compare physical and digital mementos in the home. Physical mementos are highly valued, heterogeneous and support different types of recollection. Contrary to expectations, we found physical mementos are not purely representational, and can involve appropriating common objects and more idiosyncratic forms. In contrast, digital mementos were initially perceived as less valuable, although participants later reconsidered this. Digital mementos were somewhat limited in function and expression, largely involving representational photos and videos, and infrequently accessed. We explain these digital limitations and conclude with design guidelines for digital mementos, including better techniques for accessing and integrating these into everyday life, allowing them to acquire the symbolic associations and lasting value that characterise their physical counterparts
Access to recorded interviews: A research agenda
Recorded interviews form a rich basis for scholarly inquiry. Examples include oral histories, community memory projects, and interviews conducted for broadcast media. Emerging technologies offer the potential to radically transform the way in which recorded interviews are made accessible, but this vision will demand substantial investments from a broad range of research communities. This article reviews the present state of practice for making recorded interviews available and the state-of-the-art for key component technologies. A large number of important research issues are identified, and from that set of issues, a coherent research agenda is proposed
Beyond Frontal Faces: Improving Person Recognition Using Multiple Cues
We explore the task of recognizing people's identities in photo albums in an
unconstrained setting. To facilitate this, we introduce the new People In Photo
Albums (PIPA) dataset, consisting of over 60000 instances of 2000 individuals
collected from public Flickr photo albums. With only about half of the person
images containing a frontal face, the recognition task is very challenging due
to the large variations in pose, clothing, camera viewpoint, image resolution
and illumination. We propose the Pose Invariant PErson Recognition (PIPER)
method, which accumulates the cues of poselet-level person recognizers trained
by deep convolutional networks to discount pose variations, combined
with a face recognizer and a global recognizer. Experiments on three different
settings confirm that in our unconstrained setup PIPER significantly improves
on the performance of DeepFace, which is one of the best face recognizers as
measured on the LFW dataset
Person Recognition in Personal Photo Collections
Recognising persons in everyday photos presents major challenges (occluded
faces, different clothing, locations, etc.) for machine vision. We propose a
convnet-based person recognition system, for which we provide an in-depth
analysis of the informativeness of different body cues, the impact of training data,
and the common failure modes of the system. In addition, we discuss the
limitations of existing benchmarks and propose more challenging ones. Our
method is simple and is built on open source and open data, yet it improves the
state-of-the-art results on a large dataset of social media photos (PIPA).
Comment: Accepted to ICCV 2015, revise
CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines
Based on the information provided by European projects and national initiatives related to multimedia search, as well as domain experts who participated in the CHORUS Think-Tanks and workshops, this document reports on the state of the art related to multimedia content search from a technical and a socio-economic perspective.
The technical perspective includes an up-to-date view on content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines.
From a socio-economic perspective we take stock of the impact and legal consequences of these technical advances and point out future directions of research.
Towards memory supporting personal information management tools
In this article we discuss re-retrieving personal information objects and relate the task to recovering from lapses in memory. We propose that fundamentally it is lapses in memory that impede users from successfully re-finding the information they need. Our hypothesis is that by learning more about memory lapses in non-computing contexts and how people cope with and recover from these lapses, we can better inform the design of PIM tools and improve the user's ability to re-access and re-use objects. We describe a diary study that investigates the everyday memory problems of 25 people from a wide range of backgrounds. Based on the findings, we present a series of principles that we hypothesize will improve the design of personal information management tools. This hypothesis is validated by an evaluation of a tool for managing personal photographs, which was designed with respect to our findings. The evaluation suggests that users' performance when re-finding objects can be improved by building personal information management tools to support characteristics of human memory.