Smartphone picture organization: a hierarchical approach
We live in a society where the large majority of the population has a camera-equipped smartphone. In addition, hard drives and cloud storage are getting cheaper and cheaper, leading to a tremendous growth in stored personal photos. Unlike photo collections captured by a digital camera, which are typically pre-processed by the user, who organizes them into event-related folders, smartphone pictures are automatically stored in the cloud. As a consequence, photo collections captured by a smartphone are highly unstructured and, because smartphones are ubiquitous, they present a larger variability than pictures captured by a digital camera. To address the need to organize large smartphone photo collections automatically, we propose a new methodology for hierarchical photo organization into topics and topic-related categories. Our approach successfully estimates latent topics in the pictures by applying probabilistic Latent Semantic Analysis, and automatically assigns a name to each topic by relying on a lexical database. Topic-related categories are then estimated by using a set of topic-specific Convolutional Neural Networks. To validate our approach, we assemble and make public a large dataset of more than 8,000 smartphone pictures from 40 persons. Experimental results demonstrate greater user satisfaction with the resulting organization than with state-of-the-art solutions. Peer Reviewed. Preprint
Discovering Organizational Correlations from Twitter
Organizational relationships are usually very complex in real life. It is
difficult or impossible to directly measure such correlations among different
organizations, because important information is usually not publicly available
(e.g., the correlations of terrorist organizations). Nowadays, an increasing
amount of organizational information can be posted online by individuals and
spread instantly through Twitter. Such information can be crucial for detecting
organizational correlations. In this paper, we study the problem of discovering
correlations among organizations from Twitter. Mining organizational
correlations is a very challenging task due to the following reasons: a) Data
in Twitter occurs as large volumes of mixed information. The most relevant
information about organizations is often buried. Thus, the organizational
correlations can be scattered across multiple places and represented in different
forms; b) Making use of information from Twitter collectively and judiciously
is difficult because of the multiple representations of organizational
correlations that are extracted. In order to address these issues, we propose
multi-CG (multiple Correlation Graphs based model), an unsupervised framework
that can learn a consensus of correlations among organizations based on
multiple representations extracted from Twitter, which is more accurate and
robust than correlations based on a single representation. Empirical study
shows that the consensus graph extracted from Twitter can capture the
organizational correlations effectively. Comment: 11 pages, 4 figures
Clustering Time Series from Mixture Polynomial Models with Discretised Data
Clustering time series is an active research area with applications in many fields. One common feature of time series is the likely presence of outliers. These uncharacteristic data can significantly affect the quality of the clusters formed. This paper evaluates a method of overcoming the detrimental effects of outliers. We describe some of the alternative approaches to clustering time series, then specify a particular class of model for experimentation with k-means clustering and a correlation-based distance metric. For data derived from this class of model, we demonstrate that discretising the data into a binary series of above and below the median improves the clustering when the data has outliers. More specifically, we show that, firstly, discretisation does not significantly affect the accuracy of the clusters when there are no outliers and, secondly, it significantly increases the accuracy in the presence of outliers, even when the probability of an outlier is very low.
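The median discretisation described in this abstract can be illustrated with a minimal sketch. The function names, the toy series, and the use of 1 minus the Pearson correlation as the distance are assumptions made here for illustration; the abstract only states that a correlation-based distance is used, and this is not the authors' implementation.

```python
from math import sqrt
from statistics import mean, median


def discretise(series):
    """Map each value to 1 if it lies above the series median, else 0."""
    m = median(series)
    return [1 if x > m else 0 for x in series]


def correlation_distance(a, b):
    """Return 1 - Pearson correlation (assumed form of the distance)."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 1.0  # assumption: treat a constant series as uncorrelated
    return 1.0 - cov / sqrt(va * vb)


# Two toy series with the same trend; the second has one extreme outlier.
clean = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
noisy = [1.0, 2.0, 3.0, 4.0, 5.0, 600.0]

raw_distance = correlation_distance(clean, noisy)
binary_distance = correlation_distance(discretise(clean), discretise(noisy))
```

On this toy pair the outlier inflates the distance between the raw series, while the binary series are identical, so their distance is zero. Any clustering step, such as k-means, would then operate on the discretised series.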
Cross-task weakly supervised learning from instructional videos
In this paper we investigate learning visual models for the steps of ordinary
tasks using weak supervision via instructional narrations and an ordered list
of steps instead of strong supervision via temporal annotations. At the heart
of our approach is the observation that weakly supervised learning may be
easier if a model shares components while learning different steps: `pour egg'
should be trained jointly with other tasks involving `pour' and `egg'. We
formalize this in a component model for recognizing steps and a weakly
supervised learning framework that can learn this model under temporal
constraints from narration and the list of steps. Existing datasets do not permit
a systematic study of sharing, so we also gather a new dataset, CrossTask,
aimed at assessing cross-task sharing. Our experiments demonstrate that sharing
across tasks improves performance, especially when done at the component level
and that our component model can parse previously unseen tasks by virtue of its
compositionality. Comment: 18 pages, 17 figures, to be published in proceedings of the CVPR, 201
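The idea of sharing components across steps can be sketched as follows. This is an illustration under stated assumptions, not the paper's model: it assumes each step classifier is the sum of linear weight vectors for its component words, and the weights, feature dimension, and function names are all hypothetical.

```python
# Hypothetical component weights, one small vector per word.
# In a real system these would be learned under weak supervision.
component_weights = {
    "pour": [1.0, 0.0, 0.5],
    "egg": [0.0, 1.0, 0.0],
    "milk": [0.0, 0.0, 1.0],
}


def step_weights(step):
    """Compose a step classifier by summing its components' weights."""
    parts = [component_weights[word] for word in step.split()]
    return [sum(values) for values in zip(*parts)]


def score(step, features):
    """Linear score of a frame feature vector under the composed step."""
    return sum(w * f for w, f in zip(step_weights(step), features))


# "pour milk" never needs its own weights: it is assembled from
# components that other steps could have trained.
frame = [1.0, 0.0, 1.0]  # hypothetical frame features
milk_score = score("pour milk", frame)
egg_score = score("pour egg", frame)
```

Because a step is assembled from its parts, a step that was never seen as a whole can still be scored, which is the compositional behaviour the abstract describes.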
Collaborative Summarization of Topic-Related Videos
Large collections of videos are grouped into clusters by a topic keyword,
such as Eiffel Tower or Surfing, with many important visual concepts repeating
across them. The videos in such a topically close set influence each
other, which could be used to summarize one of them by exploiting information
from others in the set. We build on this intuition to develop a novel approach
to extract a summary that simultaneously captures both important
particularities arising in the given video and generalities identified
from the set of videos. The topic-related videos provide visual context to
identify the important parts of the video being summarized. We achieve this by
developing a collaborative sparse optimization method which can be efficiently
solved by a half-quadratic minimization algorithm. Our work builds upon the
idea of collaborative techniques from information retrieval and natural
language processing, which typically use the attributes of other similar
objects to predict the attribute of a given object. Experiments on two
challenging and diverse datasets demonstrate the efficacy of our approach
over state-of-the-art methods. Comment: CVPR 201