Learning from Multiple Sources for Video Summarisation
Many visual surveillance tasks, e.g. video summarisation, are conventionally
accomplished through analysing imagery-based features. Relying solely on visual
cues for public surveillance video understanding is unreliable, since visual
observations obtained from public space CCTV video data are often not
sufficiently trustworthy and events of interest can be subtle. On the other
hand, non-visual data sources such as weather reports and traffic sensory
signals are readily accessible but are not explored jointly to complement
visual data for video content analysis and summarisation. In this paper, we
present a novel unsupervised framework to learn jointly from both visual and
independently-drawn non-visual data sources for discovering meaningful latent
structure of surveillance video data. In particular, we investigate ways to
cope with discrepant dimensions and representations whilst associating these
heterogeneous data sources, and derive an effective mechanism to tolerate
missing and incomplete data from different sources. We show that the proposed
multi-source learning framework not only achieves better video content
clustering than state-of-the-art methods, but is also capable of accurately
inferring missing non-visual semantics from previously unseen videos. In
addition, a comprehensive user study is conducted to validate the quality of
video summarisation generated using the proposed multi-source model.
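The tolerance for missing non-visual data described above can be illustrated with a toy sketch: impute missing non-visual readings per dimension, then fuse the two sources into one joint representation per clip. The data, function names, and the mean-imputation scheme below are illustrative assumptions, not the paper's actual mechanism.

```python
# Hypothetical sketch: fusing visual and non-visual features for joint
# clustering, tolerating missing non-visual entries via per-dimension
# mean imputation. Names and data are invented for illustration.

def impute_missing(rows):
    """Replace None entries with the per-column mean of observed values."""
    ncols = len(rows[0])
    means = []
    for j in range(ncols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[r[j] if r[j] is not None else means[j] for j in range(ncols)]
            for r in rows]

def fuse(visual, nonvisual):
    """Concatenate the two sources into one joint representation per clip."""
    nonvisual = impute_missing(nonvisual)
    return [v + n for v, n in zip(visual, nonvisual)]

visual = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
nonvisual = [[20.0], [None], [5.0]]   # e.g. temperature; one reading missing
joint = fuse(visual, nonvisual)
print(joint[1])  # missing reading filled with the mean of 20.0 and 5.0
```

The fused vectors could then feed any standard clustering routine; the real framework learns the association between sources rather than simply concatenating them.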
COMPENDIUM: a text summarisation tool for generating summaries of multiple purposes, domains, and genres
In this paper, we present a Text Summarisation tool, compendium, capable of generating the most common types of summaries. Regarding the input, single- and multi-document summaries can be produced; as the output, the summaries can be extractive or abstractive-oriented; and finally, concerning their purpose, the summaries can be generic, query-focused, or sentiment-based. The proposed architecture for compendium is divided into several stages, making a distinction between core and additional stages. The former constitute the backbone of the tool and are common to the generation of any type of summary, whereas the latter are used for enhancing the capabilities of the tool. The main contributions of compendium with respect to state-of-the-art summarisation systems are that (i) it specifically deals with the problem of redundancy by means of textual entailment; (ii) it combines statistical and cognitive-based techniques for determining relevant content; and (iii) it proposes an abstractive-oriented approach for addressing the challenge of abstractive summarisation. The evaluation, performed in different domains and textual genres comprising traditional texts as well as texts extracted from the Web 2.0, shows that compendium is very competitive and appropriate for use as a tool for generating summaries. This research has been supported by the project “Desarrollo de Técnicas Inteligentes e Interactivas de Minería de Textos” (PROMETEO/2009/119) and the project reference ACOMP/2011/001 from the Valencian Government, as well as by the Spanish Government (grant no. TIN2009-13391-C04-01).
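The extract-then-filter pattern described above can be sketched in a few lines: rank sentences by a relevance score, then skip candidates that are redundant with respect to sentences already selected. compendium uses textual entailment for the redundancy check; the word-overlap (Jaccard) test below is a simple stand-in for it, and the scoring function and data are likewise illustrative assumptions.

```python
# Illustrative extractive pipeline with a redundancy filter. The Jaccard
# overlap test stands in for compendium's textual-entailment check; it is
# an assumption, not the tool's actual method.

def score(sentence, doc_freq):
    """Average document frequency of the sentence's words (toy relevance)."""
    words = sentence.lower().split()
    return sum(doc_freq.get(w, 0) for w in words) / len(words)

def summarise(sentences, max_sentences=2, overlap_threshold=0.5):
    doc_freq = {}
    for s in sentences:
        for w in set(s.lower().split()):
            doc_freq[w] = doc_freq.get(w, 0) + 1
    ranked = sorted(sentences, key=lambda s: score(s, doc_freq), reverse=True)
    chosen = []
    for s in ranked:
        ws = set(s.lower().split())
        redundant = any(
            len(ws & set(c.lower().split())) / len(ws | set(c.lower().split()))
            > overlap_threshold
            for c in chosen)
        if not redundant:
            chosen.append(s)
        if len(chosen) == max_sentences:
            break
    return chosen

sentences = ["the cat sat on the mat",
             "the cat sat on a mat",
             "dogs bark loudly"]
print(summarise(sentences))  # near-duplicate second sentence is filtered out
```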
Semantics-based selection of everyday concepts in visual lifelogging
Concept-based indexing, based on identifying various semantic concepts appearing in multimedia, is an attractive option for multimedia retrieval, and much research tries to bridge the semantic gap between the media’s low-level features and high-level semantics. Research into concept-based multimedia retrieval has generally focused on detecting concepts in high-quality media such as broadcast TV or movies, but it is not well addressed in other domains like lifelogging, where the original data is captured with poorer quality. We argue that in noisy domains such as lifelogging, the management of data needs to include semantic reasoning in order to deduce a set of concepts to represent lifelog content for applications like searching, browsing or summarisation. Using semantic concepts to manage lifelog data relies on the fusion of automatically-detected concepts to provide a better understanding of the lifelog data. In this paper, we investigate the selection of semantic concepts for lifelogging, which includes reasoning on semantic networks using a density-based approach. In a series of experiments we compare different semantic reasoning approaches, and the experimental evaluations we report on lifelog data show the efficacy of our approach.
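The intuition behind density-based selection on a semantic network can be sketched as follows: a detected concept is kept when enough of its neighbours in the network were also detected, and dropped when it stands alone (a likely noisy detection). The graph, detections, and threshold below are invented for illustration; the paper's actual reasoning is richer.

```python
# Toy sketch of density-based concept selection on a semantic network:
# concepts in dense neighbourhoods of other detected concepts survive,
# isolated detections are pruned. All data here is illustrative.

def neighbourhood_density(concept, detections, edges):
    """Fraction of a concept's network neighbours that were also detected."""
    neighbours = edges.get(concept, set())
    if not neighbours:
        return 0.0
    return len(neighbours & set(detections)) / len(neighbours)

def select_concepts(detections, edges, min_density=0.5):
    return [c for c in detections
            if neighbourhood_density(c, detections, edges) >= min_density]

edges = {
    "indoor": {"desk", "screen", "coffee"},
    "desk":   {"indoor", "screen"},
    "screen": {"indoor", "desk"},
    "beach":  {"sand", "sea"},   # no related beach concept was detected
}
detections = ["indoor", "desk", "screen", "beach"]
print(select_concepts(detections, edges))  # "beach" dropped as an outlier
```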
Review and classification of trajectory summarisation algorithms: From compression to segmentation
With the continuous development and cost reduction of positioning and tracking technologies, large numbers of trajectories are being exploited in multiple domains for knowledge extraction. A trajectory is formed by a large number of measurements, many of which are unnecessary to describe the actual trajectory of the vehicle, or are even harmful due to sensor noise. This not only consumes large amounts of memory, but also makes the knowledge extraction process more difficult. Trajectory summarisation techniques can solve this problem, generating a smaller and more manageable representation and even semantic segments. In this comprehensive review, we explain and classify techniques for the summarisation of trajectories according to their search strategy and point evaluation criteria, describing connections with the line simplification problem. We also explain several special concepts in the trajectory summarisation problem. Finally, we outline recent trends and best practices for continuing research on future summarisation algorithms. The author(s) disclosed receipt of the following financial support for the research, authorship and/or publication of this article: this work was funded by public research projects of the Spanish Ministry of Economy and Competitiveness (MINECO), reference TEC2017-88048-C2-2-
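Since the review draws connections with the line simplification problem, the classic Douglas-Peucker algorithm is a natural concrete example of a trajectory compressor driven by a point evaluation criterion: a point is kept only if it deviates from the chord between the retained endpoints by more than a tolerance. The track and tolerance below are invented for illustration.

```python
# Douglas-Peucker line simplification, a classic baseline for trajectory
# compression. Points within `epsilon` of the chord between the kept
# endpoints are dropped; the farthest deviating point splits the problem.

def point_line_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return ((px - ax) ** 2 + (py - ay) ** 2) ** 0.5
    return abs(dy * (px - ax) - dx * (py - ay)) / (dx * dx + dy * dy) ** 0.5

def douglas_peucker(points, epsilon):
    if len(points) < 3:
        return list(points)
    # Find the interior point farthest from the chord joining the endpoints.
    dists = [point_line_distance(p, points[0], points[-1])
             for p in points[1:-1]]
    idx = max(range(len(dists)), key=dists.__getitem__) + 1
    if dists[idx - 1] <= epsilon:
        return [points[0], points[-1]]   # every in-between point is dropped
    # Otherwise keep that point and simplify the two halves recursively.
    left = douglas_peucker(points[:idx + 1], epsilon)
    right = douglas_peucker(points[idx:], epsilon)
    return left[:-1] + right

track = [(0, 0), (1, 0.1), (2, -0.1), (3, 5), (4, 6), (5, 7), (6, 8.1), (7, 9)]
print(douglas_peucker(track, epsilon=1.0))  # 8 points reduced to 4
```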
An Outlook into the Future of Egocentric Vision
What will the future be? We wonder! In this survey, we explore the gap
between current research in egocentric vision and the ever-anticipated future,
where wearable computing, with outward facing cameras and digital overlays, is
expected to be integrated in our every day lives. To understand this gap, the
article starts by envisaging the future through character-based stories,
showcasing through examples the limitations of current technology. We then
provide a mapping between this future and previously defined research tasks.
For each task, we survey its seminal works, current state-of-the-art
methodologies and available datasets, then reflect on shortcomings that limit
its applicability to future research. Note that this survey focuses on software
models for egocentric vision, independent of any specific hardware. The paper
concludes with recommendations for areas of immediate explorations so as to
unlock our path to the future always-on, personalised and life-enhancing
egocentric vision. Comment: We invite comments, suggestions and corrections here:
https://openreview.net/forum?id=V3974SUk1
Graph Convolutional Neural Networks with Diverse Negative Samples via Decomposed Determinant Point Processes
Graph convolutional networks (GCNs) have achieved great success in graph
representation learning by extracting high-level features from nodes and their
topology. Since GCNs generally follow a message-passing mechanism, each node
aggregates information from its first-order neighbours to update its
representation. As a result, the representations of nodes with edges between
them should be positively correlated and thus can be considered positive
samples. However, there are more non-neighbour nodes in the whole graph, which
provide diverse and useful information for the representation update. Two
non-adjacent nodes usually have different representations, which can be seen as
negative samples. Besides the node representations, the structural information
of the graph is also crucial for learning. In this paper, we use
quality-diversity decomposition in determinant point processes (DPP) to obtain
diverse negative samples. When defining a distribution on diverse subsets of
all non-neighbouring nodes, we incorporate both graph structure information and
node representations. Since the DPP sampling process requires matrix eigenvalue
decomposition, we propose a new shortest-path-based method to improve
computational efficiency. Finally, we incorporate the obtained negative samples
into the graph convolution operation. The ideas are evaluated empirically in
experiments on node classification tasks. These experiments show that the newly
proposed methods not only improve the overall performance of standard
representation learning but also significantly alleviate over-smoothing
problems. Comment: Accepted by IEEE TNNLS on 30-Aug-2023. arXiv admin note: text overlap
with arXiv:2210.0072
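Where the negative samples enter the update can be shown with a minimal message-passing step: a node averages its neighbours' features (positive signal) and subtracts a scaled average of sampled non-neighbour features (negative signal). The paper selects diverse negatives via DPP sampling; here the negatives are simply given, and the update rule, weight gamma, and data are all illustrative assumptions.

```python
# Minimal illustrative message-passing update with negative samples:
# neighbour mean minus gamma times non-neighbour mean. This only shows
# where DPP-sampled negatives would plug in, not the paper's full model.

def mean(vectors):
    """Element-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def update_node(node, features, adj, negatives, gamma=0.5):
    pos = [features[j] for j in adj[node]]
    neg = [features[j] for j in negatives.get(node, [])]
    pos_mean = mean(pos) if pos else features[node]
    if not neg:
        return pos_mean          # plain first-order aggregation
    neg_mean = mean(neg)
    return [p - gamma * n for p, n in zip(pos_mean, neg_mean)]

features = {0: [1.0, 0.0], 1: [0.8, 0.2], 2: [0.0, 1.0], 3: [0.1, 0.9]}
adj = {0: [1], 1: [0], 2: [3], 3: [2]}
negatives = {0: [2, 3]}          # sampled non-neighbours of node 0
print(update_node(0, features, adj, negatives))
```

Pushing a node's representation away from the non-neighbour mean is what counteracts over-smoothing: without the negative term, repeated neighbour averaging drives all representations together.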