101,747 research outputs found
Diachronic cross-modal embeddings
This work has been partially funded by the CMU Portugal research project GoLocal Ref. CMUP-ERI/TIC/0046/2014, by the H2020 ICT project COGNITUS with the grant agreement no 687605 and by the FCT project NOVA LINCS Ref. UID/CEC/04516/2019. We also gratefully acknowledge the support of NVIDIA Corporation with the donation of the GPUs used for this research.Understanding the semantic shifts of multimodal information is only possible with models that capture cross-modal interactions over time. Under this paradigm, a new embedding is needed that structures visual-textual interactions according to the temporal dimension, thus, preserving data's original temporal organisation. This paper introduces a novel diachronic cross-modal embedding (DCM), where cross-modal correlations are represented in embedding space, throughout the temporal dimension, preserving semantic similarity at each instant t. To achieve this, we trained a neural cross-modal architecture, under a novel ranking loss strategy, that for each multimodal instance, enforces neighbour instances' temporal alignment, through subspace structuring constraints based on a temporal alignment window. Experimental results show that our DCM embedding successfully organises instances over time. Quantitative experiments, confirm that DCM is able to preserve semantic cross-modal correlations at each instant t while also providing better alignment capabilities. Qualitative experiments unveil new ways to browse multimodal content and hint that multimodal understanding tasks can benefit from this new embedding.publishersversionpublishe
Generalized Rank Pooling for Activity Recognition
Most popular deep models for action recognition split video sequences into
short sub-sequences consisting of a few frames; frame-based features are then
pooled for recognizing the activity. Usually, this pooling step discards the
temporal order of the frames, which could otherwise be used for better
recognition. Towards this end, we propose a novel pooling method, generalized
rank pooling (GRP), that takes as input, features from the intermediate layers
of a CNN that is trained on tiny sub-sequences, and produces as output the
parameters of a subspace which (i) provides a low-rank approximation to the
features and (ii) preserves their temporal order. We propose to use these
parameters as a compact representation for the video sequence, which is then
used in a classification setup. We formulate an objective for computing this
subspace as a Riemannian optimization problem on the Grassmann manifold, and
propose an efficient conjugate gradient scheme for solving it. Experiments on
several activity recognition datasets show that our scheme leads to
state-of-the-art performance.Comment: Accepted at IEEE International Conference on Computer Vision and
Pattern Recognition (CVPR), 201
Temporal Cross-Media Retrieval with Soft-Smoothing
Multimedia information have strong temporal correlations that shape the way
modalities co-occur over time. In this paper we study the dynamic nature of
multimedia and social-media information, where the temporal dimension emerges
as a strong source of evidence for learning the temporal correlations across
visual and textual modalities. So far, cross-media retrieval models, explored
the correlations between different modalities (e.g. text and image) to learn a
common subspace, in which semantically similar instances lie in the same
neighbourhood. Building on such knowledge, we propose a novel temporal
cross-media neural architecture, that departs from standard cross-media
methods, by explicitly accounting for the temporal dimension through temporal
subspace learning. The model is softly-constrained with temporal and
inter-modality constraints that guide the new subspace learning task by
favouring temporal correlations between semantically similar and temporally
close instances. Experiments on three distinct datasets show that accounting
for time turns out to be important for cross-media retrieval. Namely, the
proposed method outperforms a set of baselines on the task of temporal
cross-media retrieval, demonstrating its effectiveness for performing temporal
subspace learning.Comment: To appear in ACM MM 201
Using genetic algorithms to generate test sequences for complex timed systems
The generation of test data for state based specifications is a computationally expensive process. This problem is magnified if we consider that time con- straints have to be taken into account to govern the transitions of the studied system. The main goal of this paper is to introduce a complete methodology, sup- ported by tools, that addresses this issue by represent- ing the test data generation problem as an optimisa- tion problem. We use heuristics to generate test cases. In order to assess the suitability of our approach we consider two different case studies: a communication protocol and the scientific application BIPS3D. We give details concerning how the test case generation problem can be presented as a search problem and automated. Genetic algorithms (GAs) and random search are used to generate test data and evaluate the approach. GAs outperform random search and seem to scale well as the problem size increases. It is worth to mention that we use a very simple fitness function that can be eas- ily adapted to be used with other evolutionary search techniques
When is a Network a Network? Multi-Order Graphical Model Selection in Pathways and Temporal Networks
We introduce a framework for the modeling of sequential data capturing
pathways of varying lengths observed in a network. Such data are important,
e.g., when studying click streams in information networks, travel patterns in
transportation systems, information cascades in social networks, biological
pathways or time-stamped social interactions. While it is common to apply graph
analytics and network analysis to such data, recent works have shown that
temporal correlations can invalidate the results of such methods. This raises a
fundamental question: when is a network abstraction of sequential data
justified? Addressing this open question, we propose a framework which combines
Markov chains of multiple, higher orders into a multi-layer graphical model
that captures temporal correlations in pathways at multiple length scales
simultaneously. We develop a model selection technique to infer the optimal
number of layers of such a model and show that it outperforms previously used
Markov order detection techniques. An application to eight real-world data sets
on pathways and temporal networks shows that it allows to infer graphical
models which capture both topological and temporal characteristics of such
data. Our work highlights fallacies of network abstractions and provides a
principled answer to the open question when they are justified. Generalizing
network representations to multi-order graphical models, it opens perspectives
for new data mining and knowledge discovery algorithms.Comment: 10 pages, 4 figures, 1 table, companion python package pathpy
available on gitHu
Bipolarity in the querying of temporal databases
A database represents part of reality by containing data representing properties of real objects or concepts. To many real-world concepts or objects, time is an essential aspect and thus it should often be (implicitly) represented by databases, making these temporal databases. However, like other data, the time-related data in such databases may also contain imperfections such as uncertainties. One of the main purposes of a database is to allow the retrieval of information or knowledge deduced from its data, which is often done by querying the database. Because users may have both positive and negative preferences, they may want to query a database in a bipolar way. Moreover, their demands may have some temporal aspects. In this paper, a novel technique is presented, to query a valid-time relation containing uncertain valid-time data in a heterogeneously bipolar way, allowing every elementary query constraint a specific temporal constraint
- âŠ