105 research outputs found
Balancing clusters to reduce response time variability in large scale image search
Many algorithms for approximate nearest neighbor search in high-dimensional
spaces partition the data into clusters. At query time, in order to avoid
exhaustive search, an index selects the few (or a single) clusters nearest to
the query point. Clusters are often produced by the well-known k-means
approach since it has several desirable properties. On the downside, it tends
to produce clusters having quite different cardinalities. Imbalanced clusters
negatively impact both the variance and the expectation of query response
times. This paper proposes to modify k-means centroids to produce clusters
with more comparable sizes without sacrificing the desirable properties.
Experiments with a large scale collection of image descriptors show that our
algorithm significantly reduces the variance of response times without
seriously impacting the search quality
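One simple way to see the size/latency trade-off is a greedy assignment that penalises already-large clusters. This is only an illustrative sketch (the paper instead modifies the k-means centroids themselves); the `penalty` weight and the fixed visiting order are arbitrary choices here, not the paper's method.

```python
import numpy as np

def balanced_assign(points, centroids, penalty=0.5):
    """Greedy size-penalised assignment: each point goes to the centroid
    minimising (distance + penalty * current cluster size), so large
    clusters become progressively less attractive.
    Illustrative sketch only; not the paper's algorithm."""
    k = len(centroids)
    sizes = np.zeros(k)
    labels = np.empty(len(points), dtype=int)
    for i, p in enumerate(points):
        cost = np.linalg.norm(centroids - p, axis=1) + penalty * sizes
        labels[i] = int(np.argmin(cost))
        sizes[labels[i]] += 1
    return labels, sizes
```

With `penalty=0` this degenerates to plain nearest-centroid assignment and reproduces the imbalance problem the paper describes.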
Searching in one billion vectors: re-rank with source coding
Recent indexing techniques inspired by source coding have proven successful at indexing billions of high-dimensional vectors in memory. In this paper, we propose an approach that re-ranks the neighbor hypotheses obtained by these compressed-domain indexing methods. In contrast to the usual post-verification scheme, which performs exact distance calculations on the short-list of hypotheses, the estimated distances are refined based on short quantization codes, to avoid reading the full vectors from disk. We have released a new public dataset of one billion 128-dimensional vectors and propose an experimental setup to evaluate high-dimensional indexing algorithms at a realistic scale. Experiments show that our method accurately and efficiently re-ranks the neighbor hypotheses using little memory compared to the full-vector representation
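The refinement idea can be sketched as follows: each database vector is stored as a coarse centroid id plus a short residual code, and the refined distance is computed against the reconstruction (centroid + residual codeword), so the full vector never has to be read from disk. All names below are illustrative, not the paper's actual API.

```python
import numpy as np

def refined_distances(query, coarse_codes, residual_codes,
                      coarse_codebook, residual_codebook):
    """Re-ranking sketch: reconstruct each short-listed vector from its
    coarse centroid plus a quantized residual correction, then measure
    the distance of the reconstruction to the query.
    Illustrative names only; not the paper's API."""
    recon = coarse_codebook[coarse_codes] + residual_codebook[residual_codes]
    return np.linalg.norm(recon - query, axis=1)
```

Sorting the short-list by these refined estimates gives the re-ranked neighbor hypotheses.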
Time-Sensitive Topic Models for Action Recognition in Videos
In this paper, we postulate that temporal information is important for action recognition in videos. Keeping temporal information, videos are represented as word×time documents. We propose to use time-sensitive probabilistic topic models and we extend them for the context of supervised learning. Our time-sensitive approach is compared to both PLSA and Bag-of-Words. Our approach is shown to both capture semantics from data and yield classification performance comparable to other methods, outperforming them when the amount of training data is low.
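A word×time document can be sketched as a count matrix over (visual word, time bin) pairs, which keeps the temporal order that a plain bag-of-words discards. The function below is a minimal illustration, not the paper's exact construction.

```python
import numpy as np

def word_time_document(words, times, n_words, n_bins, duration):
    """Build a word x time-bin count matrix for one video: each quantized
    feature (visual word) is binned by its timestamp.
    Minimal sketch; binning scheme is an assumption."""
    doc = np.zeros((n_words, n_bins), dtype=int)
    bins = np.minimum((np.asarray(times) / duration * n_bins).astype(int),
                      n_bins - 1)
    for w, b in zip(words, bins):
        doc[w, b] += 1
    return doc
```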
Match-And-Deform: Time Series Domain Adaptation through Optimal Transport and Temporal Alignment
While large volumes of unlabeled data are usually available, associated
labels are often scarce. The unsupervised domain adaptation problem aims at
exploiting labels from a source domain to classify data from a related, yet
different, target domain. When time series are at stake, new difficulties arise
as temporal shifts may appear in addition to the standard feature distribution
shift. In this paper, we introduce the Match-And-Deform (MAD) approach that
aims at finding correspondences between the source and target time series while
allowing temporal distortions. The associated optimization problem
simultaneously aligns the series thanks to an optimal transport loss and the
time stamps through dynamic time warping. When embedded into a deep neural
network, MAD helps learn new representations of time series that both align
the domains and maximize the discriminative power of the network. Empirical
studies on benchmark datasets and remote sensing data demonstrate that MAD
produces meaningful sample-to-sample pairings and time-shift estimates, reaching
similar or better classification performance than state-of-the-art deep time
series domain adaptation strategies
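MAD optimises the transport plan and the temporal alignment jointly; as a simplified, sequential illustration of the two ingredients, one can use DTW as the ground cost between series and then solve an entropic optimal-transport problem over it with Sinkhorn iterations. All parameter values below are arbitrary, and this two-step pipeline is a sketch, not the MAD algorithm itself.

```python
import numpy as np

def dtw(a, b):
    """Classic dynamic-time-warping distance between two 1-D series."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def sinkhorn(C, reg=0.1, iters=200):
    """Entropic optimal-transport plan for cost matrix C, uniform marginals."""
    n, m = C.shape
    K = np.exp(-C / reg)
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    v = np.ones(m)
    for _ in range(iters):
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]
```

Given source series `xs` and target series `ys`, the cost matrix `C[i, j] = dtw(xs[i], ys[j])` fed to `sinkhorn` yields a soft sample-to-sample pairing between the two domains.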
A hybrid approach to classification with shapelets
Shapelets are phase independent subseries that can be used to discriminate between time series. Shapelets have proved to be very effective primitives for time series classification. The two most prominent shapelet based classification algorithms are the shapelet transform (ST) and learned shapelets (LS). One significant difference between these approaches is that ST is data driven, whereas LS searches the entire shapelet space through stochastic gradient descent. The weakness of the former is that full enumeration of possible shapelets is very time consuming. The problem with the latter is that it is very dependent on the initialisation of the shapelets. We propose hybridising the two approaches through a pipeline that includes a time constrained data driven shapelet search which is then passed to a neural network architecture of learned shapelets for tuning. The tuned shapelets are extracted and formed into a transform, which is then classified with a rotation forest. We show that this hybrid approach is significantly better than either approach in isolation, and that the resulting classifier is not significantly worse than a full shapelet search
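The primitive behind both ST and LS is the phase-independent distance from a shapelet to a series, i.e. the minimum over all same-length sliding windows; the shapelet transform then stacks these distances into a feature matrix for a downstream classifier. A minimal sketch (no normalisation, plain Euclidean distance):

```python
import numpy as np

def shapelet_distance(series, shapelet):
    """Phase-independent distance: minimum Euclidean distance between the
    shapelet and any same-length window of the series."""
    L = len(shapelet)
    windows = np.lib.stride_tricks.sliding_window_view(
        np.asarray(series, float), L)
    return float(np.min(np.linalg.norm(windows - shapelet, axis=1)))

def shapelet_transform(dataset, shapelets):
    """Shapelet transform: one distance feature per (series, shapelet)."""
    return np.array([[shapelet_distance(s, sh) for sh in shapelets]
                     for s in dataset])
```

In the hybrid pipeline described above, the transform built from the tuned shapelets would be fed to a rotation forest.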
T-Patterns Revisited: Mining for Temporal Patterns in Sensor Data
The trend to use large numbers of simple sensors, as opposed to a few complex sensors, to monitor places and systems creates a need for temporal pattern mining algorithms that work on such data. Methods that try to discover re-usable and interpretable patterns in temporal event data have several shortcomings. We contrast several recent approaches to the problem and extend the T-Pattern algorithm, which was previously applied to the detection of sequential patterns in behavioural sciences. The temporal complexity of the T-Pattern approach is prohibitive in the scenarios we consider. We remedy this with a statistical model to obtain a fast and robust algorithm that finds patterns in temporal data. We test our algorithm on a recent database collected with passive infrared sensors with millions of events
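The counting step underlying T-pattern candidate generation can be sketched as follows: for each ordered pair of event types, count how often the second occurs within a time window after the first; pairs occurring far more often than an independence model predicts become pattern candidates. The statistical test itself is omitted here; `events` is assumed sorted by time, and the representation is illustrative.

```python
from collections import Counter

def candidate_pairs(events, window):
    """Count, for each ordered pair (a, b) of event types, how often b
    occurs within `window` time units after a. `events` is a
    time-sorted list of (timestamp, label) tuples.
    Counting step only; the significance test is not shown."""
    counts = Counter()
    for i, (t1, a) in enumerate(events):
        for t2, b in events[i + 1:]:
            if t2 - t1 > window:
                break  # later events are even further away
            counts[(a, b)] += 1
    return counts
```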
On Time Series Classification with Dictionary-Based Classifiers
A family of algorithms for time series classification (TSC) involve running a sliding window across each series, discretising the window to form a word, forming a histogram of word counts over the dictionary, then constructing a classifier on the histograms. A recent evaluation of two of this type of algorithm, Bag of Patterns (BOP) and Bag of Symbolic Fourier Approximation Symbols (BOSS) found a significant difference in accuracy between these seemingly similar algorithms. We investigate this phenomenon by deconstructing the classifiers and measuring the relative importance of the four key components between BOP and BOSS. We find that whilst ensembling is a key component for both algorithms, the effect of the other components is mixed and more complex. We conclude that BOSS represents the state of the art for dictionary-based TSC. Both BOP and BOSS can be classed as bag of words approaches. These are particularly popular in Computer Vision for tasks such as image classification. We adapt three techniques used in Computer Vision for TSC: Scale Invariant Feature Transform; Spatial Pyramids; and Histogram Intersection. We find that using Spatial Pyramids in conjunction with BOSS (SP) produces a significantly more accurate classifier. SP is significantly more accurate than standard benchmarks and the original BOSS algorithm. It is not significantly worse than the best shapelet-based or deep learning approaches, and is only outperformed by an ensemble that includes BOSS as a constituent module
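The shared skeleton of BOP and BOSS can be sketched as: slide a window along the series, discretise each window into a short word, and build a histogram of word counts. The crude mean-and-quantile binning below merely stands in for the actual SAX (BOP) or SFA (BOSS) discretisation.

```python
import numpy as np
from collections import Counter

def series_to_histogram(series, window, word_len, alphabet=4):
    """Dictionary-based TSC sketch: slide a window, discretise it into a
    word of `word_len` symbols, count word frequencies. The binning here
    is a crude stand-in for SAX/SFA."""
    series = np.asarray(series, float)
    # global quantile breakpoints for the symbol alphabet
    edges = np.quantile(series, np.linspace(0, 1, alphabet + 1)[1:-1])
    hist = Counter()
    for start in range(len(series) - window + 1):
        w = series[start:start + window]
        # piecewise aggregate: word_len segment means, then quantile binning
        segs = np.array_split(w, word_len)
        word = tuple(int(np.digitize(np.mean(s), edges)) for s in segs)
        hist[word] += 1
    return hist
```

A classifier then compares series through their histograms, which is what makes the representation phase-invariant.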
Indexation de séquences de descripteurs
Getting information from multimedia documents is a very important field of research. We can now process huge image databases and perform effective content-based searches. Processing more complex documents, such as video or audio streams, is the next step in the development of content-based search tools. Video and audio streams are different in that they embed the notion of descriptor sequences, in which the order of the described elements is key. This thesis proposes two indexing methods for temporal multimedia documents. The first is based on the Dynamic Time Warping (DTW) algorithm for comparing sequences; it introduces a method with significant response-time improvements over existing methods. The second is applied specifically to cover song detection. It consists of a first filtering stage that selects temporal regions of the database that are possible matches for elements of the query song, followed by a second robustification stage that ensures temporal consistency.
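A classic way to obtain the response-time gains this thesis targets is to prune DTW comparisons with a cheap lower bound such as LB_Keogh: candidates whose bound already exceeds the best distance found so far are discarded without running the full quadratic DTW. This is a standard pruning technique shown for illustration, not necessarily the thesis's method.

```python
import numpy as np

def lb_keogh(query, candidate, r):
    """LB_Keogh lower bound on DTW with warping window r: sum the squared
    excursions of the candidate outside the query's upper/lower envelope.
    Cheap to compute, so an index can discard most candidates early."""
    q = np.asarray(query, float)
    c = np.asarray(candidate, float)
    total = 0.0
    for i, x in enumerate(c):
        env = q[max(0, i - r):i + r + 1]
        lo, hi = env.min(), env.max()
        if x > hi:
            total += (x - hi) ** 2
        elif x < lo:
            total += (lo - x) ** 2
    return np.sqrt(total)
```

Because the bound never exceeds the true DTW distance, pruning with it is exact: no true nearest neighbour is lost.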
- …