105 research outputs found
Balancing clusters to reduce response time variability in large scale image search
Many algorithms for approximate nearest neighbor search in high-dimensional
spaces partition the data into clusters. At query time, in order to avoid
exhaustive search, an index selects the few (or a single) clusters nearest to
the query point. Clusters are often produced by the well-known k-means
approach since it has several desirable properties. On the downside, it tends
to produce clusters having quite different cardinalities. Imbalanced clusters
negatively impact both the variance and the expectation of query response
times. This paper proposes to modify k-means centroids to produce clusters
with more comparable sizes without sacrificing the desirable properties.
Experiments with a large scale collection of image descriptors show that our
algorithm significantly reduces the variance of response times without
seriously impacting the search quality
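One simple way to see the size/latency trade-off is a greedy assignment that penalises already-large clusters. This is only an illustrative sketch (the paper instead modifies the k-means centroids themselves); the `penalty` weight and the fixed visiting order are arbitrary choices here, not the paper's method.

```python
import numpy as np

def balanced_assign(points, centroids, penalty=0.5):
    """Greedy size-penalised assignment: each point goes to the centroid
    minimising (distance + penalty * current cluster size), so large
    clusters become progressively less attractive.
    Illustrative sketch only; not the paper's algorithm."""
    k = len(centroids)
    sizes = np.zeros(k)
    labels = np.empty(len(points), dtype=int)
    for i, p in enumerate(points):
        cost = np.linalg.norm(centroids - p, axis=1) + penalty * sizes
        labels[i] = int(np.argmin(cost))
        sizes[labels[i]] += 1
    return labels, sizes
```

With `penalty=0` this degenerates to plain nearest-centroid assignment and reproduces the imbalance problem the paper describes.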
Searching in one billion vectors: re-rank with source coding
Recent indexing techniques inspired by source coding have proven successful at indexing billions of high-dimensional vectors in memory. In this paper, we propose an approach that re-ranks the neighbor hypotheses obtained by these compressed-domain indexing methods. In contrast to the usual post-verification scheme, which performs exact distance calculations on the short-list of hypotheses, the estimated distances are refined based on short quantization codes, to avoid reading the full vectors from disk. We have released a new public dataset of one billion 128-dimensional vectors and propose an experimental setup to evaluate high-dimensional indexing algorithms at a realistic scale. Experiments show that our method accurately and efficiently re-ranks the neighbor hypotheses using little memory compared to the full-vector representation
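The refinement idea can be sketched as follows: each database vector is stored as a coarse centroid id plus a short residual code, and the refined distance is computed against the reconstruction (centroid + residual codeword), so the full vector never has to be read from disk. All names below are illustrative, not the paper's actual API.

```python
import numpy as np

def refined_distances(query, coarse_codes, residual_codes,
                      coarse_codebook, residual_codebook):
    """Re-ranking sketch: reconstruct each short-listed vector from its
    coarse centroid plus a quantized residual correction, then measure
    the distance of the reconstruction to the query.
    Illustrative names only; not the paper's API."""
    recon = coarse_codebook[coarse_codes] + residual_codebook[residual_codes]
    return np.linalg.norm(recon - query, axis=1)
```

Sorting the short-list by these refined estimates gives the re-ranked neighbor hypotheses.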
Time-Sensitive Topic Models for Action Recognition in Videos
In this paper, we postulate that temporal information is important for action recognition in videos. Keeping temporal information, videos are represented as word×time documents. We propose to use time-sensitive probabilistic topic models and we extend them for the context of supervised learning. Our time-sensitive approach is compared to both PLSA and Bag-of-Words. Our approach is shown to both capture semantics from data and yield classification performance comparable to other methods, outperforming them when the amount of training data is low.
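A word×time document can be sketched as a count matrix over (visual word, time bin) pairs, which keeps the temporal order that a plain bag-of-words discards. The function below is a minimal illustration, not the paper's exact construction.

```python
import numpy as np

def word_time_document(words, times, n_words, n_bins, duration):
    """Build a word x time-bin count matrix for one video: each quantized
    feature (visual word) is binned by its timestamp.
    Minimal sketch; binning scheme is an assumption."""
    doc = np.zeros((n_words, n_bins), dtype=int)
    bins = np.minimum((np.asarray(times) / duration * n_bins).astype(int),
                      n_bins - 1)
    for w, b in zip(words, bins):
        doc[w, b] += 1
    return doc
```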
Match-And-Deform: Time Series Domain Adaptation through Optimal Transport and Temporal Alignment
While large volumes of unlabeled data are usually available, associated
labels are often scarce. The unsupervised domain adaptation problem aims at
exploiting labels from a source domain to classify data from a related, yet
different, target domain. When time series are at stake, new difficulties arise
as temporal shifts may appear in addition to the standard feature distribution
shift. In this paper, we introduce the Match-And-Deform (MAD) approach that
aims at finding correspondences between the source and target time series while
allowing temporal distortions. The associated optimization problem
simultaneously aligns the series thanks to an optimal transport loss and the
time stamps through dynamic time warping. When embedded into a deep neural
network, MAD helps learn new representations of time series that both align
the domains and maximize the discriminative power of the network. Empirical
studies on benchmark datasets and remote sensing data demonstrate that MAD
produces meaningful sample-to-sample pairings and time-shift estimates, reaching
similar or better classification performance than state-of-the-art deep time
series domain adaptation strategies
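MAD optimises the transport plan and the temporal alignment jointly; as a simplified, sequential illustration of the two ingredients, one can use DTW as the ground cost between series and then solve an entropic optimal-transport problem over it with Sinkhorn iterations. All parameter values below are arbitrary, and this two-step pipeline is a sketch, not the MAD algorithm itself.

```python
import numpy as np

def dtw(a, b):
    """Classic dynamic-time-warping distance between two 1-D series."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def sinkhorn(C, reg=0.1, iters=200):
    """Entropic optimal-transport plan for cost matrix C, uniform marginals."""
    n, m = C.shape
    K = np.exp(-C / reg)
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    v = np.ones(m)
    for _ in range(iters):
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]
```

Given source series `xs` and target series `ys`, the cost matrix `C[i, j] = dtw(xs[i], ys[j])` fed to `sinkhorn` yields a soft sample-to-sample pairing between the two domains.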
A hybrid approach to classification with shapelets
Shapelets are phase independent subseries that can be used to discriminate between time series. Shapelets have proved to be very effective primitives for time series classification. The two most prominent shapelet based classification algorithms are the shapelet transform (ST) and learned shapelets (LS). One significant difference between these approaches is that ST is data driven, whereas LS searches the entire shapelet space through stochastic gradient descent. The weakness of the former is that full enumeration of possible shapelets is very time consuming. The problem with the latter is that it is very dependent on the initialisation of the shapelets. We propose hybridising the two approaches through a pipeline that includes a time constrained data driven shapelet search which is then passed to a neural network architecture of learned shapelets for tuning. The tuned shapelets are extracted and formed into a transform, which is then classified with a rotation forest. We show that this hybrid approach is significantly better than either approach in isolation, and that the resulting classifier is not significantly worse than a full shapelet search
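The primitive behind both ST and LS is the phase-independent distance from a shapelet to a series, i.e. the minimum over all same-length sliding windows; the shapelet transform then stacks these distances into a feature matrix for a downstream classifier. A minimal sketch (no normalisation, plain Euclidean distance):

```python
import numpy as np

def shapelet_distance(series, shapelet):
    """Phase-independent distance: minimum Euclidean distance between the
    shapelet and any same-length window of the series."""
    L = len(shapelet)
    windows = np.lib.stride_tricks.sliding_window_view(
        np.asarray(series, float), L)
    return float(np.min(np.linalg.norm(windows - shapelet, axis=1)))

def shapelet_transform(dataset, shapelets):
    """Shapelet transform: one distance feature per (series, shapelet)."""
    return np.array([[shapelet_distance(s, sh) for sh in shapelets]
                     for s in dataset])
```

In the hybrid pipeline described above, the transform built from the tuned shapelets would be fed to a rotation forest.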
T-Patterns Revisited: Mining for Temporal Patterns in Sensor Data
The trend to use large numbers of simple sensors, as opposed to a few complex sensors, to monitor places and systems creates a need for temporal pattern mining algorithms that work on such data. Methods that try to discover re-usable and interpretable patterns in temporal event data have several shortcomings. We contrast several recent approaches to the problem and extend the T-Pattern algorithm, which was previously applied to the detection of sequential patterns in behavioural sciences. The temporal complexity of the T-Pattern approach is prohibitive in the scenarios we consider. We remedy this with a statistical model to obtain a fast and robust algorithm that finds patterns in temporal data. We test our algorithm on a recent database collected with passive infrared sensors with millions of events
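The counting step underlying T-pattern candidate generation can be sketched as follows: for each ordered pair of event types, count how often the second occurs within a time window after the first; pairs occurring far more often than an independence model predicts become pattern candidates. The statistical test itself is omitted here; `events` is assumed sorted by time, and the representation is illustrative.

```python
from collections import Counter

def candidate_pairs(events, window):
    """Count, for each ordered pair (a, b) of event types, how often b
    occurs within `window` time units after a. `events` is a
    time-sorted list of (timestamp, label) tuples.
    Counting step only; the significance test is not shown."""
    counts = Counter()
    for i, (t1, a) in enumerate(events):
        for t2, b in events[i + 1:]:
            if t2 - t1 > window:
                break  # later events are even further away
            counts[(a, b)] += 1
    return counts
```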
On Time Series Classification with Dictionary-Based Classifiers
A family of algorithms for time series classification (TSC) involve running a sliding window across each series, discretising the window to form a word, forming a histogram of word counts over the dictionary, then constructing a classifier on the histograms. A recent evaluation of two of this type of algorithm, Bag of Patterns (BOP) and Bag of Symbolic Fourier Approximation Symbols (BOSS) found a significant difference in accuracy between these seemingly similar algorithms. We investigate this phenomenon by deconstructing the classifiers and measuring the relative importance of the four key components between BOP and BOSS. We find that whilst ensembling is a key component for both algorithms, the effect of the other components is mixed and more complex. We conclude that BOSS represents the state of the art for dictionary-based TSC. Both BOP and BOSS can be classed as bag of words approaches. These are particularly popular in Computer Vision for tasks such as image classification. We adapt three techniques used in Computer Vision for TSC: Scale Invariant Feature Transform; Spatial Pyramids; and Histogram Intersection. We find that using Spatial Pyramids in conjunction with BOSS (SP) produces a significantly more accurate classifier. SP is significantly more accurate than standard benchmarks and the original BOSS algorithm. It is not significantly worse than the best shapelet-based or deep learning approaches, and is only outperformed by an ensemble that includes BOSS as a constituent module
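The shared skeleton of BOP and BOSS can be sketched as: slide a window along the series, discretise each window into a short word, and build a histogram of word counts. The crude mean-and-quantile binning below merely stands in for the actual SAX (BOP) or SFA (BOSS) discretisation.

```python
import numpy as np
from collections import Counter

def series_to_histogram(series, window, word_len, alphabet=4):
    """Dictionary-based TSC sketch: slide a window, discretise it into a
    word of `word_len` symbols, count word frequencies. The binning here
    is a crude stand-in for SAX/SFA."""
    series = np.asarray(series, float)
    # global quantile breakpoints for the symbol alphabet
    edges = np.quantile(series, np.linspace(0, 1, alphabet + 1)[1:-1])
    hist = Counter()
    for start in range(len(series) - window + 1):
        w = series[start:start + window]
        # piecewise aggregate: word_len segment means, then quantile binning
        segs = np.array_split(w, word_len)
        word = tuple(int(np.digitize(np.mean(s), edges)) for s in segs)
        hist[word] += 1
    return hist
```

A classifier then compares series through their histograms, which is what makes the representation phase-invariant.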
Indexation de séquences de descripteurs
Getting information from multimedia documents is a very important field of research. We can now process huge image databases and perform effective content-based searches. Processing more complex documents, such as video or audio streams, is the next step in the development of content-based search tools. Video and audio streams are different in that they embed the notion of descriptor sequences, in which the order of the described elements is key. This thesis proposes two indexing methods for temporal multimedia documents. The first is based on the Dynamic Time Warping (DTW) algorithm for comparing sequences; it introduces a method with significant response-time improvements over existing methods. The second is applied specifically to cover song detection. It consists of a first filtering stage that selects temporal regions of the database that are possible matches for elements of the query song, followed by a second robustification stage that ensures temporal consistency.
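A classic way to obtain the response-time gains this thesis targets is to prune DTW comparisons with a cheap lower bound such as LB_Keogh: candidates whose bound already exceeds the best distance found so far are discarded without running the full quadratic DTW. This is a standard pruning technique shown for illustration, not necessarily the thesis's method.

```python
import numpy as np

def lb_keogh(query, candidate, r):
    """LB_Keogh lower bound on DTW with warping window r: sum the squared
    excursions of the candidate outside the query's upper/lower envelope.
    Cheap to compute, so an index can discard most candidates early."""
    q = np.asarray(query, float)
    c = np.asarray(candidate, float)
    total = 0.0
    for i, x in enumerate(c):
        env = q[max(0, i - r):i + r + 1]
        lo, hi = env.min(), env.max()
        if x > hi:
            total += (x - hi) ** 2
        elif x < lo:
            total += (lo - x) ** 2
    return np.sqrt(total)
```

Because the bound never exceeds the true DTW distance, pruning with it is exact: no true nearest neighbour is lost.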
- …