7,782 research outputs found

Time Series Cluster Kernel for Learning Similarities between Multivariate Time Series with Missing Data

Author: Bianchi Filippo Maria
Jenssen Robert
Mikalsen Karl Øyvind
Soguero-Ruiz Cristina
Publication date: 01/01/2017

Similarity-based approaches represent a promising direction for time series analysis. However, many such methods rely on parameter tuning, and some have shortcomings if the time series are multivariate (MTS), due to dependencies between attributes, or if the time series contain missing data. In this paper, we address these challenges within the powerful context of kernel methods by proposing the robust time series cluster kernel (TCK). The approach leverages the missing-data handling properties of Gaussian mixture models (GMMs) augmented with informative prior distributions. An ensemble learning approach ensures robustness to parameters by combining the clustering results of many GMMs to form the final kernel. We evaluate the TCK on synthetic and real data and compare it to other state-of-the-art techniques. The experimental results demonstrate that the TCK is robust to parameter choices, provides competitive results for MTS without missing data, and provides outstanding results for MTS with missing data. Comment: 23 pages, 6 figures
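
The abstract does not state how the clustering results are combined, so the following is only a minimal sketch of one plausible reading: each ensemble member assigns every series a vector of cluster-membership probabilities, and the kernel value for a pair of series is the sum, over members, of the inner product of their vectors. The function name `ensemble_kernel` and the toy posterior values are hypothetical, and fitting the GMMs themselves (including the missing-data handling and priors) is not shown.

```python
def ensemble_kernel(posteriors):
    """Combine soft cluster assignments from several models into a kernel matrix.

    posteriors[q][i] is the membership-probability vector of series i under
    ensemble member q. Members may use different numbers of clusters.
    This combination rule is an assumption, not taken from the abstract.
    """
    n = len(posteriors[0])
    kernel = [[0.0] * n for _ in range(n)]
    for member in posteriors:
        for i in range(n):
            for j in range(n):
                # Inner product of the two membership vectors for this member.
                kernel[i][j] += sum(a * b for a, b in zip(member[i], member[j]))
    return kernel


# Hypothetical posteriors: two ensemble members, three time series.
posteriors = [
    [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]],
    [[0.8, 0.2, 0.0], [0.7, 0.3, 0.0], [0.0, 0.1, 0.9]],
]
K = ensemble_kernel(posteriors)
```

In this toy input, series 0 and 1 are placed in the same clusters by both members, so `K[0][1]` comes out much larger than `K[0][2]`. Because the result is a sum of Gram matrices, it is symmetric and positive semidefinite.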

arXiv.org e-Print Archive

Munin - Open Research Archive

NORA - Norwegian Open Research Archives

A Run Length Transformation for Discriminating Between Auto Regressive Time Series

Author: A BAGNALL
A DOUZAL-CHOUAKRIA
Anthony Bagnall
AW BOJANCZYK
B WYK VAN
CK PENG
D Piccolo
EA MAHARAJ
Gareth Janacek
H DENG
J Durbin
J RODRIGUEZ
JC GOWER
M CORDUAS
S HE
S QAIRUNNISA
T COX
TW LIAO
Y JEONG
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/06/2014

Crossref

University of East Anglia digital repository

Nonparametric Bayesian multiple testing for longitudinal performance stratification

Author: Scott James G.
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 29/09/2010

This paper describes a framework for flexible multiple hypothesis testing of autoregressive time series. The modeling approach is Bayesian, though a blend of frequentist and Bayesian reasoning is used to evaluate procedures. Nonparametric characterizations of both the null and alternative hypotheses are shown to be the key robustification step necessary to ensure reasonable Type-I error performance. The methodology is applied to part of a large database containing up to 50 years of corporate performance statistics on 24,157 publicly traded American companies, where the primary goal of the analysis is to flag companies whose historical performance is significantly different from that expected due to chance. Comment: Published at http://dx.doi.org/10.1214/09-AOAS252 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

Crossref

Consensus clustering and functional interpretation of gene-expression data

Author: Kellam P.
Liu X.
Martin Nigel
Orengo C.A.
Swift S.
Tucker A.
Vinciotti V.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2004

Microarray analysis using clustering algorithms can suffer from a lack of inter-method consistency in assigning related gene-expression profiles to clusters. Obtaining a consensus set of clusters from a number of clustering methods should improve confidence in gene-expression analysis. Here we introduce consensus clustering, which provides such an advantage. When coupled with a statistically based gene functional analysis, our method allowed the identification of novel genes regulated by NFκB and the unfolded protein response in certain B-cell lymphomas.
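
The idea of a consensus set of clusters can be illustrated with a small sketch. It assumes a simple agreement-matrix approach: count how often each pair of items is co-clustered across methods, then group the items whose agreement reaches a threshold. This is a common way to build a consensus and is not necessarily the procedure used in the paper; the function names, the threshold rule, and the toy labelings are all hypothetical.

```python
def agreement_matrix(labelings):
    """Fraction of clustering methods that put items i and j in the same cluster."""
    n = len(labelings[0])
    m = len(labelings)
    return [
        [sum(lab[i] == lab[j] for lab in labelings) / m for j in range(n)]
        for i in range(n)
    ]


def consensus_groups(labelings, threshold=1.0):
    """Group items linked by pairwise agreement >= threshold (connected components)."""
    agree = agreement_matrix(labelings)
    n = len(agree)
    seen = set()
    groups = []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, group = [start], []
        while stack:
            i = stack.pop()
            group.append(i)
            for j in range(n):
                if j not in seen and agree[i][j] >= threshold:
                    seen.add(j)
                    stack.append(j)
        groups.append(sorted(group))
    return groups


# Hypothetical cluster labels for five genes from three clustering methods.
labelings = [
    [0, 0, 1, 1, 2],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
]
strict = consensus_groups(labelings)        # require unanimous agreement
relaxed = consensus_groups(labelings, 0.6)  # require agreement in most methods
```

With unanimous agreement only genes 0 and 1 stay together. Lowering the threshold to 0.6 also links genes 2, 3, and 4 into one group, because the pairs (2, 3) and (3, 4) are each co-clustered by two of the three methods.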

Springer - Publisher Connector

UCL Discovery

PubMed Central

Birkbeck Institutional Research Online

Spiral - Imperial College Digital Repository

Brunel University Research Archive

Clustering of discretely observed diffusion processes

Author: De Gregorio Alessandro
Iacus Stefano Maria
Publication date: 01/01/2008

In this paper, a new dissimilarity measure to identify groups of asset dynamics is proposed. The underlying generating process is assumed to be a diffusion process that solves a stochastic differential equation and is observed at discrete times. The mesh of observations is not required to shrink to zero. As the distance between two observed paths, the quadratic distance between the corresponding estimated Markov operators is considered. Analysis of both synthetic data and real financial data from NYSE/NASDAQ stocks gives evidence that this distance is capable of capturing differences in both the drift and diffusion coefficients, unlike other commonly used metrics.

arXiv.org e-Print Archive

AIR Universita degli studi di Milano