Period Estimation in Astronomical Time Series Using Slotted Correntropy

Author: Estévez Pablo A.
Huijse Pablo
Protopapas Pavlos
Príncipe José
Zegers Pablo
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/06/2011

In this letter, we propose a method for period estimation in light curves from periodic variable stars using correntropy. Light curves are astronomical time series of stellar brightness over time, and are characterized as being noisy and unevenly sampled. We propose to use slotted time lags in order to estimate correntropy directly from irregularly sampled time series. A new information theoretic metric is proposed for discriminating among the peaks of the correntropy spectral density. The slotted correntropy method outperformed slotted correlation, string length, VarTools (Lomb-Scargle periodogram and Analysis of Variance), and SigSpec applications on a set of light curves drawn from the MACHO survey
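The slotting idea in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a Gaussian kernel and half-open slots centred on multiples of a slot width, and the names `slotted_correntropy`, `slot_width`, `n_slots`, and `sigma` are hypothetical. It covers only the slotted estimate, not the information-theoretic metric for ranking spectral peaks.

```python
import numpy as np

def slotted_correntropy(t, x, slot_width, n_slots, sigma):
    """Estimate correntropy at lags k * slot_width from irregular samples.

    Assumption: every pair of samples whose time difference rounds to slot k
    contributes one Gaussian-kernel evaluation to that slot's average.
    """
    t = np.asarray(t, dtype=float)
    x = np.asarray(x, dtype=float)

    dt = t[:, None] - t[None, :]          # pairwise time differences
    dx = x[:, None] - x[None, :]          # pairwise brightness differences
    kernel = np.exp(-dx**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)

    slot = np.rint(dt / slot_width).astype(int)   # nearest slot index
    valid = (dt >= 0) & (slot < n_slots)          # non-negative lags only

    sums = np.bincount(slot[valid], weights=kernel[valid], minlength=n_slots)
    counts = np.bincount(slot[valid], minlength=n_slots)

    # Slots that received no pairs are reported as NaN rather than zero.
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(counts > 0, sums / counts, np.nan)

# Usage: an unevenly sampled sinusoid with period 2 (synthetic data).
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 50.0, 300))
x = np.sin(2.0 * np.pi * t / 2.0)
v = slotted_correntropy(t, x, slot_width=0.1, n_slots=30, sigma=0.5)
```

In this synthetic case the estimate is larger near a lag of one full period (slot 20) than near half a period (slot 10), which is the structure a period estimator would look for.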

arXiv.org e-Print Archive

Repositorio Académico de la Universidad de Chile

Online classification for time-domain astronomy

Author: Lo Kitty K.
Murphy Tara
Rebbapragada Umaa
Wagstaff Kiri
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 28/02/2014

The advent of synoptic sky surveys has spurred the development of techniques for real-time classification of astronomical sources in order to ensure timely follow-up with appropriate instruments. Previous work has focused on algorithm selection or improved light curve representations, and has naively converted light curves into structured feature sets without regard for the time span or phase of the light curves. In this paper, we highlight the violation of a fundamental machine learning assumption that occurs when archival light curves with long observational time spans are used to train classifiers that are applied to light curves with fewer observations. We propose two solutions to deal with the mismatch in the time spans of training and test light curves. The first is the use of classifier committees where each classifier is trained on light curves of different observational time spans. Only the committee member whose training set matches the test light curve time span is invoked for classification. The second solution uses hierarchical classifiers that are able to predict source types both individually and by sub-group, so that the user can trade off an earlier, more robust classification against classification granularity. We test both methods using light curves from the MACHO survey, and demonstrate their usefulness in improving performance over similar methods that naively train on all available archival data.
Comment: Astroinformatics workshop, IEEE International Conference on Data Mining 201
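The committee idea described in this abstract can be sketched roughly as below. This is an illustration under stated assumptions, not the paper's code: the committee is modelled as a mapping from training time span (in days) to a classifier, "matching" is taken to mean the nearest training span, and the names `time_span`, `select_member`, and `classify` are hypothetical.

```python
from typing import Callable, Dict, Sequence

# A classifier here is any callable taking (times, magnitudes) and returning a label.
Classifier = Callable[[Sequence[float], Sequence[float]], str]

def time_span(times: Sequence[float]) -> float:
    """Observational time span of a light curve."""
    return max(times) - min(times)

def select_member(committee: Dict[float, Classifier], times: Sequence[float]) -> float:
    """Return the training span of the committee member closest to the test span.

    Assumption: 'matching' means nearest in absolute time span.
    """
    span = time_span(times)
    return min(committee, key=lambda trained_span: abs(trained_span - span))

def classify(committee: Dict[float, Classifier],
             times: Sequence[float],
             mags: Sequence[float]) -> str:
    """Invoke only the committee member selected for this light curve."""
    return committee[select_member(committee, times)](times, mags)

# Usage with stand-in classifiers (placeholders, not trained models).
committee = {
    30.0: lambda t, m: "label from 30-day model",
    365.0: lambda t, m: "label from 365-day model",
}
label = classify(committee, [0.0, 10.0, 25.0], [15.1, 15.3, 15.0])
```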

arXiv.org e-Print Archive

Crossref

Contributions to Time Series Classification: Meta-Learning and Explainability

Author: Abanda Elustondo Amaia
Publication date: 14/01/2022

141 p.
This thesis includes three contributions of different kinds to the area of supervised time series classification, a booming field owing to the number of time series collected every day across a wide variety of domains. In this context, the number of methods available for classifying time series keeps growing, and the classifiers are becoming increasingly competitive and varied. Accordingly, the first contribution of the thesis proposes a taxonomy of distance-based time series classifiers, with an exhaustive review of the existing methods and their computational costs. Moreover, from the point of view of a non-expert user (and even of an expert), choosing a suitable classifier for a specific problem is a difficult task. The second contribution therefore addresses the recommendation of time series classifiers, for which we use an approach based on meta-learning. Finally, the third contribution proposes a method for explaining the predictions of time series classifiers, in which we compute the relevance of each region of a series to the prediction. This explanation method is based on perturbations, for which we consider specific and realistic transformations for time series.

Archivo Digital para la Docencia y la Investigación

Contributions to Time Series Classification: Meta-Learning and Explainability

Author: Abanda A.
Publication date: 16/11/2021

This thesis includes 3 contributions of different types to the area of supervised time series classification, a growing field of research due to the amount of time series collected daily in a wide variety of domains. In this context, the number of methods available for classifying time series is increasing, and the classifiers are becoming more and more competitive and varied. Thus, the first contribution of the thesis consists of proposing a taxonomy of distance-based time series classifiers, where an exhaustive review of the existing methods and their computational costs is made. Moreover, from the point of view of a non-expert user (even from that of an expert), choosing a suitable classifier for a given problem is a difficult task. The second contribution, therefore, deals with the recommendation of time series classifiers, for which we will use a meta-learning approach. Finally, the third contribution consists of proposing a method to explain the prediction of time series classifiers, in which we calculate the relevance of each region of a series in the prediction. This method of explanation is based on perturbations, for which we will consider specific and realistic transformations for the time series.
BES-2016-07689

BCAM's Institutional Repository Data

Fast, Scalable, and Accurate Algorithms for Time-Series Analysis

Author: Paparrizos Ioannis
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2018

Time is a critical element for the understanding of natural processes (e.g., earthquakes and weather) or human-made artifacts (e.g., stock market and speech signals). The analysis of time series, the result of sequentially collecting observations of such processes and artifacts, is becoming increasingly prevalent across scientific and industrial applications. The extraction of non-trivial features (e.g., patterns, correlations, and trends) in time series is a critical step for devising effective time-series mining methods for real-world problems and the subject of active research for decades. In this dissertation, we address this fundamental problem by studying and presenting computational methods for efficient unsupervised learning of robust feature representations from time series. Our objective is to (i) simplify and unify the design of scalable and accurate time-series mining algorithms; and (ii) provide a set of readily available tools for effective time-series analysis. We focus on applications operating solely over time-series collections and on applications where the analysis of time series complements the analysis of other types of data, such as text and graphs. For applications operating solely over time-series collections, we propose a generic computational framework, GRAIL, to learn low-dimensional representations that natively preserve the invariances offered by a given time-series comparison method. GRAIL represents a departure from classic approaches in the time-series literature where representation methods are agnostic to the similarity function used in subsequent learning processes. GRAIL relies on the attractive idea that once we construct the data-to-data similarity matrix most time-series mining tasks can be trivially solved. 
To overcome scalability issues associated with approaches relying on such matrices, GRAIL exploits time-series clustering to construct a small set of landmark time series and learns representations to reduce the data-to-data matrix to a data-to-landmark points matrix. To demonstrate the effectiveness of GRAIL, we first present domain-independent, highly accurate, and scalable time-series clustering methods to facilitate exploration and summarization of time-series collections. Then, we show that GRAIL representations, when combined with suitable methods, significantly outperform, in terms of efficiency and accuracy, state-of-the-art methods in major time-series mining tasks, such as querying, clustering, classification, sampling, and visualization. Overall, GRAIL rises as a new primitive for highly accurate, yet scalable, time-series analysis. For applications where the analysis of time series complements the analysis of other types of data, such as text and graphs, we propose generic, simple, and lightweight methodologies to learn features from time-varying measurements. Such applications often organize operations over different types of data in a pipeline such that one operation provides input---in the form of feature vectors---to subsequent operations. To reason about the temporal patterns and trends in the underlying features, we need to (i) track the evolution of features over different time periods; and (ii) transform these time-varying features into actionable knowledge (e.g., forecasting an outcome). To address this challenging problem, we propose principled approaches to model time-varying features and study two large-scale, real-world, applications. Specifically, we first study the problem of predicting the impact of scientific concepts through temporal analysis of characteristics extracted from the metadata and full text of scientific articles. 
Then, we explore the promise of harnessing temporal patterns in behavioral signals extracted from web search engine logs for early detection of devastating diseases. In both applications, combinations of features with time-series-relevant features had a greater impact than any other indicator considered in our analysis. We believe that our simple methodology, along with the interesting domain-specific findings that our work revealed, will motivate new studies across different scientific and industrial settings.

Columbia University Academic Commons

Kernels for Periodic Time Series Arising in Astronomy

Author: A.W. Moore
B. Schölkopf
C. Alcock
C.H. Papadimitriou
D.A. Howell
E.J. Keogh
I. Soszynski
J. Debosscher
K.W. Hodapp
L. Faccioli
L.G. Valiant
M. Geha
M. Vlachos
M.F. Balcan
P. Protopapas
P.A. Gorry
R. Luss
S. Sonnenburg
T. Adamek
T. Gärtner
T.K. Huang
Publication date: 01/01/2009

We present a method for applying machine learning algorithms to the automatic classification of astronomy star surveys using time series of star brightness. Currently such classification requires a large amount of domain expert time. We show that a combination of phase-invariant similarity and explicit features extracted from the time series provides domain-expert-level classification. To facilitate this application, we investigate the cross-correlation as a general phase-invariant similarity function for time series. We establish several theoretical properties of cross-correlation showing that it is intuitively appealing and algorithmically tractable, but not positive semidefinite, and therefore not generally applicable with kernel methods. As a solution we introduce a positive semidefinite similarity function with the same intuitive appeal as cross-correlation. An experimental evaluation in the astronomy domain as well as several other data sets demonstrates the performance of the kernel and related similarity functions.
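Cross-correlation as a phase-invariant similarity can be sketched as the maximum inner product over all circular shifts of one series against the other. The snippet below is a minimal illustration that assumes both series have already been folded and resampled to the same length; the function name `max_circular_cross_correlation` is hypothetical. It shows only the cross-correlation similarity, not the positive semidefinite kernel the abstract introduces.

```python
import numpy as np

def max_circular_cross_correlation(a, b):
    """Largest inner product between `a` and any circular shift of `b`.

    Computed with the FFT, so the cost is O(n log n) instead of O(n^2).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.ndim != 1 or a.shape != b.shape:
        raise ValueError("inputs must be 1-D arrays of equal length")

    # Circular cross-correlation: cc[s] = sum_n a[(n + s) % N] * b[n]
    cc = np.fft.ifft(np.fft.fft(a) * np.conj(np.fft.fft(b))).real
    return float(cc.max())

# Usage: a series and a phase-shifted copy of itself (synthetic values).
a = np.array([1.0, 2.0, 0.0, -1.0, 3.0, 0.5])
b = np.roll(a, 2)
similarity = max_circular_cross_correlation(a, b)
```

Because `b` is only a shifted copy of `a`, the similarity equals the inner product of `a` with itself, whereas the plain inner product of `a` and `b` would be smaller.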

CiteSeerX

Crossref