16 research outputs found

    Feature Extraction for Change-Point Detection using Stationary Subspace Analysis

    Full text link
    Detecting changes in high-dimensional time series is difficult because it involves the comparison of probability densities that need to be estimated from finite samples. In this paper, we present the first feature extraction method tailored to change-point detection, which is based on an extended version of Stationary Subspace Analysis. We reduce the dimensionality of the data to the most non-stationary directions, which are the most informative for detecting state changes in the time series. In extensive simulations on synthetic data, we show that the accuracy of three change-point detection algorithms is significantly increased by a prior feature extraction step. These findings are confirmed in an application to industrial fault monitoring. Comment: 24 pages, 20 figures, journal preprint
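    To make the two-stage idea concrete, here is a minimal, self-contained sketch: project the data onto the directions whose epoch-wise statistics deviate most from the pooled ones, then run an ordinary change-point detector on the low-dimensional series. The deviation-based heuristic, the CUSUM detector, and all names below are illustrative assumptions, not the paper's SSA-based algorithm.

```python
# Illustrative sketch only: a simplified "most non-stationary directions" projection
# followed by a basic mean-shift CUSUM detector.
import numpy as np

def nonstationary_directions(X, n_epochs=10, n_dirs=2):
    """X: (T, D) time series. Return a (D, n_dirs) basis of the directions whose
    epoch-wise mean/covariance deviate most from the pooled statistics."""
    T, D = X.shape
    epochs = np.array_split(X, n_epochs)
    mu, Sigma = X.mean(axis=0), np.cov(X, rowvar=False)
    M = np.zeros((D, D))
    for E in epochs:
        dm = E.mean(axis=0) - mu                  # mean deviation of this epoch
        dS = np.cov(E, rowvar=False) - Sigma      # covariance deviation of this epoch
        M += np.outer(dm, dm) + dS @ dS.T         # accumulate deviation "energy"
    eigvals, eigvecs = np.linalg.eigh(M / n_epochs)
    return eigvecs[:, np.argsort(eigvals)[::-1][:n_dirs]]

def cusum_change_point(x):
    """Index maximising the CUSUM statistic for a mean shift in a 1-D series."""
    x = x - x.mean()
    s = np.abs(np.cumsum(x))
    return int(np.argmax(s[1:-1])) + 1

# Usage: synthetic 20-D series with a change in 2 hidden directions at t = 500.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))
X[500:, :2] += 2.0                                # non-stationary subspace
W = nonstationary_directions(X)
proj = X @ W                                      # reduced, most informative series
print("estimated change point:", cusum_change_point(proj[:, 0]))
```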

    Higher order stationary subspace analysis

    Get PDF
    Non-stationarity in data is a ubiquitous problem in signal processing. The recent stationary subspace analysis procedure (SSA) has made it possible to decompose such data into a stationary subspace and a non-stationary part. Algorithmically, however, SSA can only tackle weak non-stationarities. The present paper takes the conceptual step of generalizing from the first and second moments used in SSA to higher-order moments, thus defining the proposed higher-order stationary subspace analysis procedure (HOSSA). The paper derives the novel procedure and shows simulations. We observe an obvious trade-off between the need to estimate higher moments and the accuracy and robustness with which they can be estimated. In an ideal setting with plenty of data, where higher-moment information dominates, our novel approach can win against standard SSA. With limited data, however, standard SSA may still arrive on par even when higher moments actually dominate the underlying data.
    Funding: BMBF, 01IB15001B, Verbundprojekt: ALICE II - Autonomes Lernen in komplexen Umgebungen 2 (Autonomous Learning in Complex Environments 2); BMBF, 01GQ1115, D-JPN Verbund: Adaptive Gehirn-Computer-Schnittstellen (BCI) in nichtstationären Umgebungen; DFG, 200318152, Theoretische Konzepte für co-adaptive Mensch-Maschine-Interaktion mit Anwendungen auf BC
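    The step from second-order to higher-order statistics can be illustrated with a toy score that measures how much the epoch-wise moments of a signal fluctuate. The variance-of-moments score and the example below are assumptions for illustration only, not the HOSSA objective itself; they also show the trade-off mentioned above, since the higher moments only pay off when each epoch is long enough to estimate them reliably.

```python
# Illustrative sketch: score the non-stationarity of a 1-D signal by how much its
# epoch-wise moments vary, optionally including third/fourth standardised moments.
import numpy as np
from scipy.stats import skew, kurtosis

def nonstationarity_score(x, n_epochs=10, order=2):
    """Variance across epochs of the first `order` moments of x (summed over moments)."""
    epochs = np.array_split(np.asarray(x), n_epochs)
    stats = []
    for e in epochs:
        m = [e.mean(), e.var()]
        if order >= 3:
            m.append(skew(e))
        if order >= 4:
            m.append(kurtosis(e))
        stats.append(m)
    return np.var(np.array(stats), axis=0).sum()

# A signal whose mean and variance are constant but whose skewness flips sign halfway
# is (up to sampling noise) invisible to a second-order score, yet clearly visible to
# a fourth-order one.
rng = np.random.default_rng(1)
a = rng.gamma(2.0, 1.0, size=5000)
x = np.concatenate([a - a.mean(), -(a - a.mean())])   # skew +/- with same mean/variance
print("order-2 score:", nonstationarity_score(x, order=2))
print("order-4 score:", nonstationarity_score(x, order=4))
```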

    Stationary Subspace Analysis: Analyse nicht-stationärer Daten

    No full text
    This dissertation is about statistical methods for understanding how the joint distribution of observed multivariate data changes over time. We consider the fully explorative, unsupervised setting in which no auxiliary information, e.g. about relevant time points or a controlled stimulus, is available. The central contribution of this work is the first unsupervised method, Stationary Subspace Analysis (SSA), which finds a linear coordinate transformation that factorises the observed data into a group of stationary and a group of non-stationary components. This is essential for analysing multivariate data because the relevant changes in the joint distribution can occur in the dependencies between variables, so that inspecting the input variables individually does not necessarily reveal changes of the joint distribution: both the non-stationary and the stationary components can remain completely invisible in the observed variables. This is in particular the case when the measured quantities are superpositions of the actual variables of interest, which cannot be measured directly.
    EEG analysis provides a good example: the electrodes on the scalp record the contributions of a multitude of neural sources inside the brain. Understanding how the distribution of these sources changes over time therefore requires separating the stationary from the non-stationary signal components. In an application to EEG data we show that SSA achieves this separation, whereas the popular coordinate transformations Principal Component Analysis (PCA) and Independent Component Analysis (ICA) do not. The second main contribution of this thesis is a novel approach to finding particular types of approximate solutions to systems of polynomial equations of arbitrary degree, based on techniques from computational algebraic geometry. Using the concept of generic polynomials, we show that SSA can be formulated as such a problem. This leads to a new algorithm whose solution is not only unique but, in special cases, also more accurate. From a theoretical point of view, the most interesting feature of this approach is that it allows the SSA problem to be solved directly and algebraically instead of searching for the solution with an optimisation procedure. Since the underlying assumptions are rather general, the algorithm may be directly applicable to other machine learning problems whose solutions can be formulated in terms of polynomial equations.
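    As a rough illustration of the factorisation problem described above, the sketch below fits Gaussian statistics per epoch and searches for an orthonormal projection whose projected epoch distributions all stay close to a fixed reference (KL divergence to N(0, I)). The QR parametrisation and the generic Nelder-Mead optimiser are illustrative assumptions; both the SSA algorithms of the thesis and the algebraic reformulation solve this problem differently.

```python
# Minimal sketch of the stationary-subspace search under a Gaussian epoch model.
import numpy as np
from scipy.optimize import minimize

def epoch_stats(X, n_epochs):
    """Per-epoch mean and covariance of a (T, D) series."""
    return [(E.mean(axis=0), np.cov(E, rowvar=False))
            for E in np.array_split(X, n_epochs)]

def projection(params, D, d):
    """Orthonormal (d, D) projection obtained from unconstrained parameters via QR."""
    Q, _ = np.linalg.qr(params.reshape(D, d))
    return Q.T

def ssa_objective(params, stats, D, d):
    """Sum over epochs of KL( N(Bm, B S B^T) || N(0, I) ) for the projected statistics."""
    B = projection(params, D, d)
    total = 0.0
    for m, S in stats:
        mp, Sp = B @ m, B @ S @ B.T
        total += 0.5 * (np.trace(Sp) + mp @ mp - d - np.log(np.linalg.det(Sp)))
    return total

# Usage: 2 stationary + 2 non-stationary sources mixed by a random rotation.
# (The stationary sources here are already zero-mean and unit-variance; real data
# would be whitened first.)
rng = np.random.default_rng(2)
T, D, d = 4000, 4, 2
S_sources = rng.normal(size=(T, d))                        # stationary part
drift = np.repeat(rng.normal(scale=2.0, size=(8, d)), T // 8, axis=0)
N_sources = rng.normal(size=(T, d)) + drift                # epoch-wise mean changes
A = np.linalg.qr(rng.normal(size=(D, D)))[0]               # mixing rotation
X = np.hstack([S_sources, N_sources]) @ A.T

stats = epoch_stats(X, n_epochs=8)
res = minimize(ssa_objective, rng.normal(size=D * d), args=(stats, D, d),
               method="Nelder-Mead", options={"maxiter": 5000, "xatol": 1e-8})
B_hat = projection(res.x, D, d)
print("max |epoch mean| in recovered stationary subspace:",
      max(np.abs(B_hat @ m).max() for m, _ in stats))
```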

    Abstract

    No full text
    When training and test samples follow different input distributions (i.e., the situation called covariate shift), the maximum likelihood estimator is known to lose its consistency. To regain consistency, the log-likelihood terms need to be weighted according to the importance (i.e., the ratio of test and training input densities). Accurately estimating the importance is therefore one of the key tasks in covariate shift adaptation. A naive approach is to first estimate the training and test input densities and then estimate the importance by the ratio of the density estimates. However, since density estimation is a hard problem, this approach tends to perform poorly, especially in high-dimensional cases. In this paper, we propose a direct importance estimation method that does not require the input density estimates. Our method is equipped with a natural model selection procedure, so that tuning parameters such as the kernel width can be objectively optimized. This is an advantage over a recently developed method of direct importance estimation. Simulations illustrate the usefulness of our approach.
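    A compact way to see what direct importance estimation without density estimates means is to fit a kernel model of the ratio w(x) = p_test(x)/p_train(x) directly and pick the kernel width on held-out data. The regularised least-squares fit below is a related but different estimator from the one proposed in the paper, used purely for illustration; all names, centre counts, and parameter values are assumptions.

```python
# Illustrative sketch: direct density-ratio estimation with a Gaussian-kernel linear
# model fitted by regularised least squares, kernel width chosen by a held-out score.
import numpy as np

def gauss_kernel(X, C, sigma):
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def fit_ratio(X_tr, X_te, sigma, lam=1e-3):
    """Fit alpha for w(x) = K(x, centres) @ alpha by regularised least squares."""
    C = X_te[:min(100, len(X_te))]                 # kernel centres on test points
    K_tr, K_te = gauss_kernel(X_tr, C, sigma), gauss_kernel(X_te, C, sigma)
    H = K_tr.T @ K_tr / len(X_tr)
    h = K_te.mean(axis=0)
    alpha = np.linalg.solve(H + lam * np.eye(len(C)), h)
    return C, np.maximum(alpha, 0.0)               # keep the ratio non-negative

def score(X_tr, X_te, C, alpha, sigma):
    """Held-out squared-error proxy for the fitted ratio: smaller is better."""
    w_tr = gauss_kernel(X_tr, C, sigma) @ alpha
    w_te = gauss_kernel(X_te, C, sigma) @ alpha
    return 0.5 * np.mean(w_tr ** 2) - np.mean(w_te)

# Usage: 1-D covariate shift; the resulting weights can reweight log-likelihood terms.
rng = np.random.default_rng(3)
X_tr = rng.normal(0.0, 1.0, size=(500, 1))
X_te = rng.normal(0.5, 0.7, size=(500, 1))
best = min(((score(X_tr[250:], X_te[250:], *fit_ratio(X_tr[:250], X_te[:250], s), s), s)
            for s in [0.3, 0.5, 1.0, 2.0]))
sigma = best[1]
C, alpha = fit_ratio(X_tr, X_te, sigma)
w = gauss_kernel(X_tr, C, sigma) @ alpha           # importance weights for training data
print("chosen kernel width:", sigma, " mean weight:", w.mean())
```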