16 research outputs found

    Feature Extraction for Change-Point Detection using Stationary Subspace Analysis

    Full text link
    Detecting changes in high-dimensional time series is difficult because it involves the comparison of probability densities that need to be estimated from finite samples. In this paper, we present the first feature extraction method tailored to change-point detection, which is based on an extended version of Stationary Subspace Analysis. We reduce the dimensionality of the data to the most non-stationary directions, which are the most informative for detecting state changes in the time series. In extensive simulations on synthetic data, we show that the accuracy of three change-point detection algorithms is significantly increased by a prior feature extraction step. These findings are confirmed in an application to industrial fault monitoring. Comment: 24 pages, 20 figures, journal preprint
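    To make the two-stage idea concrete, here is a minimal, self-contained sketch: project the data onto the directions whose epoch-wise statistics deviate most from the pooled ones, then run an ordinary change-point detector on the low-dimensional series. The deviation-based heuristic, the CUSUM detector, and all names below are illustrative assumptions, not the paper's SSA-based algorithm.

```python
# Illustrative sketch only: a simplified "most non-stationary directions" projection
# followed by a basic mean-shift CUSUM detector.
import numpy as np

def nonstationary_directions(X, n_epochs=10, n_dirs=2):
    """X: (T, D) time series. Return a (D, n_dirs) basis of the directions whose
    epoch-wise mean/covariance deviate most from the pooled statistics."""
    T, D = X.shape
    epochs = np.array_split(X, n_epochs)
    mu, Sigma = X.mean(axis=0), np.cov(X, rowvar=False)
    M = np.zeros((D, D))
    for E in epochs:
        dm = E.mean(axis=0) - mu                  # mean deviation of this epoch
        dS = np.cov(E, rowvar=False) - Sigma      # covariance deviation of this epoch
        M += np.outer(dm, dm) + dS @ dS.T         # accumulate deviation "energy"
    eigvals, eigvecs = np.linalg.eigh(M / n_epochs)
    return eigvecs[:, np.argsort(eigvals)[::-1][:n_dirs]]

def cusum_change_point(x):
    """Index maximising the CUSUM statistic for a mean shift in a 1-D series."""
    x = x - x.mean()
    s = np.abs(np.cumsum(x))
    return int(np.argmax(s[1:-1])) + 1

# Usage: synthetic 20-D series with a change in 2 hidden directions at t = 500.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))
X[500:, :2] += 2.0                                # non-stationary subspace
W = nonstationary_directions(X)
proj = X @ W                                      # reduced, most informative series
print("estimated change point:", cusum_change_point(proj[:, 0]))
```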

    Higher order stationary subspace analysis

    Get PDF
    Non-stationarity in data is a ubiquitous problem in signal processing. The recent stationary subspace analysis procedure (SSA) has made it possible to decompose such data into a stationary subspace and a non-stationary part. Algorithmically, however, SSA can only tackle weak non-stationarities. The present paper takes the conceptual step of generalizing from the first and second moments used in SSA to higher-order moments, thus defining the proposed higher-order stationary subspace analysis procedure (HOSSA). The paper derives the novel procedure and shows simulations. We observe an obvious trade-off between the need to estimate higher moments and the accuracy and robustness with which they can be estimated. In an ideal setting with plenty of data, where higher-moment information dominates, our novel approach can win against standard SSA. With limited data, however, standard SSA may still arrive on par even when higher moments actually dominate the underlying data.
    Funding: BMBF, 01IB15001B, Verbundprojekt: ALICE II - Autonomes Lernen in komplexen Umgebungen 2 (Autonomous Learning in Complex Environments 2); BMBF, 01GQ1115, D-JPN Verbund: Adaptive Gehirn-Computer-Schnittstellen (BCI) in nichtstationären Umgebungen; DFG, 200318152, Theoretische Konzepte für co-adaptive Mensch-Maschine-Interaktion mit Anwendungen auf BC
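    The step from second-order to higher-order statistics can be illustrated with a toy score that measures how much the epoch-wise moments of a signal fluctuate. The variance-of-moments score and the example below are assumptions for illustration only, not the HOSSA objective itself; they also show the trade-off mentioned above, since the higher moments only pay off when each epoch is long enough to estimate them reliably.

```python
# Illustrative sketch: score the non-stationarity of a 1-D signal by how much its
# epoch-wise moments vary, optionally including third/fourth standardised moments.
import numpy as np
from scipy.stats import skew, kurtosis

def nonstationarity_score(x, n_epochs=10, order=2):
    """Variance across epochs of the first `order` moments of x (summed over moments)."""
    epochs = np.array_split(np.asarray(x), n_epochs)
    stats = []
    for e in epochs:
        m = [e.mean(), e.var()]
        if order >= 3:
            m.append(skew(e))
        if order >= 4:
            m.append(kurtosis(e))
        stats.append(m)
    return np.var(np.array(stats), axis=0).sum()

# A signal whose mean and variance are constant but whose skewness flips sign halfway
# is (up to sampling noise) invisible to a second-order score, yet clearly visible to
# a fourth-order one.
rng = np.random.default_rng(1)
a = rng.gamma(2.0, 1.0, size=5000)
x = np.concatenate([a - a.mean(), -(a - a.mean())])   # skew +/- with same mean/variance
print("order-2 score:", nonstationarity_score(x, order=2))
print("order-4 score:", nonstationarity_score(x, order=4))
```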

    Stationary Subspace Analysis: Analyse nicht-stationärer Daten

    No full text
    This dissertation is about statistical methods for understanding how the joint distribution of observed multivariate data changes over time. We consider the fully explorative, unsupervised setting in which no auxiliary information, e.g. about relevant time points or a controlled stimulus, is available. The central contribution of this work is the first unsupervised method, Stationary Subspace Analysis (SSA), which finds a linear coordinate transformation that factorises the observed data into a group of stationary and a group of non-stationary components. This is essential for analysing multivariate data because the relevant changes in the joint distribution can occur in the dependencies between variables, so that inspecting the input variables individually does not necessarily reveal changes of the joint distribution: both the non-stationary and the stationary components can remain completely invisible in the observed variables. This is in particular the case when the measured quantities are superpositions of the actual variables of interest, which cannot be measured directly.
    EEG analysis provides a good example: the electrodes on the scalp record the contributions of a multitude of neural sources inside the brain. Understanding how the distribution of these sources changes over time therefore requires separating the stationary from the non-stationary signal components. In an application to EEG data we show that SSA achieves this separation, whereas the popular coordinate transformations Principal Component Analysis (PCA) and Independent Component Analysis (ICA) do not. The second main contribution of this thesis is a novel approach to finding particular types of approximate solutions to systems of polynomial equations of arbitrary degree, based on techniques from computational algebraic geometry. Using the concept of generic polynomials, we show that SSA can be formulated as such a problem. This leads to a new algorithm whose solution is not only unique but, in special cases, also more accurate. From a theoretical point of view, the most interesting feature of this approach is that it allows the SSA problem to be solved directly and algebraically instead of searching for the solution with an optimisation procedure. Since the underlying assumptions are rather general, the algorithm may be directly applicable to other machine learning problems whose solutions can be formulated in terms of polynomial equations.
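    As a rough illustration of the factorisation problem described above, the sketch below fits Gaussian statistics per epoch and searches for an orthonormal projection whose projected epoch distributions all stay close to a fixed reference (KL divergence to N(0, I)). The QR parametrisation and the generic Nelder-Mead optimiser are illustrative assumptions; both the SSA algorithms of the thesis and the algebraic reformulation solve this problem differently.

```python
# Minimal sketch of the stationary-subspace search under a Gaussian epoch model.
import numpy as np
from scipy.optimize import minimize

def epoch_stats(X, n_epochs):
    """Per-epoch mean and covariance of a (T, D) series."""
    return [(E.mean(axis=0), np.cov(E, rowvar=False))
            for E in np.array_split(X, n_epochs)]

def projection(params, D, d):
    """Orthonormal (d, D) projection obtained from unconstrained parameters via QR."""
    Q, _ = np.linalg.qr(params.reshape(D, d))
    return Q.T

def ssa_objective(params, stats, D, d):
    """Sum over epochs of KL( N(Bm, B S B^T) || N(0, I) ) for the projected statistics."""
    B = projection(params, D, d)
    total = 0.0
    for m, S in stats:
        mp, Sp = B @ m, B @ S @ B.T
        total += 0.5 * (np.trace(Sp) + mp @ mp - d - np.log(np.linalg.det(Sp)))
    return total

# Usage: 2 stationary + 2 non-stationary sources mixed by a random rotation.
# (The stationary sources here are already zero-mean and unit-variance; real data
# would be whitened first.)
rng = np.random.default_rng(2)
T, D, d = 4000, 4, 2
S_sources = rng.normal(size=(T, d))                        # stationary part
drift = np.repeat(rng.normal(scale=2.0, size=(8, d)), T // 8, axis=0)
N_sources = rng.normal(size=(T, d)) + drift                # epoch-wise mean changes
A = np.linalg.qr(rng.normal(size=(D, D)))[0]               # mixing rotation
X = np.hstack([S_sources, N_sources]) @ A.T

stats = epoch_stats(X, n_epochs=8)
res = minimize(ssa_objective, rng.normal(size=D * d), args=(stats, D, d),
               method="Nelder-Mead", options={"maxiter": 5000, "xatol": 1e-8})
B_hat = projection(res.x, D, d)
print("max |epoch mean| in recovered stationary subspace:",
      max(np.abs(B_hat @ m).max() for m, _ in stats))
```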

    Abstract

    No full text
    When training and test samples follow different input distributions (i.e., the situation called covariate shift), the maximum likelihood estimator is known to lose its consistency. To regain consistency, the log-likelihood terms need to be weighted according to the importance (i.e., the ratio of test and training input densities). Accurately estimating the importance is therefore one of the key tasks in covariate shift adaptation. A naive approach is to first estimate the training and test input densities and then estimate the importance by the ratio of the density estimates. However, since density estimation is a hard problem, this approach tends to perform poorly, especially in high-dimensional cases. In this paper, we propose a direct importance estimation method that does not require the input density estimates. Our method is equipped with a natural model selection procedure, so that tuning parameters such as the kernel width can be objectively optimized. This is an advantage over a recently developed method of direct importance estimation. Simulations illustrate the usefulness of our approach.
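    A compact way to see what direct importance estimation without density estimates means is to fit a kernel model of the ratio w(x) = p_test(x)/p_train(x) directly and pick the kernel width on held-out data. The regularised least-squares fit below is a related but different estimator from the one proposed in the paper, used purely for illustration; all names, centre counts, and parameter values are assumptions.

```python
# Illustrative sketch: direct density-ratio estimation with a Gaussian-kernel linear
# model fitted by regularised least squares, kernel width chosen by a held-out score.
import numpy as np

def gauss_kernel(X, C, sigma):
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def fit_ratio(X_tr, X_te, sigma, lam=1e-3):
    """Fit alpha for w(x) = K(x, centres) @ alpha by regularised least squares."""
    C = X_te[:min(100, len(X_te))]                 # kernel centres on test points
    K_tr, K_te = gauss_kernel(X_tr, C, sigma), gauss_kernel(X_te, C, sigma)
    H = K_tr.T @ K_tr / len(X_tr)
    h = K_te.mean(axis=0)
    alpha = np.linalg.solve(H + lam * np.eye(len(C)), h)
    return C, np.maximum(alpha, 0.0)               # keep the ratio non-negative

def score(X_tr, X_te, C, alpha, sigma):
    """Held-out squared-error proxy for the fitted ratio: smaller is better."""
    w_tr = gauss_kernel(X_tr, C, sigma) @ alpha
    w_te = gauss_kernel(X_te, C, sigma) @ alpha
    return 0.5 * np.mean(w_tr ** 2) - np.mean(w_te)

# Usage: 1-D covariate shift; the resulting weights can reweight log-likelihood terms.
rng = np.random.default_rng(3)
X_tr = rng.normal(0.0, 1.0, size=(500, 1))
X_te = rng.normal(0.5, 0.7, size=(500, 1))
best = min(((score(X_tr[250:], X_te[250:], *fit_ratio(X_tr[:250], X_te[:250], s), s), s)
            for s in [0.3, 0.5, 1.0, 2.0]))
sigma = best[1]
C, alpha = fit_ratio(X_tr, X_te, sigma)
w = gauss_kernel(X_tr, C, sigma) @ alpha           # importance weights for training data
print("chosen kernel width:", sigma, " mean weight:", w.mean())
```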