96 research outputs found

    Independent EEG Sources Are Dipolar

    Independent component analysis (ICA) and blind source separation (BSS) methods are increasingly used to separate individual brain and non-brain source signals mixed by volume conduction in electroencephalographic (EEG) and other electrophysiological recordings. We compared results of decomposing thirteen 71-channel human scalp EEG datasets by 22 ICA and BSS algorithms, assessing the pairwise mutual information (PMI) in scalp channel pairs, the remaining PMI in component pairs, the overall mutual information reduction (MIR) effected by each decomposition, and decomposition 'dipolarity', defined as the number of component scalp maps matching the projection of a single equivalent dipole with less than a given residual variance. The least well-performing algorithm was principal component analysis (PCA); the best performing were AMICA and other likelihood/mutual information based ICA methods. Though these and other commonly used decomposition methods returned many similar components, across 18 ICA/BSS algorithms mean dipolarity varied linearly with both MIR and the PMI remaining between the resulting component time courses. This result is compatible with an interpretation of many maximally independent EEG components as volume-conducted projections of partially synchronous local cortical field activity within single compact cortical domains. To encourage further method comparisons, the data and software used to prepare the results have been made available (http://sccn.ucsd.edu/wiki/BSSComparison).
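
    As a rough illustration of the pairwise mutual information (PMI) idea in the abstract above, the following sketch estimates PMI with a simple 2-D histogram and compares mixed "channels" with the unmixed sources. The histogram estimator, the Laplace-distributed sources, the mixing matrix and the function names are all assumptions made for this example; they are not the estimator, data or code used in the paper.

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Histogram estimate of the mutual information (in nats) of two 1-D signals."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()                 # joint probability table
    px = pxy.sum(axis=1, keepdims=True)       # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)       # marginal of y, shape (1, bins)
    nz = pxy > 0                              # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def mean_pairwise_mi(signals, bins=16):
    """Mean MI over all distinct row pairs of a (signals x samples) array."""
    n = signals.shape[0]
    values = [mutual_information(signals[i], signals[j], bins)
              for i in range(n) for j in range(i + 1, n)]
    return float(np.mean(values))

rng = np.random.default_rng(0)
sources = rng.laplace(size=(3, 20000))        # hypothetical independent sources
mixing = np.array([[1.0, 0.5, 0.2],           # hypothetical linear mixing,
                   [0.4, 1.0, 0.3],           # standing in for volume conduction
                   [0.3, 0.6, 1.0]])
channels = mixing @ sources

pmi_channels = mean_pairwise_mi(channels)     # PMI between mixed channels
pmi_sources = mean_pairwise_mi(sources)       # PMI remaining after ideal unmixing
print(f"channel PMI: {pmi_channels:.3f}, source PMI: {pmi_sources:.3f}")
```

    In this toy setting the channel pairs carry clearly more mutual information than the source pairs, which is the kind of reduction a good decomposition is expected to achieve.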

    Point Spread Function Estimation and Uncertainty Quantification

    An important component of analyzing images quantitatively is modeling image blur due to effects from the image capture system. When the effect of image blur is assumed to be translation invariant and isotropic, it can generally be modeled as convolution with a radially symmetric kernel, called the point spread function (PSF). Standard techniques for estimating the PSF involve imaging a bright point source, but this is not always feasible (e.g. in high-energy radiography). This work provides a novel non-parametric approach to estimating the PSF from a calibration image of a vertical edge. Moreover, the approach sits within a hierarchical Bayesian framework that, in addition to providing a method for estimation, also quantifies the uncertainty in the estimate by Markov chain Monte Carlo (MCMC) methods. In the development, we employ a recently developed enhancement to Gibbs sampling, referred to as partial collapse. The improved algorithm has been independently derived in several other works; however, it has been shown that partial collapse may be improperly implemented, resulting in a sampling algorithm that no longer converges to the desired posterior. The algorithm we present is proven to satisfy invariance with respect to the target density. This work and its implementation on radiographic data from the U.S. Department of Energy's Cygnus high-energy X-ray diagnostic system have culminated in a paper titled "Partially Collapsed Gibbs Samplers for Linear Inverse Problems and Applications to X-ray Imaging". The other component of this work is mainly theoretical and develops the requisite functional analysis to make the integration-based model derived in the first chapter rigorous.
The source literature is from functional analysis related to distribution theory for linear partial differential equations, and the work briefly addresses infinite-dimensional probability theory for Hilbert space-valued stochastic processes, a burgeoning and very active research area for the analysis of inverse problems. To our knowledge, this provides a new development of a notion of radial symmetry for L2-based distributions. This work results in defining an L2 complete space of radially symmetric distributions, which is an important step toward rigorously placing the PSF estimation problem in the infinite-dimensional framework and is part of ongoing work toward that end.
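
    The forward model described in this abstract, blur as convolution with a radially symmetric kernel applied to a vertical-edge calibration image, can be sketched as follows. This shows only the forward blur, not the non-parametric Bayesian estimation from the thesis; the Gaussian kernel shape, the image size and the function names are assumptions chosen for the example.

```python
import numpy as np

def radial_gaussian_psf(size, sigma):
    """Radially symmetric kernel on an odd size x size grid, normalized to sum to 1."""
    half = size // 2
    yy, xx = np.mgrid[-half:half + 1, -half:half + 1]
    psf = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))   # depends on radius only
    return psf / psf.sum()

def blur(image, psf):
    """Circular 2-D convolution of image with psf, computed with the FFT."""
    k = psf.shape[0]
    padded = np.zeros_like(image, dtype=float)
    padded[:k, :k] = psf
    # Move the kernel center to index (0, 0) so the output is not translated.
    padded = np.roll(padded, (-(k // 2), -(k // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)))

# Synthetic calibration image: dark on the left, bright on the right.
edge = np.zeros((64, 64))
edge[:, 32:] = 1.0

psf = radial_gaussian_psf(size=15, sigma=2.0)   # hypothetical PSF
blurred = blur(edge, psf)

# Horizontal profile across the edge, taken away from the wrap-around
# boundary that circular convolution introduces at the image border.
profile = blurred[32, 16:48]
```

    The profile rises smoothly from 0 to 1 across the edge. An estimation method of the kind the abstract describes would work in the opposite direction, inferring the kernel from such a profile.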

    EEGLAB, SIFT, NFT, BCILAB, and ERICA: New Tools for Advanced EEG Processing

    We describe a set of complementary EEG data collection and processing tools recently developed at the Swartz Center for Computational Neuroscience (SCCN) that connect to and extend the EEGLAB software environment, a freely available and readily extensible processing environment running under Matlab. The new tools include (1) a new and flexible EEGLAB STUDY design facility for framing and performing statistical analyses on data from multiple subjects; (2) a neuroelectromagnetic forward head modeling toolbox (NFT) for building realistic electrical head models from available data; (3) a source information flow toolbox (SIFT) for modeling ongoing or event-related effective connectivity between cortical areas; (4) a BCILAB toolbox for building online brain-computer interface (BCI) models from available data; and (5) an experimental real-time interactive control and analysis (ERICA) environment for real-time production and coordination of interactive, multimodal experiments.

    Methodologies in factor modeling


    Stochastic Particle Flow for Nonlinear High-Dimensional Filtering Problems

    Particle flow filters, a series of novel filters for probabilistic inference that propose an alternative way of performing Bayesian updates, have been attracting recent interest. These filters provide approximate solutions to nonlinear filtering problems. They do so by defining a continuum of densities between the prior probability density and the posterior, i.e. the filtering density. Building on these methods' successes, we propose a novel filter. The new filter aims to address the shortcomings of sequential Monte Carlo methods when applied to important nonlinear high-dimensional filtering problems. The novel filter uses equally weighted samples, each of which is associated with a local solution of the Fokker-Planck equation. This hybrid of Monte Carlo and local parametric approximation gives rise to a global approximation of the filtering density of interest. We show that, when compared with state-of-the-art methods, the Gaussian-mixture implementation of the new filtering technique, which we call Stochastic Particle Flow, has utility in the context of benchmark nonlinear high-dimensional filtering problems. In addition, we extend the original particle flow filters for tackling multi-target multi-sensor tracking problems to enable a comparison with the new filter.

    Generalized and efficient outlier detection for spatial, temporal, and high-dimensional data mining

    Knowledge Discovery in Databases (KDD) is the process of extracting non-trivial patterns from large databases, with the goal that these patterns be novel, potentially useful, statistically valid and understandable. The process involves multiple phases including selection, preprocessing, evaluation and the analysis step known as data mining. One of the key tasks of data mining is outlier detection, that is, the identification of observations that are unusual and seemingly inconsistent with the majority of the data set. Such rare observations can have various causes: measurement errors, unusually extreme (but valid) measurements, data corruption or even manipulated data. In recent years, numerous outlier detection algorithms have been proposed that often appear to differ only slightly from earlier ones but are reported to "clearly outperform" the others in experiments.
A key focus of this thesis is to unify and modularize the various approaches into a common formalism. This makes the analysis of the actual differences easier and, at the same time, increases the flexibility of the approaches by allowing modules to be added or replaced in order to adapt the methods to different requirements and data types. To show the benefits of the modularized structure, (i) several existing algorithms are formalized within the new framework; (ii) new modules are added that improve the robustness, efficiency, statistical validity and usability of the scoring functions and that can be combined with existing methods; (iii) modules are modified to allow existing and new algorithms to run on other, often more complex data types, including spatial, temporal and high-dimensional data spaces; (iv) multiple algorithm instances are combined into an ensemble method to obtain better results; and (v) the scalability to large data sets is improved using approximate as well as exact indexing. The starting point is the Local Outlier Factor (LOF) algorithm, which is extended with slight modifications to increase the robustness and usability of the produced scores. To obtain the same benefits for other algorithms, these methods are abstracted into a general framework for local outlier detection. By abstracting from a single vector space to general data types, spatial and temporal relationships can also be analyzed. The use of subspace and correlation neighborhoods allows the algorithms to detect new kinds of outliers in arbitrarily oriented subspaces. Improvements in the score normalization restore a statistical intuition of probabilities to outlier scores that previously were only useful for ranking objects, while improved models also offer explanations of why an object was considered an outlier.
Subsequently, improved versions of several modules in the framework are presented. Among other things, these allow the same algorithms to run on significantly larger data sets, in approximately linear rather than quadratic time, by accepting approximate neighborhoods at little loss in precision and effectiveness. Additionally, multiple algorithms with different intuitions can be run at the same time and their results combined into an ensemble method that is able to detect outliers of different types. Finally, new outlier detection methods are constructed for real data sets, customized to the specific problems they pose. The new methods yield insightful results that could not be obtained with the existing methods. Because they are constructed from the same building blocks, there is a strong and explicit connection to the previous approaches, and by using the indexing strategies introduced earlier, the algorithms can be executed efficiently even on large data sets.
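
    For orientation, the Local Outlier Factor that this abstract names as its starting point can be sketched in a few lines. This is the plain textbook formulation with brute-force neighbor search, which exhibits the quadratic cost mentioned above; it is not the modular framework or any of the extensions from the thesis. The function name, the choice k=5 and the synthetic data are assumptions for the example, and ties and duplicate points are not handled.

```python
import numpy as np

def local_outlier_factor(X, k=5):
    """Plain LOF scores with brute-force O(n^2) distances (no tie handling)."""
    n = X.shape[0]
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)              # a point is not its own neighbor
    nn = np.argsort(d, axis=1)[:, :k]        # indices of the k nearest neighbors
    k_dist = d[np.arange(n), nn[:, -1]]      # distance to the k-th neighbor
    # Reachability distance of p with respect to neighbor o:
    # max(k_dist(o), d(p, o))
    reach = np.maximum(k_dist[nn], d[np.arange(n)[:, None], nn])
    lrd = 1.0 / reach.mean(axis=1)           # local reachability density
    return lrd[nn].mean(axis=1) / lrd        # mean neighbor density / own density

rng = np.random.default_rng(0)
cluster = rng.normal(size=(100, 2))          # hypothetical inliers
outlier = np.array([[8.0, 8.0]])             # hypothetical isolated point
X = np.vstack([cluster, outlier])

scores = local_outlier_factor(X, k=5)
print("most outlying index:", int(np.argmax(scores)))
```

    Scores near 1 indicate a point whose local density matches that of its neighbors, while the isolated point receives a much larger score. The raw score has no fixed upper bound, which is the limitation the score normalization work described above addresses.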