
    Reservoir computing and data visualisation

    We consider the problem of visualising high-dimensional multivariate time series. A data analyst, in creating a two-dimensional projection of such a time series, might hope to gain some intuition into the structure of the original high-dimensional data set. We review a method for visualising time series data using an extension of Echo State Networks (ESNs). The method uses the multidimensional scaling criterion to create a visualisation of the time series after its representation in the reservoir of the ESN. We illustrate the method with two-dimensional maps of a financial time series. The method is then compared with a mapping which uses a fixed latent space and a novel objective function.
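
    A minimal sketch of this pipeline, assuming a fixed random reservoir and scikit-learn's MDS; the reservoir size, input scaling, spectral radius, and the synthetic input series are illustrative placeholders, not the authors' settings:

        import numpy as np
        from sklearn.manifold import MDS

        def reservoir_states(u, n_res=100, spectral_radius=0.9, seed=0):
            # Drive a fixed random reservoir (the ESN, without readout
            # training) with the one-dimensional input series u.
            rng = np.random.default_rng(seed)
            W_in = rng.uniform(-0.5, 0.5, size=n_res)
            W = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
            W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
            x, states = np.zeros(n_res), []
            for u_t in u:
                x = np.tanh(W_in * u_t + W @ x)
                states.append(x.copy())
            return np.array(states)

        u = np.sin(np.linspace(0, 20, 500))        # placeholder time series
        X = reservoir_states(u)
        # Project the reservoir trajectory to 2-D under the MDS criterion.
        Y = MDS(n_components=2, random_state=0).fit_transform(X)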

    Bearing fault diagnosis by EXIN CCA

    EXIN CCA is an extension of the Curvilinear Component Analysis (CCA) which addresses the non-invariance of the CCA projection and allows the representation of data drawn under different operating conditions. It can be applied to data visualization, interpretation (as a kind of sensor of the underlying physical phenomenon), and classification in real-time industrial applications. Here an example is given for bearing fault diagnostics in an electromechanical device.
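
    A hedged sketch of the plain CCA stress minimisation that EXIN CCA extends (the classic algorithm, not the EXIN variant itself): output points are moved so that small pairwise output distances match the input distances, with a neighbourhood radius that shrinks over time. The learning rate and schedule are illustrative:

        import numpy as np

        def cca_project(X, dim=2, lam0=1.0, lr=0.05, epochs=200, seed=0):
            # Minimise E = sum_{i<j} (dX_ij - dY_ij)^2 F(dY_ij) with the
            # step weight F(d) = 1 if d < lam else 0 (lam shrinks over time).
            rng = np.random.default_rng(seed)
            Y = rng.normal(scale=0.1, size=(len(X), dim))
            dX = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
            for e in range(epochs):
                lam = lam0 * 0.01 ** (e / epochs)   # shrinking neighbourhood
                diff = Y[:, None] - Y[None, :]
                dY = np.linalg.norm(diff, axis=-1)
                np.fill_diagonal(dY, 1.0)           # avoid division by zero
                w = (dY < lam) * (dX - dY) / dY     # per-pair pull/push factor
                np.fill_diagonal(w, 0.0)
                Y += lr * (w[:, :, None] * diff).sum(axis=1) / len(X)
            return Y

        Y = cca_project(np.random.default_rng(1).normal(size=(100, 5)))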

    Improved clustering approach for junction detection of multiple edges with modified Freeman chain code

    Image processing frameworks for two-dimensional line drawings involve three phases: detecting the junctions and corners that exist in the drawing, representing the lines, and extracting features to be used in recognising the line drawing based on the representation scheme used. As an alternative to existing frameworks, this thesis proposes a framework that consists of an improved clustering approach for junction detection of multiple edges, a modified Freeman chain code scheme, new features and their extraction, and a recognition algorithm. The first phase concerns the problem of clustering line drawings for junction detection of multiple edges. Major problems in cluster analysis, such as the time taken and, in particular, the number of accurate clusters contained in the line drawing when performing junction detection, are crucial to address. Two clustering approaches are compared with the results obtained from the proposed algorithm: self-organising map (SOM) and affinity propagation (AP). These approaches are chosen because, like the proposed algorithm, they are unsupervised and do not require an initial cluster count to execute. In the second phase, a new chain code scheme is proposed for representing the direction of lines; it consists of a series of directional codes and corner labels found in the drawing. In the third phase, the feature extraction algorithm, three features are proposed: length of lines, angle of corners, and number of branches at each corner. These features are then used in the proposed recognition algorithm to match the line drawing, involving only the mean and variance in the calculation. Comparison with the SOM and AP clustering approaches shows up to a 31% reduction in cluster count and up to a 57-fold speed-up. The results of the corner detection algorithm show that it is capable of detecting the junctions and corners of a given thinned binary image, producing a new thinned binary image containing markers at their locations.
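
    For context, a minimal sketch of the classic 8-directional Freeman chain code that the proposed scheme modifies; the pixel path and the helper name are illustrative:

        # Classic 8-connected Freeman chain code: the direction from one
        # pixel to the next is encoded as 0..7, counter-clockwise from east.
        DIRECTIONS = {(1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
                      (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7}

        def freeman_chain_code(path):
            # path: successive (x, y) pixel coordinates along a thinned line.
            return [DIRECTIONS[(x1 - x0, y1 - y0)]
                    for (x0, y0), (x1, y1) in zip(path, path[1:])]

        # A short L-shaped stroke: east, east, then north.
        print(freeman_chain_code([(0, 0), (1, 0), (2, 0), (2, 1)]))  # [0, 0, 2]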

    The h-EXIN CCA for Bearing Fault Diagnosis

    This paper presents the hierarchical EXIN CCA, a novel and reliable approach to complex pattern recognition problems. The methodology is based on EXIN CCA, an extension of the Curvilinear Component Analysis, for data reduction, and on neural networks for data classification. The effectiveness of this condition monitoring scheme is verified in a demanding bearing fault diagnosis scenario.

    Towards music perception by redundancy reduction and unsupervised learning in probabilistic models

    The study of music perception lies at the intersection of several disciplines: perceptual psychology and cognitive science, musicology, psychoacoustics, and acoustical signal processing, amongst others. Developments in perceptual theory over the last fifty years have emphasised an approach based on Shannon's information theory and its basis in probabilistic systems, and in particular the idea that perceptual systems in animals develop through a process of unsupervised learning in response to natural sensory stimulation, whereby the emerging computational structures are well adapted to the statistical structure of natural scenes. In turn, these ideas are being applied to problems in music perception. This thesis is an investigation of the principle of redundancy reduction through unsupervised learning, as applied to representations of sound and music. In the first part, previous work is reviewed, drawing on literature from some of the fields mentioned above, and an argument is presented in support of the idea that perception in general, and music perception in particular, can indeed be accommodated within a framework of unsupervised learning in probabilistic models. In the second part, two related methods are applied to two different low-level representations. Firstly, linear redundancy reduction (Independent Component Analysis) is applied to acoustic waveforms of speech and music. Secondly, the related method of sparse coding is applied to a spectral representation of polyphonic music, which proves to be enough both to recognise that the individual notes are the important structural elements and to recover a rough transcription of the music. Finally, the concepts of distance and similarity are considered, drawing in ideas about noise, phase invariance, and topological maps. Some ecologically and information-theoretically motivated distance measures are suggested and put into practice in a novel method, using multidimensional scaling (MDS), for visualising geometrically the dependency structure in a distributed representation.
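
    A hedged sketch in the spirit of the first experiment, applying scikit-learn's FastICA to short frames of a waveform; the synthetic signal, frame length, and component count are illustrative, not the thesis setup:

        import numpy as np
        from sklearn.decomposition import FastICA

        # Cut a waveform into short frames and learn a linear basis in
        # which the frame coefficients are as independent as possible.
        rng = np.random.default_rng(0)
        t = np.linspace(0, 4, 32000)
        wave = np.sin(2 * np.pi * 220 * t) + 0.3 * rng.standard_normal(t.size)

        frame_len, hop = 64, 32
        frames = np.stack([wave[i:i + frame_len]
                           for i in range(0, wave.size - frame_len, hop)])

        ica = FastICA(n_components=16, random_state=0)
        coeffs = ica.fit_transform(frames)  # per-frame component activations
        basis = ica.mixing_                 # learned waveform basis functions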

    Longitudinal clustering analysis and prediction of Parkinson's disease progression using radiomics and hybrid machine learning

    Background: We employed machine learning approaches to (I) determine distinct progression trajectories in Parkinson's disease (PD) (unsupervised clustering task), and (II) predict progression trajectories (supervised prediction task) from early (years 0 and 1) data, making use of clinical and imaging features. Methods: We studied PD subjects derived from longitudinal datasets (years 0, 1, 2 & 4; Parkinson's Progression Markers Initiative). We extracted and analyzed 981 features, including motor, non-motor, and radiomics features extracted for each region of interest (ROIs: left/right caudate and putamen) using our standardized environment for radiomics analysis (SERA) radiomics software. Segmentation of ROIs on dopamine transporter single-photon emission computed tomography (DAT SPECT) images was performed via magnetic resonance images (MRI). After performing cross-sectional clustering on 885 subjects (original dataset) to identify disease subtypes, we identified optimal longitudinal trajectories using hybrid machine learning systems (HMLS), including principal component analysis (PCA) + K-means algorithm (KMA) followed by the Bayesian information criterion (BIC), the Calinski-Harabasz criterion (CHC), and the elbow criterion (EC). Subsequently, prediction of the identified trajectories from early-year data was performed using multiple HMLSs, including 16 dimension reduction algorithms (DRA) and 10 classification algorithms. Results: We identified 3 distinct progression trajectories. Hotelling's T-squared test (HTST) showed that the identified trajectories were distinct. The trajectories included those with (I, II) disease escalation (2 trajectories, 27% and 38% of patients) and (III) stable disease (1 trajectory, 35% of patients). For trajectory prediction from early-year data, HMLSs including the stochastic neighbor embedding algorithm (SNEA, as a DRA) as well as the locally linear embedding algorithm (LLEA, as a DRA), linked with the new probabilistic neural network classifier (NPNNC, as a classifier), resulted in accuracies of 78.4% and 79.2%, respectively, while other HMLSs such as SNEA + Lib_SVM (library for support vector machines) and t_SNE (t-distributed stochastic neighbor embedding) + NPNNC resulted in 76.5% and 76.1%, respectively. Conclusions: This study moves beyond cross-sectional PD subtyping to clustering of longitudinal disease trajectories. We conclude that combining medical information with SPECT-based radiomics features, and optimal utilization of HMLSs, can identify distinct disease trajectories in PD patients and enable effective prediction of disease trajectories from early-year data.
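
    A minimal sketch of the PCA + K-means + cluster-count-selection step from the Methods, using scikit-learn; the feature matrix, component count, and candidate range of k are placeholders:

        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.cluster import KMeans
        from sklearn.metrics import calinski_harabasz_score

        rng = np.random.default_rng(0)
        X = rng.standard_normal((885, 981))        # placeholder feature matrix

        Z = PCA(n_components=10).fit_transform(X)  # dimension reduction first
        scores = {}
        for k in range(2, 8):
            labels = KMeans(n_clusters=k, n_init=10,
                            random_state=0).fit_predict(Z)
            scores[k] = calinski_harabasz_score(Z, labels)  # CHC criterion

        best_k = max(scores, key=scores.get)       # k with the highest score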

    Divergences for prototype-based classification and causal structure discovery: Theory and application to natural datasets

    This thesis consists of two parts. In the first part we describe how the prototype-based classifier LVQ can be extended using measures from information theory. In addition, we compare several ways of representing data in this LVQ configuration, in this case histograms of photographs and SIFT and SURF features. We show how a single combined distance measure can be formulated for this purpose by taking the individual distance measures together. In the second part we investigate the discovery of causal relations and applications to real-life problems. We also explore the combination with relevance learning in LVQ and present some applications.
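
    A hedged sketch of the combined-distance idea in an LVQ1-style update: one squared-Euclidean distance per representation (e.g. colour, SIFT, and SURF histograms) is merged into a single weighted distance. The weights, dimensions, and update rule are illustrative assumptions, not the thesis configuration:

        import numpy as np

        def combined_distance(x_parts, w_parts, alphas):
            # One feature vector per representation for the sample (x_parts)
            # and the prototype (w_parts); alphas weight the representations.
            return sum(a * np.sum((x - w) ** 2)
                       for a, x, w in zip(alphas, x_parts, w_parts))

        def lvq1_step(x_parts, y, prototypes, labels, alphas, lr=0.01):
            # Move the nearest prototype toward x if the labels match,
            # and away from x otherwise (classic LVQ1).
            dists = [combined_distance(x_parts, w, alphas) for w in prototypes]
            j = int(np.argmin(dists))
            sign = 1.0 if labels[j] == y else -1.0
            prototypes[j] = [w + sign * lr * (x - w)
                             for x, w in zip(x_parts, prototypes[j])]

        rng = np.random.default_rng(0)
        x_parts = [rng.random(8), rng.random(16)]   # two histogram types
        prototypes = [[rng.random(8), rng.random(16)] for _ in range(4)]
        lvq1_step(x_parts, y=1, prototypes=prototypes,
                  labels=[0, 0, 1, 1], alphas=[0.5, 0.5])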

    Segmentation and detection of objects in images and video using computational intelligence

    This thesis deals with the processing and analysis of images and video by computer systems. First, an introduction is given, specifying the context, objectives, and methodology. Next, the background is presented: the fundamentals of video surveillance, the existing difficulties, and various state-of-the-art algorithms, followed by the main characteristics of deep learning, intelligent transportation, and PTZ camera systems, finishing with the evaluation of methods and various datasets. Three parts follow. The first discusses the studies on segmentation. Here, different models are explained whose objective is the detection of objects, whether using generic or specific hardware or in specific domains, together with a study of how reducing the size of the images affects the performance of the algorithms. The second part describes the works that use a PTZ camera. The first work tracks the most anomalous object in the scene, with the system itself deciding which objects are anomalous and which are not; the second presents a system that tells the camera which movements to make based on the output of a non-panoramic background model, improved with a growing neural gas. The third part deals with the studies related to intelligent transportation, such as the classification of the vehicles that appear in traffic sequences. The first work applies traditional techniques such as segmentation and feature extraction; the second uses segmentation and convolutional networks, complemented with a study of image resizing to supply each network with the format it requires; and the third employs a model that detects and classifies objects, subsequently estimating the pollution generated by the vehicles. Finally, the conclusions obtained from this thesis and some possible future lines of research are presented. Thesis defense date: 17 December 2018.
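
    A minimal sketch of the kind of background-subtraction segmentation evaluated here, using OpenCV's MOG2 subtractor; the video path, thresholds, and minimum blob area are placeholders, and MOG2 stands in for, rather than reproduces, the models developed in the thesis:

        import cv2

        cap = cv2.VideoCapture("traffic.mp4")      # placeholder video path
        subtractor = cv2.createBackgroundSubtractorMOG2(history=500,
                                                        detectShadows=True)
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            mask = subtractor.apply(frame)         # foreground/shadow mask
            # Keep confident foreground only (MOG2 marks shadows as 127).
            _, fg = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)
            contours, _ = cv2.findContours(fg, cv2.RETR_EXTERNAL,
                                           cv2.CHAIN_APPROX_SIMPLE)
            # Candidate moving objects (e.g. vehicles) in this frame.
            blobs = [c for c in contours if cv2.contourArea(c) > 250]
        cap.release()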

    Differential Privacy, Property Testing, and Perturbations

    Controlling the dissemination of information about ourselves has become a minefield in the modern age. We release data about ourselves every day and don't always fully understand what information is contained in this data. It is often the case that seemingly innocuous pieces of data can be combined to reveal more sensitive information about ourselves than we intended. Differential privacy has developed as a technique to prevent this type of privacy leakage. It borrows ideas from information theory to inject enough uncertainty into the data that sensitive information is provably absent from the privatised data. Current research in differential privacy walks the fine line between removing sensitive information and allowing non-sensitive information to be released. At its heart, this thesis is about the study of information. Many of the results can be formulated as asking a subset of the questions: does the data you have contain enough information to learn what you would like to learn? And how can I affect the data to ensure you can't discern sensitive information? We will often approach the former question from both directions: information-theoretic lower bounds on recovery and algorithmic upper bounds. We begin with an information-theoretic lower bound for graphon estimation. This explores the fundamental limits of how much information about the underlying population is contained in a finite sample of data. We then move on to exploring the connection between information-theoretic results and privacy in the context of linear inverse problems. We find that there is a discrepancy between how the inverse problems community and the privacy community view good recovery of information. Next, we explore black-box testing for privacy. We argue that the amount of information required to verify the privacy guarantee of an algorithm, without access to the internals of the algorithm, is lower bounded by the amount of information required to break the privacy guarantee. Finally, we explore a setting where imposing privacy is a help rather than a hindrance: online linear optimisation. We argue that private algorithms have the right kind of stability guarantee to ensure low regret for online linear optimisation.
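
    For a concrete feel of "injecting enough uncertainty", a textbook Laplace-mechanism sketch (standard differential privacy, not a construction from this thesis): a count query has sensitivity 1, so adding Laplace noise of scale 1/epsilon makes the release epsilon-differentially private.

        import numpy as np

        def private_count(values, predicate, epsilon,
                          rng=np.random.default_rng()):
            # One person joining or leaving changes the count by at most 1,
            # so noise of scale sensitivity/epsilon = 1/epsilon suffices.
            true_count = sum(1 for v in values if predicate(v))
            return true_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)

        ages = [34, 29, 51, 47, 62, 38]
        # Release "how many are over 40?" with epsilon = 0.5.
        print(private_count(ages, lambda a: a > 40, epsilon=0.5))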