    Temporal - spatial recognizer for multi-label data

    Pattern recognition is an important artificial intelligence task with practical applications in many fields, such as medicine and species distribution. Such applications involve overlapping data points, as seen in multi-label datasets. Hence, there is a need for a recognition algorithm that can separate the overlapping data points in order to recognize the correct pattern. Existing recognition methods are sensitive to noise and to overlapping points because they cannot recognize a pattern when the positions of the data points shift. Furthermore, these methods do not incorporate temporal information in the recognition process, which leads to low-quality data clustering. In this study, an improved pattern recognition method based on Hierarchical Temporal Memory (HTM) is proposed to resolve the overlap between data points in multi-label datasets. The imHTM (Improved HTM) method improves two HTM components: feature extraction and data clustering. The first improvement is the TS-Layer Neocognitron algorithm, which solves the shift-in-position problem in the feature extraction phase. The data clustering step has two improvements, TFCM and cFCM (TFCM with the limit-Chebyshev distance metric), which allow overlapping data points that occur in patterns to be separated correctly into the relevant clusters by temporal clustering. Experiments on five datasets were conducted to compare the proposed method (imHTM) against statistical, template and structural pattern recognition methods. The results showed that imHTM achieved a recognition accuracy of 99% in comparison with template matching methods (Feature-Based Approach, Area-Based Approach), statistical methods (Principal Component Analysis, Linear Discriminant Analysis, Support Vector Machines and Neural Network) and a structural method (original HTM). The findings indicate that the improved HTM can give optimum pattern recognition accuracy, especially on multi-label datasets.
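    The abstract does not define TFCM or cFCM, so the following is only an illustrative sketch and not the authors' method. It assumes that "FCM" refers to standard fuzzy c-means and shows how soft memberships computed with the plain Chebyshev distance let one overlapping point belong partly to several clusters. The function names, the fuzzifier `m`, and the sample points are all hypothetical.

```python
import numpy as np


def chebyshev_distances(points, centers):
    """Chebyshev (L-infinity) distance from every point to every center.

    Returns an array of shape (n_points, n_centers).
    """
    diff = np.abs(points[:, None, :] - centers[None, :, :])
    return diff.max(axis=2)


def fuzzy_memberships(points, centers, m=2.0, eps=1e-12):
    """Standard fuzzy c-means membership update, using Chebyshev distance.

    Assumption: this is textbook FCM, not the TFCM/cFCM variants named
    in the abstract. `m` > 1 is the usual fuzzifier; `eps` avoids
    division by zero when a point coincides with a center.
    """
    d = chebyshev_distances(points, centers) + eps
    inv = d ** (-2.0 / (m - 1.0))
    # Each row is normalized so memberships of one point sum to 1.
    return inv / inv.sum(axis=1, keepdims=True)


if __name__ == "__main__":
    centers = np.array([[0.0, 0.0], [1.0, 1.0]])
    points = np.array([[0.0, 0.0], [1.0, 1.0], [0.5, 0.5]])
    print(fuzzy_memberships(points, centers))
```

    In this toy setting, a point lying midway between two centers receives equal membership in both clusters, which is the sense in which a soft clustering can represent overlapping, multi-label data.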

    Informative Data Fusion: Beyond Canonical Correlation Analysis

    Multi-modal data fusion is a challenging but common problem arising in fields such as economics, statistical signal processing, medical imaging, and machine learning. In such applications, we have access to multiple datasets that use different data modalities to describe some system feature. Canonical correlation analysis (CCA) is a multidimensional joint dimensionality reduction algorithm for exactly two datasets. CCA finds a linear transformation for each feature vector set such that the correlation between the two transformed feature sets is maximized. These linear transformations are easily found by computing the SVD of a matrix that involves only the covariance and cross-covariance matrices of the feature vector sets. When these covariance matrices are unknown, an empirical version of CCA substitutes sample covariance estimates formed from training data. However, when the number of training samples is less than the combined dimension of the datasets, CCA fails to reliably detect correlation between the datasets. This thesis explores the problem of detecting correlations from data modeled by the ubiquitous signal-plus-noise data model. We present a modification to CCA, which we call informative CCA (ICCA), that first projects each dataset onto a low-dimensional informative signal subspace. We verify the superior performance of ICCA on real-world datasets and argue the optimality of trim-then-fuse over fuse-then-trim correlation analysis strategies. We provide a significance test for the correlations returned by ICCA and derive improved estimates of the population canonical vectors using insights from random matrix theory. We then extend the analysis of CCA to regularized CCA (RCCA) and demonstrate that setting the regularization parameter to infinity results in the best performance and has the same solution as taking the SVD of the cross-covariance matrix of the two datasets.
    Finally, we apply the ideas learned from ICCA to multiset CCA (MCCA), which analyzes correlations for more than two datasets. There are multiple formulations of MCCA, each using a different combination of objective function and constraint function to describe a notion of multiset correlation. We consider MAXVAR, provide an informative version of the algorithm, which we call informative MCCA (IMCCA), and demonstrate its superiority on a real-world dataset.
    PhD, Electrical Engineering: Systems, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/113419/1/asendorf_1.pd
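    The SVD-based computation of empirical CCA mentioned in the abstract can be sketched as follows. This is a minimal textbook version written for illustration, not code from the thesis, and it does not implement ICCA, RCCA, or IMCCA. The function names and the small eigenvalue floor `ridge` are assumptions introduced here.

```python
import numpy as np


def inv_sqrt(sym, ridge=1e-10):
    """Inverse square root of a symmetric positive semi-definite matrix.

    `ridge` floors tiny eigenvalues for numerical stability; it is an
    assumption of this sketch, not a regularized CCA.
    """
    w, v = np.linalg.eigh(sym)
    w = np.maximum(w, ridge)
    return (v / np.sqrt(w)) @ v.T


def empirical_cca(x, y):
    """Empirical CCA for data matrices x (n, p) and y (n, q).

    Returns the canonical correlations and the canonical vectors for
    x and y (as matrix columns).
    """
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    n = x.shape[0]
    sxx = x.T @ x / (n - 1)  # sample covariance of x
    syy = y.T @ y / (n - 1)  # sample covariance of y
    sxy = x.T @ y / (n - 1)  # sample cross-covariance
    wx, wy = inv_sqrt(sxx), inv_sqrt(syy)
    # Singular values of the whitened cross-covariance are the
    # canonical correlations.
    u, s, vt = np.linalg.svd(wx @ sxy @ wy, full_matrices=False)
    return s, wx @ u, wy @ vt.T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = rng.standard_normal(2000)  # shared latent signal
    x = rng.standard_normal((2000, 3))
    y = rng.standard_normal((2000, 2))
    x[:, 0] = z + 0.1 * rng.standard_normal(2000)
    y[:, 0] = z + 0.1 * rng.standard_normal(2000)
    corrs, a, b = empirical_cca(x, y)
    print(corrs)
```

    The sample covariance matrices become singular once the number of samples drops below the dimension of a dataset, which is one way to see why the abstract describes empirical CCA as unreliable in the sample-starved regime.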