
    A phenomenological approach to multisource data integration: Analysing infrared and visible data

    A new method is described for combining multisensory data for remote sensing applications. The approach uses phenomenological models that allow the specification of discriminatory features based on intrinsic physical properties of imaged surfaces. Thermal and visual images of scenes are analyzed to estimate surface heat fluxes. This analysis yields a discriminatory feature that is closely related to the thermal capacitance of the imaged objects, and the feature provides a way to label image regions according to the physical properties of those objects. The approach differs from existing approaches, which use the signal intensities in each channel (or an arbitrary linear or nonlinear combination of signal intensities) as features that are then classified by a statistical or evidential approach.

    Mid-price prediction based on machine learning methods with technical and quantitative indicators

    Stock price prediction is a challenging task, but machine learning methods have recently been used successfully for this purpose. In this paper, we extract over 270 hand-crafted features (factors) inspired by technical and quantitative analysis and test their validity on short-term mid-price movement prediction. We focus on a wrapper feature selection method using entropy, least-mean squares, and linear discriminant analysis. We also build a new quantitative feature based on adaptive logistic regression for online learning, which is consistently selected first by the majority of the proposed feature selection methods. This study examines the best combination of features using high-frequency limit order book data from Nasdaq Nordic. Our results suggest that sorting methods and classifiers can be combined so that the best performance is reached with only a very small number of advanced hand-crafted features.
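To make the prediction target concrete, the sketch below computes a mid-price from the best bid and best ask and turns a mid-price series into up/flat/down movement labels. The mid-price definition is standard; the function names, the `horizon` and `threshold` parameters, and the three-class labelling rule are assumptions for illustration and are not taken from the paper.

```python
from typing import List


def mid_price(best_bid: float, best_ask: float) -> float:
    """Mid-price: the average of the best bid and the best ask."""
    return (best_bid + best_ask) / 2.0


def label_movements(mids: List[float], horizon: int = 1,
                    threshold: float = 0.0) -> List[int]:
    """Label each time step by the relative mid-price change `horizon` steps ahead.

    Returns 1 (up), -1 (down) or 0 (flat). The labelling rule is an
    illustrative assumption, not the scheme used in the paper.
    """
    labels = []
    for t in range(len(mids) - horizon):
        change = (mids[t + horizon] - mids[t]) / mids[t]
        if change > threshold:
            labels.append(1)
        elif change < -threshold:
            labels.append(-1)
        else:
            labels.append(0)
    return labels


if __name__ == "__main__":
    # Hypothetical (best bid, best ask) quotes.
    quotes = [(99.0, 101.0), (100.0, 102.0), (100.0, 102.0), (99.0, 101.0)]
    mids = [mid_price(bid, ask) for bid, ask in quotes]
    print(mids)
    print(label_movements(mids))
```

A nonzero `threshold` would treat small relative changes as flat, which is one common way to keep tick-level noise out of the labels.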

    Factorized linear discriminant analysis for phenotype-guided representation learning of neuronal gene expression data

    A central goal in neurobiology is to relate the expression of genes to the structural and functional properties of neuronal types, collectively called their phenotypes. Single-cell RNA sequencing can measure the expression of thousands of genes in thousands of neurons. How should these data be interpreted in the context of neuronal phenotypes? We propose a supervised learning approach that factorizes the gene expression data into components corresponding to individual phenotypic characteristics and their interactions. This new method, which we call factorized linear discriminant analysis (FLDA), seeks a linear transformation of gene expressions that varies highly with only one phenotypic factor and minimally with the others. We further extend our approach with a sparsity-based regularization algorithm, which selects a few genes important to a specific phenotypic feature or feature combination. We applied this approach to a single-cell RNA-Seq dataset of Drosophila T4/T5 neurons, focusing on their dendritic and axonal phenotypes. The analysis confirms results obtained by conventional methods but also points to new genes related to the phenotypes and to an intriguing hierarchy in the genetic organization of these cells.

    Fuzzy discriminant analysis based feature projection in myoelectric control.

    The myoelectric signal (MES) from human muscles is usually utilized as an input to the controller of a multifunction prosthetic hand. In such a system, a pattern recognition approach is usually employed to discriminate between the MES from different classes. Since the MES is recorded using multiple channels, the feature vector can become very large. To reduce the computational cost and enhance the generalization capability of the classifier, a dimensionality reduction method is needed to identify an informative, moderate-size feature set. This paper proposes a novel feature projection technique based on a combination of Fisher's Linear Discriminant Analysis (LDA) and fuzzy logic. The new method, called FLDA, assigns different membership degrees to the data points, thus reducing the effect of overlapping points in the discrimination process. Furthermore, the concept of Mutual Information (MI) is introduced into the fuzzy memberships in order to assign weights to the features (attributes) according to their contribution to the discrimination process. The FLDA method is tested on a seven-class MES dataset and compared with other feature projection techniques, demonstrating its superiority.
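For reference, the classical Fisher criterion that this kind of projection builds on can be written as below. This is the standard textbook form, not the paper's fuzzy variant: the abstract does not state how the membership degrees or the MI weights enter the scatter matrices, so they are not shown. The symbols ($W$ for the projection matrix, $S_B$ and $S_W$ for the between-class and within-class scatter, $\mu_c$ and $N_c$ for the mean and size of class $c$, $\mu$ for the overall mean) are notation chosen here.

```latex
J(W) = \frac{\left| W^{\top} S_B W \right|}{\left| W^{\top} S_W W \right|},
\qquad
S_B = \sum_{c=1}^{C} N_c \,(\mu_c - \mu)(\mu_c - \mu)^{\top},
\qquad
S_W = \sum_{c=1}^{C} \sum_{x_i \in c} (x_i - \mu_c)(x_i - \mu_c)^{\top}
```

Maximizing $J(W)$ favours projections in which class means are far apart relative to the spread within each class. With $C$ classes, $S_B$ has rank at most $C - 1$, so a seven-class problem yields at most six discriminant directions.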

    Visualization and analysis of diffusion tensor fields

    Technical report. The power of medical imaging modalities to measure and characterize biological tissue is amplified by visualization and analysis methods that help researchers to see and understand the structures within their data. Diffusion tensor magnetic resonance imaging can measure microstructural properties of biological tissue, such as the coherent linear organization of white matter of the central nervous system, or the fibrous texture of muscle tissue. This dissertation describes new methods for visualizing and analyzing the salient structure of diffusion tensor datasets. Glyphs from superquadric surfaces and textures from reaction-diffusion systems facilitate inspection of data properties and trends. Fiber tractography based on vector-tensor multiplication allows major white matter pathways to be visualized. The generalization of direct volume rendering to tensor data allows large-scale structures to be shaded and rendered. Finally, a mathematical framework for analyzing the derivatives of tensor values, in terms of shape and orientation change, enables analytical shading in volume renderings, and a method of feature detection important for feature-preserving filtering of tensor fields. Together, the combination of methods enhances the ability of diffusion tensor imaging to provide insight into the local and global structure of biological tissue.

    Hybrid technique using singular value decomposition (SVD) and support vector machine (SVM) approach for earthquake prediction

    Most existing earthquake (EQ) prediction techniques combine signal processing and geophysics techniques, which makes analysis of the Earth’s electric field data relatively complex to compute. This paper proposes a simpler and faster method that involves only signal processing procedures. EQ occurrence is predicted using a combination of a singular value decomposition (SVD)-based technique for feature extraction and a support vector machine (SVM) classifier. In the proposed method, the Earth’s electric field signal is transformed into a new domain using an SVD-based approach: the time domain signal is projected onto the left eigenvectors of the impulse response matrix of the linear prediction coefficient (LPC) filter. Several features are extracted from the transformed signal and used as input to the SVM classifier in order to predict the location of the forthcoming EQ. Once the location is determined, a similar approach is used to estimate its magnitude. Finally, the time of the forthcoming EQ is estimated from statistical observations. The EQs that occurred in Greece during 2008 are used to train the classifiers, whereas those that occurred from 2003 to 2010 in the same region are used to evaluate the performance of the proposed system. In predicting the location of future EQs, the proposed system achieves 77% accuracy. For magnitude prediction, it achieves an accuracy of 66.67%. Moreover, the predicted time for the EQ with magnitude greater than is 2 days ahead, whereas for magnitude greater than is up to 7 days ahead.

    Identifying cell types with single cell sequencing data

    Single-cell RNA sequencing (scRNA-seq) techniques, which examine the genetic information of individual cells, provide unparalleled resolution for probing cellular heterogeneity. In contrast, traditional (bulk) RNA sequencing technologies measure the average RNA expression level of a large number of input cells, which is insufficient for studying heterogeneous systems. Hence, scRNA-seq technologies make it possible to tackle many previously inaccessible problems, such as rare cell type identification, cancer evolution and cell lineage relationship inference. Cell population identification is fundamental to the analysis of scRNA-seq data. Generally, the workflow of scRNA-seq analysis includes data processing, dropout imputation, feature selection, dimensionality reduction, similarity matrix construction and unsupervised clustering. Many single-cell clustering algorithms rely on similarity matrices of cells, but many existing studies have not achieved the expected results. Analyzing scRNA-seq data sets poses some unique challenges, including a significant level of biological and technical noise, so similarity matrix construction still deserves further study. In my study, I present a new method, named Learning Sparse Similarity Matrices (LSSM), to construct cell-cell similarity matrices, and then use several clustering methods to identify cell populations from scRNA-seq data. Firstly, based on sparse subspace theory, the relationship between a cell and the other cells of the same cell type is expressed as a linear combination. Secondly, I construct a convex optimization objective function to find the similarity matrix, which consists of the corresponding coefficients of the linear combinations mentioned above. Thirdly, I design an algorithm that uses column-wise learning and a greedy algorithm to solve the objective function. As a result, the large optimization problem on the similarity matrix can be decomposed into a series of smaller optimization problems on individual columns of the similarity matrix, and the sparsity of the whole matrix is ensured by the sparsity of each column. Fourthly, in order to pick an optimal clustering method for identifying cell populations, I apply several clustering methods separately to the similarity matrices calculated by LSSM from eight scRNA-seq data sets. The clustering results show that my method performs best when combined with spectral clustering (Laplacian eigenmaps + k-means clustering). In addition, compared with five state-of-the-art methods, my method outperforms most competing methods on the eight data sets. Finally, I combine LSSM with t-Distributed Stochastic Neighbor Embedding (t-SNE) to visualize the data points of scRNA-seq data in two-dimensional space. The results show that most data points of the same cell type lie close together, while those from different cell clusters are separated.
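As an illustration of how a similarity matrix can be learned one column at a time, the block below gives a generic sparse self-representation problem of the kind used in the sparse subspace clustering literature. It is a sketch under stated assumptions and may differ from the actual LSSM objective, which the abstract does not spell out. The notation is chosen here: $X$ is the expression matrix with one column $x_j$ per cell, $c_j$ is the $j$-th column of the coefficient matrix $C$, $c_{jj}$ is its $j$-th entry, and $\lambda$ is a sparsity weight.

```latex
\min_{c_j} \; \tfrac{1}{2} \left\| x_j - X c_j \right\|_2^2 + \lambda \left\| c_j \right\|_1
\quad \text{subject to} \quad c_{jj} = 0,
\qquad j = 1, \dots, n,
\qquad
W = |C| + |C|^{\top}
```

Each column problem is independent of the others, which is what allows the large matrix problem to be split into $n$ smaller ones, and the $\ell_1$ penalty keeps each column sparse. The symmetrized matrix $W$ is one common way to obtain an affinity matrix suitable for spectral clustering.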