235 research outputs found

    A Generally Semisupervised Dimensionality Reduction Method with Local and Global Regression Regularizations for Recognition

    Get PDF
    The insufficiency of labeled data is an important problem in image classification such as face recognition. However, unlabeled data are abundant in the real-world application. Therefore, semisupervised learning methods, which corporate a few labeled data and a large number of unlabeled data into learning, have received more and more attention in the field of face recognition. During the past years, graph-based semisupervised learning has been becoming a popular topic in the area of semisupervised learning. In this chapter, we newly present graph-based semisupervised learning method for face recognition. The presented method is based on local and global regression regularization. The local regression regularization has adopted a set of local classification functions to preserve both local discriminative and geometrical information, as well as to reduce the bias of outliers and handle imbalanced data; while the global regression regularization is to preserve the global discriminative information and to calculate the projection matrix for out-of-sample extrapolation. Extensive simulations based on synthetic and real-world datasets verify the effectiveness of the proposed method

    Kernel Feature Extraction Methods for Remote Sensing Data Analysis

    Get PDF
    Technological advances in the last decades have improved our capabilities of collecting and storing high data volumes. However, this makes that in some fields, such as remote sensing several problems are generated in the data processing due to the peculiar characteristics of their data. High data volume, high dimensionality, heterogeneity and their nonlinearity, make that the analysis and extraction of relevant information from these images could be a bottleneck for many real applications. The research applying image processing and machine learning techniques along with feature extraction, allows the reduction of the data dimensionality while keeps the maximum information. Therefore, developments and applications of feature extraction methodologies using these techniques have increased exponentially in remote sensing. This improves the data visualization and the knowledge discovery. Several feature extraction methods have been addressed in the literature depending on the data availability, which can be classified in supervised, semisupervised and unsupervised. In particular, feature extraction can use in combination with kernel methods (nonlinear). The process for obtaining a space that keeps greater information content is facilitated by this combination. One of the most important properties of the combination is that can be directly used for general tasks including classification, regression, clustering, ranking, compression, or data visualization. In this Thesis, we address the problems of different nonlinear feature extraction approaches based on kernel methods for remote sensing data analysis. Several improvements to the current feature extraction methods are proposed to transform the data in order to make high dimensional data tasks easier, such as classification or biophysical parameter estimation. This Thesis focus on three main objectives to reach these improvements in the current feature extraction methods: The first objective is to include invariances into supervised kernel feature extraction methods. Throughout these invariances it is possible to generate virtual samples that help to mitigate the problem of the reduced number of samples in supervised methods. The proposed algorithm is a simple method that essentially generates new (synthetic) training samples from available labeled samples. These samples along with original samples should be used in feature extraction methods obtaining more independent features between them that without virtual samples. The introduction of prior knowledge by means of the virtual samples could obtain classification and biophysical parameter estimation methods more robust than without them. The second objective is to use the generative kernels, i.e. probabilistic kernels, that directly learn by means of clustering techniques from original data by finding local-to-global similarities along the manifold. The proposed kernel is useful for general feature extraction purposes. Furthermore, the kernel attempts to improve the current methods because the kernel not only contains labeled data information but also uses the unlabeled information of the manifold. Moreover, the proposed kernel is parameter free in contrast with the parameterized functions such as, the radial basis function (RBF). Using probabilistic kernels is sought to obtain new unsupervised and semisupervised methods in order to reduce the number and cost of labeled data in remote sensing. Third objective is to develop new kernel feature extraction methods for improving the features obtained by the current methods. Optimizing the functional could obtain improvements in new algorithm. For instance, the Optimized Kernel Entropy Component Analysis (OKECA) method. The method is based on the Independent Component Analysis (ICA) framework resulting more efficient than the standard Kernel Entropy Component Analysis (KECA) method in terms of dimensionality reduction. In this Thesis, the methods are focused on remote sensing data analysis. Nevertheless, feature extraction methods are used to analyze data of several research fields whereas data are multidimensional. For these reasons, the results are illustrated into experimental sequence. First, the projections are analyzed by means of Toy examples. The algorithms are tested through standard databases with supervised information to proceed to the last step, the analysis of remote sensing images by the proposed methods

    Detecting Calls to Action in Text Using Deep Learning

    Get PDF

    Unsupervised Intrusion Detection with Cross-Domain Artificial Intelligence Methods

    Get PDF
    Cybercrime is a major concern for corporations, business owners, governments and citizens, and it continues to grow in spite of increasing investments in security and fraud prevention. The main challenges in this research field are: being able to detect unknown attacks, and reducing the false positive ratio. The aim of this research work was to target both problems by leveraging four artificial intelligence techniques. The first technique is a novel unsupervised learning method based on skip-gram modeling. It was designed, developed and tested against a public dataset with popular intrusion patterns. A high accuracy and a low false positive rate were achieved without prior knowledge of attack patterns. The second technique is a novel unsupervised learning method based on topic modeling. It was applied to three related domains (network attacks, payments fraud, IoT malware traffic). A high accuracy was achieved in the three scenarios, even though the malicious activity significantly differs from one domain to the other. The third technique is a novel unsupervised learning method based on deep autoencoders, with feature selection performed by a supervised method, random forest. Obtained results showed that this technique can outperform other similar techniques. The fourth technique is based on an MLP neural network, and is applied to alert reduction in fraud prevention. This method automates manual reviews previously done by human experts, without significantly impacting accuracy

    Hyperspectral Image Classification -- Traditional to Deep Models: A Survey for Future Prospects

    Get PDF
    Hyperspectral Imaging (HSI) has been extensively utilized in many real-life applications because it benefits from the detailed spectral information contained in each pixel. Notably, the complex characteristics i.e., the nonlinear relation among the captured spectral information and the corresponding object of HSI data make accurate classification challenging for traditional methods. In the last few years, Deep Learning (DL) has been substantiated as a powerful feature extractor that effectively addresses the nonlinear problems that appeared in a number of computer vision tasks. This prompts the deployment of DL for HSI classification (HSIC) which revealed good performance. This survey enlists a systematic overview of DL for HSIC and compared state-of-the-art strategies of the said topic. Primarily, we will encapsulate the main challenges of traditional machine learning for HSIC and then we will acquaint the superiority of DL to address these problems. This survey breakdown the state-of-the-art DL frameworks into spectral-features, spatial-features, and together spatial-spectral features to systematically analyze the achievements (future research directions as well) of these frameworks for HSIC. Moreover, we will consider the fact that DL requires a large number of labeled training examples whereas acquiring such a number for HSIC is challenging in terms of time and cost. Therefore, this survey discusses some strategies to improve the generalization performance of DL strategies which can provide some future guidelines
    corecore