
    Fuzzy c-mean algorithm based on complete Mahalanobis distances and separable criterion

    Finding a good classifier of the hosts of influenza A viruses is an important issue for preventing pandemic flu. The hemagglutinin protein encoded in the virus genome is the major molecule determining the host range. In this paper, a novel classification algorithm for hemagglutinin proteins is proposed, integrating SVM and logistic regression based on four kinds of Hurst exponents computed for each protein sequence. The method is the first to integrate physicochemical properties, a fractal property, an SVM and a logistic regression classifier. To evaluate the performance of the new algorithm, an experiment on real data is conducted using 5-fold cross-validation accuracy. Experimental results show that the new classification algorithm is useful and performs better than SVM and logistic regression, respectively. ©2008 IEEE
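The abstract does not specify how the SVM and logistic regression are combined, so the following is only a hedged sketch of one plausible reading: an SVM whose outputs feed a logistic-regression meta-learner, evaluated with 5-fold cross-validation. The feature matrix X (four Hurst exponents per hemagglutinin sequence) and the host labels y are placeholders, not the paper's data.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import StackingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))        # placeholder: 4 Hurst exponents per sequence
y = rng.integers(0, 2, size=200)     # placeholder: binary host-class labels

# SVM outputs feed a logistic-regression meta-learner (stacking).
hybrid = StackingClassifier(
    estimators=[("svm", SVC(kernel="rbf", probability=True))],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
scores = cross_val_score(hybrid, X, y, cv=5)   # 5-fold cross-validation accuracy
print("mean 5-fold CV accuracy:", scores.mean())
```

Stacking is only one reasonable way to integrate the two classifiers; a soft-voting ensemble of the SVM and logistic regression would be an equally plausible reading of the abstract.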

    Algorithms for enhancing pattern separability, feature selection and incremental learning with applications to gas sensing electronic nose systems

    Three major issues in pattern recognition and data analysis have been addressed in this study and applied to the problem of identification of volatile organic compounds (VOCs) for gas sensing applications. Various approaches have been proposed and discussed; they are applicable not only to VOC identification but also to a variety of pattern recognition and data analysis problems. In particular, (1) enhancing pattern separability for challenging classification problems, (2) the optimum feature selection problem, and (3) incremental learning for neural networks have been investigated.

    Three different approaches are proposed for enhancing pattern separability in the classification of closely spaced, or possibly overlapping, clusters. In the neurofuzzy approach, a fuzzy inference system that considers the dynamic ranges of individual features is developed. Feature range stretching (FRS) is introduced as an alternative approach for increasing intercluster distances by mapping the tight dynamic range of each feature to a wider range through a nonlinear function. Finally, a third approach, nonlinear cluster transformation (NCT), is proposed, which increases intercluster distances while preserving intracluster distances. It is shown that NCT achieves comparable, or better, performance than the other two methods at a fraction of the computational burden. The implementation issues and relative advantages and disadvantages of these approaches are systematically investigated.

    Selection of optimum features is addressed using both a decision-tree-based approach and a wrapper approach. The hill-climb-search-based wrapper approach is applied to select the optimum features for gas sensing problems.

    Finally, a new method, Learn++, is proposed that gives classification algorithms the capability of learning incrementally from new data. Learn++ is introduced for incremental learning of new data when the original database is no longer available. The Learn++ algorithm is based on strategically combining an ensemble of classifiers, each of which is trained to learn only a small portion of the pattern space. Furthermore, Learn++ is capable of learning new data even when new classes are introduced, and it also features a built-in mechanism for estimating the reliability of its classification decisions.

    All proposed methods are explained in detail, and simulation results are discussed along with directions for future work.
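Learn++ combines an ensemble of classifiers, each trained on a portion of the pattern space. The sketch below illustrates only that core incremental-ensemble idea under simplifying assumptions: one new base classifier per data batch and an unweighted majority vote, whereas the published algorithm also reweights training instances and weights the votes.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

class IncrementalEnsemble:
    """Grow one new member per data batch; earlier members are never retrained."""

    def __init__(self):
        self.members = []

    def partial_fit(self, X_batch, y_batch):
        # Each incoming batch trains a fresh weak classifier, so knowledge
        # learned from earlier batches is preserved in earlier members.
        clf = DecisionTreeClassifier(max_depth=5)
        clf.fit(X_batch, y_batch)
        self.members.append(clf)

    def predict(self, X):
        # Unweighted majority vote across all ensemble members
        # (labels are assumed to be non-negative integers).
        votes = np.stack([m.predict(X) for m in self.members])
        return np.apply_along_axis(
            lambda col: np.bincount(col.astype(int)).argmax(), 0, votes)

# Placeholder usage with two successive batches of synthetic data.
rng = np.random.default_rng(0)
ens = IncrementalEnsemble()
for _ in range(2):
    Xb, yb = rng.normal(size=(100, 5)), rng.integers(0, 3, size=100)
    ens.partial_fit(Xb, yb)
print(ens.predict(rng.normal(size=(5, 5))))
```

Because each batch only adds members, the ensemble can keep learning after the original database is discarded, which is the property the dissertation emphasises.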

    Methods for Pattern Classification


    An overview of clustering methods with guidelines for application in mental health research

    Cluster analyses have been widely used in mental health research to decompose inter-individual heterogeneity by identifying more homogeneous subgroups of individuals. However, despite advances in new algorithms and increasing popularity, there is little guidance on model choice, analytical framework and reporting requirements. In this paper, we aimed to address this gap by introducing the philosophy, design, advantages/disadvantages and implementation of major algorithms that are particularly relevant in mental health research. Extensions of basic models, such as kernel methods, deep learning, semi-supervised clustering, and clustering ensembles, are subsequently introduced. How to choose algorithms to address common issues, as well as methods for pre-clustering data processing, clustering evaluation and validation, are then discussed. Importantly, we also provide general guidance on clustering workflow and reporting requirements. To facilitate the implementation of different algorithms, we provide information on R functions and libraries.
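As a concrete illustration of the workflow the paper describes (pre-clustering data processing, model fitting, and internal evaluation), the sketch below standardises the features, fits k-means for several cluster numbers, and reports the silhouette score. The feature matrix is a placeholder for study variables, and the sketch is in Python rather than the R functions the paper documents.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(42)
data = rng.normal(size=(300, 6))            # placeholder for study variables
X = StandardScaler().fit_transform(data)    # pre-clustering data processing

# Fit k-means for a range of cluster numbers and report an internal index.
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    print(k, round(silhouette_score(X, labels), 3))   # clustering evaluation
```

In practice the choice of k would also draw on external validation and the substantive interpretability of the subgroups, as the guidelines stress.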

    The Random Forest Algorithm with Application to Multispectral Image Analysis

    The need for computers to make educated decisions is growing. Various methods have been developed for decision making using observation vectors. Among these are supervised and unsupervised classifiers. Recently, there has been increased attention to ensemble learning--methods that generate many classifiers and aggregate their results. Breiman (2001) proposed Random Forests for classification and clustering. The Random Forest algorithm is ensemble learning using the decision tree principle. Input vectors are used to grow decision trees and build a forest. A classification decision is reached by sending an unknown input vector down each tree in the forest and taking the majority vote among all trees. The main focus of this research is to evaluate the effectiveness of Random Forest in classifying pixels in multispectral image data acquired using satellites. In this paper, the effectiveness and accuracy of Random Forest, neural networks, support vector machines, and nearest neighbor classifiers are assessed by classifying multispectral images and comparing each classifier's results. As unsupervised classifiers are also widely used, this research compares the accuracy of an unsupervised Random Forest classifier with the Mahalanobis distance classifier, maximum likelihood classifier, and minimum distance classifier with respect to multispectral satellite data.
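The following is a minimal, hedged sketch of the pixel-wise Random Forest classification the study evaluates: each pixel's band values form one observation vector, the forest is trained on labelled pixels, and the majority vote across trees assigns a class to every pixel. The band count, class labels and image array are invented placeholders rather than the study's satellite data.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

bands, h, w = 7, 64, 64                           # e.g. 7 spectral bands
image = np.random.rand(h, w, bands)               # placeholder multispectral cube
train_pixels = np.random.rand(500, bands)         # labelled training pixels
train_labels = np.random.randint(0, 4, size=500)  # e.g. 4 land-cover classes

# Each pixel's band values form one input vector; the forest's majority
# vote over its trees assigns the class.
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(train_pixels, train_labels)

# Classify every pixel by flattening the cube to (n_pixels, bands).
pred_map = forest.predict(image.reshape(-1, bands)).reshape(h, w)
print(pred_map.shape)
```

The comparison classifiers in the study (neural networks, SVMs, nearest neighbor, and the unsupervised statistical classifiers) would consume the same per-pixel observation vectors.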

    K-means based clustering and context quantization


    Intelligent video surveillance

    The focus of this thesis is new and modified algorithms for object detection, recognition and tracking within the context of video analytics. Manual video surveillance has been shown to be ineffective and, at the same time, expensive, because it requires the manual labour of operators, who are also prone to erroneous decisions. As the number of surveillance cameras grows, there is a strong need to automate video analytics. The benefits of this approach can be found in both military and civilian applications. For military applications, it can help with localisation and tracking of objects of interest. For civilian applications, similar object localisation procedures can make criminal investigations more effective by extracting meaningful data from massive volumes of video footage. Recently, the wide availability of consumer unmanned aerial vehicles has created a new threat, as even the simplest and cheapest airborne vehicles can carry cargo, which means they can be upgraded into a serious weapon; additionally, they can be used for spying, which poses a threat to privacy. Autonomous car driving systems are now impossible without machine vision methods. Industrial applications require automatic quality control, including non-destructive methods and, in particular, methods based on video analysis. All these applications give strong evidence of a practical need for machine vision algorithms for object detection, tracking and classification, and motivated this thesis.

    The contributions to knowledge of the thesis consist of two main parts, video tracking and object detection and recognition, unified by the common idea of their applicability to video analytics problems. The novel algorithms for object detection and tracking described in this thesis are unsupervised and have only a small number of parameters. The approach is based on rigid motion segmentation by Bayesian filtering. The Bayesian filter, which was proposed specifically for this method and contributes to its novelty, is formulated as a generic approach and then applied to video analytics problems. The method is augmented with optional object coordinate estimation using a flat two-dimensional terrain assumption, which gives a basis for using the algorithm inside larger sensor data fusion models.

    The proposed approach for object detection and classification is based on the evolving systems concept and the new Typicality-Eccentricity Data Analytics (TEDA) framework. The methods are capable of solving classical problems of data mining: clustering, classification, and regression. They are formulated in a domain-independent way and are capable of addressing shift and drift of data streams. Examples are given for clustering and classification of imagery data. For all the developed algorithms, the experiments have shown sustainable results on the testing data. The practical applications of the proposed algorithms are carefully examined and tested.
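To make the TEDA side of the thesis more concrete, the sketch below computes a typicality/eccentricity estimate for each sample of a data stream from the running mean and total variance, so that unusually eccentric samples can be flagged. The eccentricity formula and threshold shown are one common formulation from the TEDA literature, not necessarily the exact recursive update used in the thesis, and the stream itself is synthetic placeholder data.

```python
import numpy as np

def teda_eccentricity(stream):
    """Yield (k, eccentricity, typicality) for each sample of a 2-D stream array."""
    seen = []
    for k, x in enumerate(stream, start=1):
        seen.append(x)
        if k < 2:
            continue                             # need at least two samples
        data = np.asarray(seen)
        mu = data.mean(axis=0)                   # running mean of the stream
        var = data.var(axis=0).sum()             # total variance over all features
        if var == 0:
            continue
        ecc = 1.0 / k + float((x - mu) @ (x - mu)) / (k * var)   # eccentricity
        yield k, ecc, 1.0 - ecc                  # typicality = 1 - eccentricity

# Flag samples whose normalised eccentricity exceeds an m-sigma style threshold.
m = 3
for k, ecc, typ in teda_eccentricity(np.random.randn(200, 3)):
    if ecc / 2 > (m * m + 1) / (2 * k):          # illustrative anomaly condition
        print(f"sample {k}: eccentricity {ecc:.3f}, typicality {typ:.3f}")
```

In an evolving-systems setting, such eccentric samples can seed new clusters or classes as the stream shifts or drifts, which is the behaviour the thesis attributes to its TEDA-based detectors and classifiers.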