168 research outputs found

    Multiple Classifier System for Remote Sensing Image Classification: A Review

    Over the last two decades, multiple classifier systems (MCS), or classifier ensembles, have shown great potential to improve the accuracy and reliability of remote sensing image classification. Although a large body of literature covers MCS approaches, there is no comprehensive review that presents an overall architecture of the basic principles and trends behind the design of remote sensing classifier ensembles. Therefore, to give a reference point for MCS approaches, this paper explicitly reviews the remote sensing implementations of MCS and proposes some modified approaches. The effectiveness of existing and improved algorithms is analyzed and evaluated on multi-source remotely sensed images, including a high spatial resolution image (QuickBird), a hyperspectral image (OMISII) and a multi-spectral image (Landsat ETM+). Experimental results demonstrate that MCS can effectively improve the accuracy and stability of remote sensing image classification, and that diversity measures play an active role in the combination of multiple classifiers. Furthermore, this survey provides a roadmap to guide future research and algorithm enhancement and to facilitate knowledge accumulation of MCS in the remote sensing community.
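
    The abstract does not say which combination rule or diversity measures the review evaluates. As a minimal sketch under that caveat, the following shows plain majority voting together with the pairwise disagreement measure, one commonly used diversity measure. The function names and the toy land-cover labels are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def majority_vote(predictions):
    """Fuse per-classifier label lists (all the same length) by plurality vote.

    Ties are resolved in favour of the label that appears first among the votes.
    """
    fused = []
    for votes in zip(*predictions):
        fused.append(Counter(votes).most_common(1)[0][0])
    return fused


def disagreement(pred_a, pred_b):
    """Fraction of samples on which two classifiers output different labels."""
    return sum(a != b for a, b in zip(pred_a, pred_b)) / len(pred_a)


def mean_pairwise_disagreement(predictions):
    """Average the disagreement measure over all classifier pairs."""
    pairs = list(combinations(predictions, 2))
    return sum(disagreement(a, b) for a, b in pairs) / len(pairs)


# Hypothetical per-pixel labels from three base classifiers.
preds = [
    ["water", "forest", "urban", "urban"],
    ["water", "forest", "forest", "urban"],
    ["water", "urban", "forest", "urban"],
]
fused_labels = majority_vote(preds)
ensemble_diversity = mean_pairwise_disagreement(preds)
```

    A disagreement of 0 means the base classifiers always agree, in which case voting cannot correct any of their errors.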

    Innovative Two-Stage Fuzzy Classification for Unknown Intrusion Detection

    Intrusion detection is an essential part of network security in combating illegal network access and malicious cyberattacks. Due to the constantly evolving nature of cyberattacks, it has been a technical challenge for an intrusion detection system (IDS) to effectively recognize unknown attacks or known attacks with inadequate training data. Therefore, in this dissertation, an innovative two-stage classifier is developed for accurately and efficiently detecting both unknown attacks and known attacks with insufficient or inaccurate training information. The novel two-stage fuzzy classification scheme is based on advanced machine learning techniques designed specifically to handle the ambiguity of traffic connections and network data. In the first stage of the classification, a fuzzy C-means (FCM) algorithm is employed to softly compute and optimize the clustering centers of the training datasets, with some degree of fuzziness accounting for feature inaccuracy and ambiguity in the training data. Subsequently, a distance-weighted k-NN (k-nearest neighbors) classifier, combined with the Dempster-Shafer Theory (DST), is introduced to assess the belief functions and pignistic probabilities of the incoming data associated with each of the known classes, to further address the data uncertainty issue in the cyberattack data. In the second stage of the proposed classification algorithm, a subsequent classification scheme is implemented based on the obtained pignistic probabilities and their entropy functions to determine whether the input data are normal, one of the known attacks or an unknown attack. Secondly, to strengthen the robustness to attacks, we form a three-layer hierarchical ensemble classifier based on the FCM weighted k-NN DST classifier to obtain more precise inferences than those made by a single classifier. The proposed intrusion detection algorithm is evaluated on the KDD’99 datasets and their variants containing known and unknown attacks. 
The experimental results show that the new two-stage fuzzy KNN-DST classifier outperforms other well-known classifiers in intrusion detection and is especially effective in detecting unknown attacks.
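
    The second-stage decision described above can be illustrated with a small sketch. It applies the standard pignistic transform to a Dempster-Shafer mass function and then uses the Shannon entropy of the result to flag ambiguous inputs. The simple threshold rule, the value `entropy_threshold=1.0`, and the class labels are assumptions made for illustration; the abstract does not give the dissertation's actual decision rule.

```python
import math


def pignistic(masses):
    """Pignistic transform: split each focal set's mass equally among its members.

    `masses` maps frozensets of class labels to masses that sum to 1.
    """
    empty_mass = masses.get(frozenset(), 0.0)
    betp = {}
    for focal, mass in masses.items():
        if not focal:
            continue  # mass on the empty set is normalised away
        share = mass / len(focal) / (1.0 - empty_mass)
        for label in focal:
            betp[label] = betp.get(label, 0.0) + share
    return betp


def shannon_entropy(probs):
    """Entropy in bits of a dict of probabilities."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)


def decide(masses, entropy_threshold=1.0):
    """Hypothetical rule: high entropy means no known class fits, so report 'unknown'."""
    betp = pignistic(masses)
    if shannon_entropy(betp) > entropy_threshold:
        return "unknown"
    return max(betp, key=betp.get)


frame = frozenset({"normal", "dos", "probe"})
confident = {frozenset({"normal"}): 0.8, frozenset({"dos"}): 0.05, frame: 0.15}
ambiguous = {frame: 1.0}  # all mass on the whole frame: total ignorance
```

    With these toy inputs, `decide(confident)` picks the known class with the largest pignistic probability, while `decide(ambiguous)` falls back to the unknown-attack label because the pignistic distribution is uniform.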

    Network intrusion detection with sensor fusion : performance bounds and benchmarks

    Abstract: The achievable performance of an intrusion detection system is unknown beforehand. Currently, intrusion detection researchers must implement their systems before they can determine how well those systems will perform, or compare them against existing systems in order to evaluate them. Another challenge for network researchers is the unavailability of real-world traffic traces of network activities due to privacy and legal restrictions. This thesis makes two contributions to the literature. First, it presents the achievable performances of the existing anomaly and learning based network intrusion detection systems (NIDSs) in detecting Transmission Control Protocol (TCP) synchronised (SYN) flooding attacks. Two anomaly based algorithms, the adaptive threshold and cumulative sum based algorithms, were considered in building the anomaly based NIDSs. The logical OR operator was used to combine the outcomes of the two anomaly based algorithms to enhance their performance. The three algorithms were used to detect TCP SYN flooding attacks that were synthetically generated according to a Poisson process and constant interarrival times. The logical OR operator performed better than the two individual algorithms. The three algorithms detected the Poisson process attacks better than the constant interarrival time attacks. For the learning based NIDSs, a decision tree and a novel fuzzy logic based NIDS were used to detect Neptune, which is a type of TCP SYN flooding attack. The decision tree outperformed the fuzzy logic system. Second, it provides the achievable upper bounds on the accuracies of two ensemble-of-classifiers based NIDSs. The first NIDS is an AdaBoost based ensemble that uses a decision stump as a base learner. The second NIDS is a Bagging based ensemble that uses a decision tree as a base learner. 
The obtained bounds will enable researchers to estimate the performance of their ensemble based NIDSs before they implement them and to determine how well their ensemble based NIDSs are performing relative to these bounds. From the empirical studies, it was deduced that if the dataset entropy with respect to the features falls between 0.9578 and 0.9586 and the average information gain amongst the features used in the ensemble falls between 0.045615 and 0.25615, then the accuracy of the first NIDS will be at most 0.9065 and the accuracy of the second NIDS will be at most 0.9193. These ensemble accuracy upper bounds hold irrespective of the attack or dataset, provided that the features used in the ensemble (AdaBoosted decision stump ensemble or Bagged decision tree ensemble) have the same characteristics as the features used in this thesis and are discretised in the same way as in this work... D.Phil.
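
    The abstract names the two anomaly detectors and the OR combination but gives no formulas or parameters. The sketch below uses simplified textbook forms of an adaptive threshold test and a one-sided CUSUM test on per-interval SYN counts. Every parameter value (`alpha`, `beta`, `drift`, `threshold`) and the toy traffic series are assumptions for illustration, not the thesis's settings.

```python
def adaptive_threshold(counts, alpha=0.5, beta=0.9):
    """Alarm when a count exceeds (1 + alpha) times the running mean of earlier counts."""
    mean = counts[0]
    alarms = [False]
    for x in counts[1:]:
        alarms.append(x > (1 + alpha) * mean)
        mean = beta * mean + (1 - beta) * x  # exponentially weighted moving average
    return alarms


def cusum(counts, drift=5.0, threshold=20.0, beta=0.9):
    """Alarm when the accumulated excess over (running mean + drift) passes the threshold."""
    mean = counts[0]
    g = 0.0
    alarms = [False]
    for x in counts[1:]:
        g = max(0.0, g + (x - mean - drift))
        alarms.append(g > threshold)
        mean = beta * mean + (1 - beta) * x
    return alarms


def or_fusion(alarms_a, alarms_b):
    """Raise an alarm whenever either detector does."""
    return [a or b for a, b in zip(alarms_a, alarms_b)]


# Hypothetical SYN counts per interval: a sharp flood and a low-rate sustained increase.
spike = [100, 102, 98, 101, 300, 310, 305]
ramp = [100, 100, 100, 115, 115, 115, 115, 115]
fused_ramp = or_fusion(adaptive_threshold(ramp), cusum(ramp))
```

    In this toy data the threshold test catches the sharp flood immediately but never fires on the low-rate increase, which only CUSUM accumulates enough evidence to flag. OR fusion inherits the detections of both, at the cost of also inheriting both detectors' false alarms.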

    Multimedia Decision Fusion

    Ph.D. (Doctor of Philosophy)

    Context-dependent fusion with application to landmine detection.

    Traditional machine learning and pattern recognition systems use a feature descriptor to describe the sensor data and a particular classifier (also called an expert or learner) to determine the true class of a given pattern. However, for complex detection and classification problems involving data with large intra-class variations and noisy inputs, no single source of information can provide a satisfactory solution. As a result, the combination of multiple classifiers is playing an increasing role in solving these complex pattern recognition problems, and has proven to be a viable alternative to using a single classifier. In this thesis we introduce a new Context-Dependent Fusion (CDF) approach. We use this method to fuse multiple algorithms which use different types of features and different classification methods on multiple sensor data. The proposed approach is motivated by the observation that there is no single algorithm that can consistently outperform all other algorithms. In fact, the relative performance of different algorithms can vary significantly depending on several factors, such as the extracted features and the characteristics of the target class. The CDF method is a local approach that adapts the fusion method to different regions of the feature space. The goal is to take advantage of the strengths of a few algorithms in different regions of the feature space without being affected by the weaknesses of the other algorithms, while also avoiding the loss of potentially valuable information provided by a few weak classifiers by considering their output as well. The proposed fusion has three main interacting components. The first component, called Context Extraction, partitions the composite feature space into groups of similar signatures, or contexts. Then, the second component assigns an aggregation weight to each detector's decision in each context based on its relative performance within the context. 
The third component combines the multiple decisions, using the learned weights, to make a final decision. For the Context Extraction component, a novel algorithm that performs clustering and feature discrimination is used to cluster the composite feature space and identify the relevant features for each cluster. For the fusion component, six different methods were proposed and investigated. The proposed approaches were applied to the problem of landmine detection. Detection and removal of landmines is a serious problem affecting civilians and soldiers worldwide. Several landmine detection algorithms have been proposed. Extensive testing of these methods has shown that the relative performance of different detectors can vary significantly depending on the mine type, geographical site, soil and weather conditions, burial depth, etc. Therefore, multi-algorithm and multi-sensor fusion is a critical component in landmine detection. Results on large and diverse real data collections show that the proposed method can identify meaningful and coherent clusters and that different expert algorithms can be identified for the different contexts. Our experiments have also indicated that the context-dependent fusion outperforms all individual detectors and several global fusion methods.
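
    The three components described above (context extraction, per-context weighting, weighted combination) can be sketched as follows. This is not the thesis's algorithm: nearest-centroid assignment to precomputed centroids stands in for its clustering and feature discrimination step, and accuracy-proportional weights stand in for its six fusion methods, which the abstract does not specify. All names and the toy data are hypothetical.

```python
def nearest_context(x, centroids):
    """Index of the centroid closest to feature vector x (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda c: sum((xi - ci) ** 2 for xi, ci in zip(x, centroids[c])),
    )


def learn_weights(features, detector_outputs, labels, centroids):
    """Per context, weight each detector by its share of correct decisions there.

    detector_outputs[i][d] is detector d's confidence in [0, 1] for sample i,
    thresholded at 0.5; labels[i] is 0 or 1.
    """
    n_det = len(detector_outputs[0])
    correct = [[0] * n_det for _ in centroids]
    for x, outs, y in zip(features, detector_outputs, labels):
        c = nearest_context(x, centroids)
        for d, conf in enumerate(outs):
            correct[c][d] += int((conf >= 0.5) == bool(y))
    weights = []
    for row in correct:
        total = sum(row)
        # Fall back to uniform weights if no detector was ever right in this context.
        weights.append([r / total for r in row] if total else [1.0 / n_det] * n_det)
    return weights


def fuse(x, outs, centroids, weights):
    """Weighted average of detector confidences using the weights of x's context."""
    c = nearest_context(x, centroids)
    return sum(w * o for w, o in zip(weights[c], outs))


# Toy data: detector 0 is reliable near centroid 0, detector 1 near centroid 1.
centroids = [(0.0,), (10.0,)]
features = [(0.5,), (1.0,), (9.0,), (10.5,)]
detector_outputs = [[0.9, 0.2], [0.1, 0.8], [0.3, 0.9], [0.6, 0.1]]
labels = [1, 0, 1, 0]
weights = learn_weights(features, detector_outputs, labels, centroids)
```

    Because the weights are learned separately per context, the same pair of detector confidences can yield different fused scores depending on where the sample falls in the feature space, which is the local behaviour the abstract contrasts with global fusion.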

    Partner selection in sustainable supply chains: a fuzzy ensemble learning model

    With the increasing demands on businesses to operate more sustainably, firms must ensure that the sustainability performance of their whole supply chain is optimized. As partner selection is critical to supply chain management, focal firms now need to select supply chain partners that can offer a high level of competence in sustainability. This paper proposes a novel multi-partner classification model for the partner qualification and classification process, combining ensemble learning technology and fuzzy set theory. The proposed model enables potential partners to be classified into one of four categories (strategic partner, preference partner, leverage partner and routine partner), thereby allowing distinctive partner management strategies to be applied for each category. The model provides for the simultaneous optimization of both efficiency in its use of multi-partner and multi-dimension evaluation data, and effectiveness in dealing with the vagueness and uncertainty of linguistic commentary data. Compared to more conventional methods, the proposed model has the advantage of offering a simple classification and a stable prediction performance. The practical efficacy of the model is illustrated by an application in a listed electronic equipment and instrument manufacturing company based in southeastern China.

    Text Categorization and Machine Learning Methods: Current State Of The Art

    In this information age, many documents are available in digital form and need to be classified by their text. To solve this problem, researchers have focused on machine learning techniques: a general inductive process automatically builds a classifier by learning, from a set of pre-classified documents, the characteristics of the categories. The main benefits of this approach over the manual definition of a classifier by domain experts are good effectiveness, reduced expert labor and straightforward portability to different domains. The paper examines the main approaches to text categorization, comparing the machine learning paradigm with the present state of the art. Various issues pertaining to three different text similarity problems, namely semantic, conceptual and contextual, are also discussed.

    Linear and Order Statistics Combiners for Pattern Classification

    Several researchers have experimentally shown that substantial improvements can be obtained in difficult pattern recognition problems by combining or integrating the outputs of multiple classifiers. This chapter provides an analytical framework to quantify the improvements in classification results due to combining. The results apply to both linear combiners and order statistics combiners. We first show that, to a first order approximation, the error rate obtained over and above the Bayes error rate is directly proportional to the variance of the actual decision boundaries around the Bayes optimum boundary. Combining classifiers in output space reduces this variance, and hence reduces the "added" error. If N unbiased classifiers are combined by simple averaging, the added error rate can be reduced by a factor of N if the individual errors in approximating the decision boundaries are uncorrelated. Expressions are then derived for linear combiners which are biased or correlated, and the effect of output correlations on ensemble performance is quantified. For order statistics based non-linear combiners, we derive expressions that indicate how much the median, the maximum and in general the ith order statistic can improve classifier performance. The analysis presented here facilitates the understanding of the relationships among error rates, classifier boundary distributions, and combining in output space. Experimental results on several public domain data sets are provided to illustrate the benefits of combining and to support the analytical results. Comment: 31 pages
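
    The factor-of-N claim rests on a basic variance identity: averaging N zero-mean errors of equal variance and pairwise correlation delta divides the variance by N / (1 + delta (N - 1)), which equals N when delta is 0. The sketch below checks this numerically for the uncorrelated case. It models each classifier's boundary error as an independent Gaussian offset, which is an assumption made for illustration; the function names, trial count and seed are hypothetical.

```python
import random
import statistics


def variance_reduction_factor(n, delta=0.0):
    """Variance reduction from averaging n equal-variance errors with pairwise correlation delta."""
    return n / (1 + delta * (n - 1))


def simulate(n, trials=20000, sigma=1.0, seed=0):
    """Empirical ratio of a single classifier's error variance to the averaged ensemble's."""
    rng = random.Random(seed)
    single, averaged = [], []
    for _ in range(trials):
        # Independent zero-mean boundary offsets, one per classifier.
        offsets = [rng.gauss(0.0, sigma) for _ in range(n)]
        single.append(offsets[0])
        averaged.append(sum(offsets) / n)
    return statistics.pvariance(single) / statistics.pvariance(averaged)


empirical_ratio = simulate(5)
```

    For five classifiers the empirical ratio should come out close to 5, and `variance_reduction_factor(5, 1.0)` returns 1.0, reflecting that averaging perfectly correlated classifiers gives no benefit.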