4 research outputs found

    Classifier-Independent Feature Selection for Two-Stage Feature Selection

    The effectiveness of classifier-independent feature selection is described. The aim is to remove garbage features and to improve the classification accuracy of all the practical classifiers compared with the situation where all the given features are used. Two algorithms of classifier-independent feature selection and two other conventional classifier-specific algorithms are compared on three sets of real data. In addition, two-stage feature selection is proposed.
    1 Introduction
    Feature selection is one of the most important issues in pattern recognition. The process is very useful for (1) reducing the cost of extracting features, (2) improving the classification accuracy of a practical classifier, and (3) improving the reliability of the estimation of the performance. A large number of algorithms have been proposed for feature selection, and some comparative studies [1, 2, 3] have also been carried out. Most algorithms for feature selection use a criterion based on a specific ..

    A machine learning-based investigation of cloud service attacks

    In this thesis, the security challenges of cloud computing are investigated in the Infrastructure as a Service (IaaS) layer, as security is one of the major concerns related to cloud services. As IaaS consists of different security terms, the research has been further narrowed down to focus on Network Layer Security. A review of existing research revealed that several types of attacks and threats can affect cloud security. Therefore, there is a need for intrusion defence implementations to protect cloud services. Intrusion Detection (ID) is one of the most effective solutions for reacting to cloud network attacks. [Continues.]

    Model-based Biomarker Detection and Systematic Analysis in Translational Science

    This dissertation is concerned with the application of mathematical modeling and statistical signal processing to the rapidly expanding fields of proteomics and genomics. The research is guided by a translational goal which drives the problem formalization and experimental design, and leads to optimization, prediction and control of the underlying system. The dissertation comprises three interconnected subjects. In the first part of the dissertation, two Bayesian peptide detection algorithms are proposed to optimize the feature extraction step, which is the most fundamental step in mass spectrometry-based proteomics. The algorithms are designed to tackle data processing challenges that are not satisfactorily addressed by existing methods. In contrast to most existing methods, the proposed algorithms perform deisotoping and deconvolution of mass spectra simultaneously, which enables better identification of weak peptide signals. Unlike greedy template-matching algorithms, the proposed methods can handle complex spectra where features overlap. The proposed methods achieve better sensitivity and accuracy compared to many popular software packages such as msInspect. In the second part of the dissertation, we consider modeling and assessing the entire mass spectrometry-based proteomic data analysis pipeline. Different modules are identified and analyzed, resulting in a framework that captures key factors in system performance. The effects of various model parameters on protein identification rates and quantification errors, differential expression results, and classification performance are examined. The proposed pipeline model can be used to aid experimental design, pinpoint critical bottlenecks, optimize the workflow, and predict biomarker discovery results. Finally, the same system methodology is extended to analyze the workflow in DNA microarray experiments. A model-based approach is developed to explore the relationship among microarray data properties, missing value imputation, and sample classification in a complicated data analysis pipeline. The situations when it is suitable to apply missing value imputation are identified and recommendations regarding imputation are provided. In addition, a missing value rate-related peaking phenomenon is uncovered