Search CORE

85,115 research outputs found

A scalable saliency-based Feature selection method with instance level information

Author: Alonso-Betanzos Amparo
Bolón-Canedo Verónica
Cancela Brais
Gama João
Publication venue: 'Elsevier BV'
Publication date: 30/04/2019
Field of study

Classic feature selection techniques remove those features that are either irrelevant or redundant, achieving a subset of relevant features that help to provide a better knowledge extraction. This allows the creation of compact models that are easier to interpret. Most of these techniques work over the whole dataset, but they are unable to provide the user with successful information when only instance information is needed. In short, given any example, classic feature selection algorithms do not give any information about which the most relevant information is, regarding this sample. This work aims to overcome this handicap by developing a novel feature selection method, called Saliency-based Feature Selection (SFS), based in deep-learning saliency techniques. Our experimental results will prove that this algorithm can be successfully used not only in Neural Networks, but also under any given architecture trained by using Gradient Descent techniques

arXiv.org e-Print Archive

Repositorio da Universidade da Coruña

Machine Learning and Integrative Analysis of Biomedical Big Data.

Author: Choi Howard
Chung Neo Christopher
Mirza Bilal
Ping Peipei
Wang Jie
Wang Wei
Publication venue: eScholarship, University of California
Publication date: 01/01/2019
Field of study

Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges as well as exacerbates the ones associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues

Multidisciplinary Digital Publishing Institute

Ezid

Directory of Open Access Journals

eScholarship - University of California

Comparison of Deep Learning and the Classical Machine Learning Algorithm for the Malware Detection

Author: Rathore Hemant
Sahay Sanjay K.
Sewak Mohit
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 16/09/2018
Field of study

Recently, Deep Learning has been showing promising results in various Artificial Intelligence applications like image recognition, natural language processing, language modeling, neural machine translation, etc. Although, in general, it is computationally more expensive as compared to classical machine learning techniques, their results are found to be more effective in some cases. Therefore, in this paper, we investigated and compared one of the Deep Learning Architecture called Deep Neural Network (DNN) with the classical Random Forest (RF) machine learning algorithm for the malware classification. We studied the performance of the classical RF and DNN with 2, 4 & 7 layers architectures with the four different feature sets, and found that irrespective of the features inputs, the classical RF accuracy outperforms the DNN.Comment: 11 Pages, 1 figur

arXiv.org e-Print Archive

Crossref

An investigation of a deep learning based malware detection system

Author: David O. E.
Kolosnjaji Bojan
Lin Chih-Ta
Pascanu R.
Sahay Sanjay K.
Siddiqui Muazzam
Srivastava Nitish
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 16/09/2018
Field of study

We investigate a Deep Learning based system for malware detection. In the investigation, we experiment with different combination of Deep Learning architectures including Auto-Encoders, and Deep Neural Networks with varying layers over Malicia malware dataset on which earlier studies have obtained an accuracy of (98%) with an acceptable False Positive Rates (1.07%). But these results were done using extensive man-made custom domain features and investing corresponding feature engineering and design efforts. In our proposed approach, besides improving the previous best results (99.21% accuracy and a False Positive Rate of 0.19%) indicates that Deep Learning based systems could deliver an effective defense against malware. Since it is good in automatically extracting higher conceptual features from the data, Deep Learning based systems could provide an effective, general and scalable mechanism for detection of existing and unknown malware.Comment: 13 Pages, 4 figure

arXiv.org e-Print Archive

Crossref

Higher order feature extraction and selection for robust human gesture recognition using CSI of COTS Wi-Fi devices

Author: Ahmad Hafisoh
Ahmed Hasmath Farhana
Harkat Houda
Narasingamurthi Kulasekharan
Phang Swee King
Vaithilingam Chockalingam
Publication venue: 'MDPI AG'
Publication date: 04/07/2019
Field of study

Device-free human gesture recognition (HGR) using commercial o the shelf (COTS) Wi-Fi devices has gained attention with recent advances in wireless technology. HGR recognizes the human activity performed, by capturing the reflections ofWi-Fi signals from moving humans and storing them as raw channel state information (CSI) traces. Existing work on HGR applies noise reduction and transformation to pre-process the raw CSI traces. However, these methods fail to capture the non-Gaussian information in the raw CSI data due to its limitation to deal with linear signal representation alone. The proposed higher order statistics-based recognition (HOS-Re) model extracts higher order statistical (HOS) features from raw CSI traces and selects a robust feature subset for the recognition task. HOS-Re addresses the limitations in the existing methods, by extracting third order cumulant features that maximizes the recognition accuracy. Subsequently, feature selection methods derived from information theory construct a robust and highly informative feature subset, fed as input to the multilevel support vector machine (SVM) classifier in order to measure the performance. The proposed methodology is validated using a public database SignFi, consisting of 276 gestures with 8280 gesture instances, out of which 5520 are from the laboratory and 2760 from the home environment using a 10 5 cross-validation. HOS-Re achieved an average recognition accuracy of 97.84%, 98.26% and 96.34% for the lab, home and lab + home environment respectively. The average recognition accuracy for 150 sign gestures with 7500 instances, collected from five di erent users was 96.23% in the laboratory environment.Taylor's University through its TAYLOR'S PhD SCHOLARSHIP Programmeinfo:eu-repo/semantics/publishedVersio

Multidisciplinary Digital Publishing Institute

Sapientia