A scalable saliency-based Feature selection method with instance level information
Classic feature selection techniques remove those features that are either
irrelevant or redundant, achieving a subset of relevant features that help to
provide a better knowledge extraction. This allows the creation of compact
models that are easier to interpret. Most of these techniques work over the
whole dataset, but they cannot provide the user with useful information
when only instance-level information is needed. In short, given any
example, classic feature selection algorithms give no information about
which features are most relevant for that sample. This work aims
to overcome this handicap by developing a novel feature selection method,
called Saliency-based Feature Selection (SFS), based on deep-learning saliency
techniques. Our experimental results show that this algorithm can be
successfully used not only in Neural Networks, but also under any given
architecture trained by using Gradient Descent techniques.
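The idea of instance-level saliency can be sketched as follows. This is a generic gradient-saliency illustration and not the authors' SFS algorithm: the model `toy_model`, the helper names, and the use of central finite differences in place of backpropagated gradients are all assumptions made for the example.

```python
import math


def numerical_saliency(model, x, eps=1e-5):
    """Approximate |d model / d x_j| for each feature j at the single instance x."""
    saliency = []
    for j in range(len(x)):
        up = list(x)
        down = list(x)
        up[j] += eps
        down[j] -= eps
        # Central finite difference stands in for a backpropagated gradient.
        grad_j = (model(up) - model(down)) / (2 * eps)
        saliency.append(abs(grad_j))
    return saliency


def rank_features(model, x):
    """Return feature indices ordered from most to least salient for instance x."""
    saliency = numerical_saliency(model, x)
    return sorted(range(len(x)), key=lambda j: saliency[j], reverse=True)


def toy_model(x):
    # Hypothetical differentiable model, chosen only for illustration.
    return math.tanh(x[0] * x[1]) + 0.1 * x[2]


# Two instances scored by the same model can have different top features.
print(rank_features(toy_model, [0.0, 2.0, 5.0]))
print(rank_features(toy_model, [3.0, 0.0, 1.0]))
```

In this toy setting the first instance ranks feature 0 highest and the second ranks feature 1 highest, which is the kind of per-sample information a dataset-wide selector does not expose.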
Machine Learning and Integrative Analysis of Biomedical Big Data.
Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates the ones associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.
Comparison of Deep Learning and the Classical Machine Learning Algorithm for the Malware Detection
Recently, Deep Learning has been showing promising results in various
Artificial Intelligence applications like image recognition, natural language
processing, language modeling, neural machine translation, etc. Although, in
general, it is computationally more expensive than classical machine
learning techniques, its results are found to be more effective in some
cases. Therefore, in this paper, we investigated and compared a Deep
Learning architecture, the Deep Neural Network (DNN), with the classical
Random Forest (RF) machine learning algorithm for malware classification.
We studied the performance of the classical RF and of DNNs with 2-, 4- and 7-layer
architectures on four different feature sets, and found that, irrespective
of the feature inputs, the classical RF outperforms the DNN in accuracy.
Comment: 11 pages, 1 figure
An investigation of a deep learning based malware detection system
We investigate a Deep Learning based system for malware detection. In the
investigation, we experiment with different combinations of Deep Learning
architectures, including Auto-Encoders and Deep Neural Networks with varying
layers, over the Malicia malware dataset, on which earlier studies have obtained an
accuracy of 98% with an acceptable False Positive Rate (1.07%). But these
results were obtained using extensive hand-crafted, domain-specific features and
the corresponding feature engineering and design effort. Our proposed approach,
besides improving on the previous best results (to 99.21% accuracy and a False
Positive Rate of 0.19%), indicates that Deep Learning based systems could
deliver an effective defense against malware. Since it is good at automatically
extracting higher conceptual features from the data, Deep Learning based
systems could provide an effective, general and scalable mechanism for
detection of existing and unknown malware.
Comment: 13 pages, 4 figures
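The two metrics quoted above, accuracy and False Positive Rate, can be computed from a confusion matrix as in the minimal sketch below. The function name and the toy labels are hypothetical and are not taken from the paper or the Malicia dataset; label 1 is assumed to mean "malware" and 0 "benign".

```python
def accuracy_and_fpr(y_true, y_pred):
    """Return (accuracy, false positive rate) for binary labels (1 = malware)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1  # benign sample flagged as malware
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1  # malware sample missed
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    # FPR is the share of benign samples that were wrongly flagged.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, fpr


# Toy labels, invented for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
print(accuracy_and_fpr(y_true, y_pred))  # (0.75, 0.2)
```

Reporting both numbers matters for malware detection because a high accuracy can hide a False Positive Rate that is too large for a heavily benign workload.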
Higher order feature extraction and selection for robust human gesture recognition using CSI of COTS Wi-Fi devices
Device-free human gesture recognition (HGR) using commercial off-the-shelf (COTS) Wi-Fi
devices has gained attention with recent advances in wireless technology. HGR recognizes the human
activity performed, by capturing the reflections of Wi-Fi signals from moving humans and storing
them as raw channel state information (CSI) traces. Existing work on HGR applies noise reduction
and transformation to pre-process the raw CSI traces. However, these methods fail to capture
the non-Gaussian information in the raw CSI data because they are limited to linear signal
representations. The proposed higher order statistics-based recognition (HOS-Re) model extracts
higher order statistical (HOS) features from raw CSI traces and selects a robust feature subset for the
recognition task. HOS-Re addresses the limitations of the existing methods by extracting third order
cumulant features that maximize the recognition accuracy. Subsequently, feature selection methods
derived from information theory construct a robust and highly informative feature subset, fed as
input to the multilevel support vector machine (SVM) classifier in order to measure the performance.
The proposed methodology is validated using a public database SignFi, consisting of 276 gestures
with 8280 gesture instances, out of which 5520 are from the laboratory and 2760 from the home
environment, using 10 × 5 cross-validation. HOS-Re achieved an average recognition accuracy of
97.84%, 98.26% and 96.34% for the lab, home and lab + home environment respectively. The average
recognition accuracy for 150 sign gestures with 7500 instances, collected from five different users, was
96.23% in the laboratory environment.
Taylor's University through its TAYLOR'S PhD SCHOLARSHIP Programme
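A third order cumulant of the kind mentioned in the abstract can be estimated as in the sketch below. It is not the HOS-Re pipeline: it assumes a real-valued one-dimensional signal (CSI traces are complex-valued), uses the standard biased sample estimate C3(τ1, τ2) = E[x(n) x(n+τ1) x(n+τ2)] on the mean-removed signal, and the function name and test signals are hypothetical.

```python
def third_order_cumulant(signal, lag1, lag2):
    """Biased sample estimate of the third order cumulant at lags (lag1, lag2).

    For a mean-removed real signal x, this averages x[n] * x[n+lag1] * x[n+lag2]
    over every n for which all three indices are valid.
    """
    n = len(signal)
    if n == 0:
        raise ValueError("signal must not be empty")
    mean = sum(signal) / n
    x = [v - mean for v in signal]
    start = max(0, -lag1, -lag2)
    stop = min(n, n - lag1, n - lag2)
    if stop <= start:
        raise ValueError("lags are too large for this signal length")
    total = sum(x[i] * x[i + lag1] * x[i + lag2] for i in range(start, stop))
    return total / n  # dividing by n (not the overlap length) gives the biased estimate


# A symmetric signal has zero skewness, so its zero-lag cumulant vanishes.
print(third_order_cumulant([-1.0, 0.0, 1.0], 0, 0))
# A skewed signal yields a non-zero value that second-order statistics would not capture.
print(third_order_cumulant([0.0, 0.0, 0.0, 4.0], 0, 0))
```

Third order cumulants are identically zero for Gaussian processes, which is why features of this kind are used to expose non-Gaussian structure.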