8,759 research outputs found
Improving Classification Performance through an Advanced Ensemble Based Heterogeneous Extreme Learning Machines
Extreme Learning Machine (ELM) is a fast learning algorithm for single-hidden-layer feedforward neural networks (SLFNs). It often generalizes well, but it can overfit the training data when it has more hidden nodes than needed. To improve generalization, we use a heterogeneous ensemble approach. We propose an Advanced ELM Ensemble (AELME) for classification, which includes Regularized-ELM, L2-norm-optimized ELM (ELML2), and Kernel-ELM. The ensemble is constructed by training a randomly chosen ELM classifier on a subset of the training data selected through random resampling. The proposed AELME is evolved using an objective function that increases diversity and accuracy within the final ensemble. Finally, the class label of unseen data is predicted by majority vote. Splitting the training data into subsets and incorporating heterogeneous ELM classifiers result in higher prediction accuracy, better generalization, and fewer base classifiers than other models (Adaboost, Bagging, Dynamic ELM ensemble, data splitting ELM ensemble, and ELM ensemble). The validity of AELME is confirmed through classification on several real-world benchmark datasets.
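The resample-then-vote idea in this abstract can be illustrated with a minimal sketch. It is not the authors' AELME: it assumes NumPy, uses only a regularized ELM as the base learner (heterogeneity is imitated by randomizing the hidden-layer size and the regularization constant), and omits the diversity/accuracy-driven evolution step. All function names and hyperparameter ranges are hypothetical.

```python
import numpy as np

def train_relm(X, y, n_classes, n_hidden, C, rng):
    """Regularized ELM: random, untrained hidden layer; ridge-solved output weights."""
    W = rng.normal(size=(X.shape[1], n_hidden))    # random input weights
    b = rng.normal(size=n_hidden)                  # random hidden biases
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))         # sigmoid hidden activations
    T = np.eye(n_classes)[y]                       # one-hot targets
    beta = np.linalg.solve(H.T @ H + np.eye(n_hidden) / C, H.T @ T)
    return W, b, beta

def predict_relm(model, X):
    W, b, beta = model
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return np.argmax(H @ beta, axis=1)

def majority_vote(votes, n_classes):
    """votes: (n_models, n_samples) integer labels. Ties go to the lowest label."""
    counts = np.apply_along_axis(np.bincount, 0, votes, minlength=n_classes)
    return counts.argmax(axis=0)

def train_ensemble(X, y, n_models=7, subset_frac=0.7, seed=0):
    rng = np.random.default_rng(seed)
    n_classes = int(y.max()) + 1
    models = []
    for _ in range(n_models):
        # Random resampling (with replacement) of the training data.
        idx = rng.choice(len(X), size=int(subset_frac * len(X)), replace=True)
        # Assumed stand-in for heterogeneity: vary the base learner's hyperparameters.
        n_hidden = int(rng.integers(10, 40))
        C = float(rng.choice([0.1, 1.0, 10.0]))
        models.append(train_relm(X[idx], y[idx], n_classes, n_hidden, C, rng))
    return models

def predict_ensemble(models, X):
    votes = np.stack([predict_relm(m, X) for m in models])
    n_classes = models[0][2].shape[1]              # columns of beta = number of classes
    return majority_vote(votes, n_classes)

# Usage on synthetic two-class data (illustrative only).
demo_rng = np.random.default_rng(42)
X_demo = np.vstack([demo_rng.normal(size=(60, 2)) - 3, demo_rng.normal(size=(60, 2)) + 3])
y_demo = np.array([0] * 60 + [1] * 60)
ensemble = train_ensemble(X_demo, y_demo)
pred_demo = predict_ensemble(ensemble, X_demo)
```

A faithful version would swap in the three ELM variants named in the abstract and select ensemble members with the diversity and accuracy objective.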
Machine Learning and Integrative Analysis of Biomedical Big Data.
Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.
A novel improved model for building energy consumption prediction based on model integration
Building energy consumption prediction plays an irreplaceable role in energy planning, management, and conservation. Constantly improving the performance of prediction models is the key to ensuring the efficient operation of energy systems. Moreover, accuracy is no longer the only measure of model performance; it is more important to evaluate the model from multiple perspectives, considering the characteristics of engineering applications. Based on the idea of model integration, this paper proposes a novel improved integration model (stacking model) that can be used to forecast building energy consumption. The stacking model combines the advantages of various base prediction algorithms and turns their outputs into “meta-features” so that the final model can observe datasets from different spatial and structural angles. Two cases are used to demonstrate practical engineering applications of the stacking model. A comparative analysis evaluates the prediction performance of the stacking model against existing well-known prediction models including Random Forest, Gradient Boosted Decision Tree, Extreme Gradient Boosting, Support Vector Machine, and K-Nearest Neighbor. The results indicate that the stacking method achieves better performance than the other models in accuracy (improvement of 9.5%–31.6% for Case A and 16.2%–49.4% for Case B), generalization (improvement of 6.7%–29.5% for Case A and 7.1%–34.6% for Case B), and robustness (improvement of 1.5%–34.1% for Case A and 1.8%–19.3% for Case B). The proposed model enriches the diversity of algorithm libraries of empirical models.
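How base-model predictions become "meta-features" can be sketched as follows. This is a generic stacking illustration under stated assumptions, not the paper's model: it assumes NumPy, uses two toy base regressors (ridge regression and k-nearest neighbours) instead of the algorithms the paper combines, builds the meta-features from out-of-fold predictions, and fits a ridge meta-learner. All names are hypothetical.

```python
import numpy as np

def fit_ridge(X, y, lam=1.0):
    """Ridge regression with a bias column; returns a predict function."""
    Xb = np.c_[X, np.ones(len(X))]
    w = np.linalg.solve(Xb.T @ Xb + lam * np.eye(Xb.shape[1]), Xb.T @ y)
    return lambda Z: np.c_[Z, np.ones(len(Z))] @ w

def fit_knn(X, y, k=3):
    """k-nearest-neighbour regressor (mean target of the k closest points)."""
    def predict(Z):
        d = ((Z[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
        nearest = np.argsort(d, axis=1)[:, :k]
        return y[nearest].mean(axis=1)
    return predict

def fit_stacking(X, y, base_fitters, n_folds=5, seed=0):
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    meta = np.zeros((len(X), len(base_fitters)))
    for held_out in folds:
        train = np.setdiff1d(np.arange(len(X)), held_out)
        for j, fit in enumerate(base_fitters):
            # Out-of-fold predictions: each row is predicted by a model
            # that never saw that row during training.
            meta[held_out, j] = fit(X[train], y[train])(X[held_out])
    meta_model = fit_ridge(meta, y, lam=1e-3)           # meta-learner on meta-features
    base_models = [fit(X, y) for fit in base_fitters]   # refit base models on all data
    return lambda Z: meta_model(np.column_stack([m(Z) for m in base_models]))

# Usage on synthetic data (illustrative only).
stack_rng = np.random.default_rng(1)
X_stack = stack_rng.normal(size=(100, 2))
y_stack = 2 * X_stack[:, 0] - X_stack[:, 1] + 0.1 * stack_rng.normal(size=100)
stack_predict = fit_stacking(X_stack, y_stack, [fit_ridge, fit_knn])
y_hat = stack_predict(X_stack)
```

Using out-of-fold predictions keeps the meta-learner from simply rewarding whichever base model memorizes the training set best.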
A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community
In recent years, deep learning (DL), a re-branding of neural networks (NNs),
has risen to the top in numerous areas, namely computer vision (CV), speech
recognition, natural language processing, etc. Whereas remote sensing (RS)
possesses a number of unique challenges, primarily related to sensors and
applications, inevitably RS draws from many of the same theories as CV; e.g.,
statistics, fusion, and machine learning, to name a few. This means that the RS
community should be aware of, if not at the leading edge of, advancements
like DL. Herein, we provide the most comprehensive survey of state-of-the-art
RS DL research. We also review recent new developments in the DL field that can
be used in DL for RS. Namely, we focus on theories, tools and challenges for
the RS community. Specifically, we focus on unsolved challenges and
opportunities as they relate to (i) inadequate data sets, (ii)
human-understandable solutions for modelling physical phenomena, (iii) Big
Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and
learning algorithms for spectral, spatial and temporal data, (vi) transfer
learning, (vii) an improved theoretical understanding of DL systems, (viii)
high barriers to entry, and (ix) training and optimizing the DL.
Comment: 64 pages, 411 references. To appear in Journal of Applied Remote Sensing
Artificial intelligence in the cyber domain: Offense and defense
Artificial intelligence techniques have grown rapidly in recent years, and their applications in practice can be seen in many fields, ranging from facial recognition to image analysis. In the cybersecurity domain, AI-based techniques can provide better cyber defense tools and help adversaries improve methods of attack. However, malicious actors are aware of the new prospects too and will probably attempt to use them for nefarious purposes. This survey paper aims at providing an overview of how artificial intelligence can be used in the context of cybersecurity in both offense and defense.
Web of Science, 12, 3, art. no. 41
Investigating Deep Learning Model Calibration for Classification Problems in Mechanics
Recently, there has been a growing interest in applying machine learning
methods to problems in engineering mechanics. In particular, there has been
significant interest in applying deep learning techniques to predicting the
mechanical behavior of heterogeneous materials and structures. Researchers have
shown that deep learning methods are able to effectively predict mechanical
behavior with low error for systems ranging from engineered composites, to
geometrically complex metamaterials, to heterogeneous biological tissue.
However, there has been comparatively little attention paid to deep learning
model calibration, i.e., the match between predicted probabilities of outcomes
and the true probabilities of outcomes. In this work, we perform a
comprehensive investigation into ML model calibration across seven open access
engineering mechanics datasets that cover three distinct types of mechanical
problems. Specifically, we evaluate both model error and model calibration error for
multiple machine learning methods, and investigate the influence of ensemble
averaging and post hoc model calibration via temperature scaling. Overall, we
find that ensemble averaging of deep neural networks is both an effective and
consistent tool for improving model calibration, while temperature scaling has
comparatively limited benefits. Looking forward, we anticipate that this
investigation will lay the foundation for future work in developing
mechanics-specific approaches to deep learning model calibration.
Comment: 21 pages, 9 figures
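The two calibration tools this abstract compares, temperature scaling and ensemble averaging, can be sketched in a few lines. This is a generic illustration under stated assumptions, not the paper's code: it assumes NumPy, picks the temperature by a simple grid search over validation negative log-likelihood (a gradient-based fit is also common), averages softmax probabilities across models, and measures calibration with an equal-width-bin expected calibration error. All names and the grid range are hypothetical.

```python
import numpy as np

def softmax(logits, T=1.0):
    z = logits / T
    z = z - z.max(axis=1, keepdims=True)     # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def nll(probs, labels):
    """Mean negative log-likelihood of the true labels."""
    return float(-np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12)))

def fit_temperature(val_logits, val_labels, grid=None):
    """Post hoc temperature scaling: choose the T that minimizes validation NLL."""
    grid = np.linspace(0.25, 5.0, 96) if grid is None else grid
    losses = [nll(softmax(val_logits, T), val_labels) for T in grid]
    return float(grid[int(np.argmin(losses))])

def ensemble_average(logits_per_model):
    """Average the predicted probabilities of several models."""
    return np.mean([softmax(l) for l in logits_per_model], axis=0)

def expected_calibration_error(probs, labels, n_bins=10):
    conf = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(float)
    bins = np.minimum((conf * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            # Weight each bin's |accuracy - confidence| gap by its share of samples.
            ece += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return float(ece)

# Usage: a synthetic, deliberately overconfident classifier (illustrative only).
cal_rng = np.random.default_rng(0)
true_logits = 2.0 * cal_rng.normal(size=(2000, 3))
true_probs = softmax(true_logits)
cal_labels = np.array([cal_rng.choice(3, p=p) for p in true_probs])
overconfident = 3.0 * true_logits            # logits inflated by a factor of 3
T_hat = fit_temperature(overconfident, cal_labels)
avg_probs = ensemble_average([overconfident, true_logits])
```

Temperature scaling rescales all logits by one scalar, so it changes confidence but never the predicted class; ensemble averaging can change both.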
A Literature Review of Fault Diagnosis Based on Ensemble Learning
The accuracy of fault diagnosis is an important indicator for ensuring the reliability of key equipment systems. Ensemble learning integrates different weak learning methods to obtain a stronger learner and has achieved remarkable results in the field of fault diagnosis. This paper reviews recent research on ensemble learning from both technical and field application perspectives. The paper covers 87 journals from Web of Science and other academic resources, with a total of 209 papers. It summarizes 78 different ensemble learning based fault diagnosis methods, involving 18 public datasets and more than 20 different equipment systems. In detail, the paper summarizes the accuracy rates, fault classification types, fault datasets, data signals used, learners (traditional machine learning or deep learning-based learners), and ensemble learning methods (bagging, boosting, stacking and other ensemble models) of these fault diagnosis models. The paper uses accuracy of fault diagnosis as the main evaluation metric, supplemented by generalization and imbalanced data processing ability, to evaluate the performance of those ensemble learning methods. The discussion and evaluation of these methods provide valuable research references for identifying and developing appropriate intelligent fault diagnosis models for various equipment. This paper also discusses the technical challenges, lessons learned from the review, and future development directions in the field of ensemble learning based fault diagnosis and intelligent maintenance.
The Challenge of Non-Technical Loss Detection using Artificial Intelligence: A Survey
Detection of non-technical losses (NTL), which include electricity theft,
faulty meters or billing errors, has attracted increasing attention from
researchers in electrical engineering and computer science. NTLs cause
significant harm to the economy, as in some countries they may range up to 40%
of the total electricity distributed. The predominant research direction is
employing artificial intelligence to predict whether a customer causes NTL.
This paper first provides an overview of how NTLs are defined and their impact
on economies, which include loss of revenue and profit of electricity providers
and decrease of the stability and reliability of electrical power grids. It
then surveys the state-of-the-art research efforts in an up-to-date and
comprehensive review of algorithms, features and data sets used. It finally
identifies the key scientific and engineering challenges in NTL detection and
suggests how they could be addressed in the future.