An ant colony-based semi-supervised approach for learning classification rules

Author: A Halder
AA Freitas
C Ginestet
C Hsu
D Angus
D Martens
F Otero
Fernando E. B. Otero
Gisele L. Pappa
I Triguero
J Alcalá-Fdez
J Wang
Julio Albinati
L Rokach
M Li
Samuel E. L. Oliveira
X Zhu
ZH Zhou
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 26/11/2015

Semi-supervised learning methods create models from a few labeled instances and a great number of unlabeled instances. They appear as a good option in scenarios where there is a lot of unlabeled data and the process of labeling instances is expensive, such as those where most Web applications stand. This paper proposes a semi-supervised self-training algorithm called Ant-Labeler. Self-training algorithms take advantage of supervised learning algorithms to iteratively learn a model from the labeled instances and then use this model to classify unlabeled instances. The instances that receive labels with high confidence are moved from the unlabeled to the labeled set, and this process is repeated until a stopping criterion is met, such as labeling all unlabeled instances. Ant-Labeler uses an ACO algorithm as the supervised learning method in the self-training procedure to generate interpretable rule-based models, which are used as an ensemble to ensure accurate predictions. The pheromone matrix is reused across different executions of the ACO algorithm to avoid rebuilding the models from scratch every time the labeled set is updated. Results showed that the proposed algorithm obtains better predictive accuracy than three state-of-the-art algorithms in roughly half of the datasets on which it was tested, and the smaller the number of labeled instances, the better the Ant-Labeler performance.

Crossref

Kent Academic Repository
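
The self-training loop described in the Ant-Labeler abstract above can be illustrated with a minimal sketch. This is not the authors' method: a nearest-centroid classifier stands in for the ACO rule learner, there is no ensemble and no pheromone reuse, and the names fit_centroids, predict_with_confidence, self_train and the threshold value are all hypothetical choices made for illustration.

```python
from math import dist


def fit_centroids(X, y):
    """Supervised step: one mean vector per class (stand-in for the ACO rule learner)."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}


def predict_with_confidence(centroids, x):
    """Return (label, confidence) based on the two nearest class centroids.

    Confidence is 1 - d1 / (d1 + d2), an assumed heuristic that ranges
    from 0.5 (tie) to 1.0 (point sits on the nearest centroid).
    """
    ranked = sorted((dist(x, c), label) for label, c in centroids.items())
    d1, label = ranked[0]
    if len(ranked) == 1:
        return label, 1.0
    d2 = ranked[1][0]
    total = d1 + d2
    return label, (1.0 - d1 / total) if total > 0 else 0.5


def self_train(labeled_X, labeled_y, unlabeled_X, threshold=0.8, max_rounds=20):
    """Move confidently labeled instances into the labeled set, round by round."""
    X, y = list(labeled_X), list(labeled_y)
    pool = list(unlabeled_X)
    for _ in range(max_rounds):
        if not pool:
            break  # stopping criterion: every instance has been labeled
        model = fit_centroids(X, y)
        remaining, moved = [], 0
        for x in pool:
            label, conf = predict_with_confidence(model, x)
            if conf >= threshold:
                X.append(x)
                y.append(label)
                moved += 1
            else:
                remaining.append(x)
        pool = remaining
        if moved == 0:
            break  # stopping criterion: no instance cleared the threshold
    return fit_centroids(X, y), X, y, pool
```

Calling self_train with two labeled points and a handful of unlabeled ones returns the final model, the grown labeled set, and any instances that never reached the confidence threshold.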

Artificial intelligence in the cyber domain: Offense and defense

Author: Diep Quoc Bao
Truong Thanh Cong
Zelinka Ivan
Publication venue: 'MDPI AG'
Publication date: 01/01/2020

Artificial intelligence techniques have grown rapidly in recent years, and their applications in practice can be seen in many fields, ranging from facial recognition to image analysis. In the cybersecurity domain, AI-based techniques can provide better cyber defense tools and help adversaries improve methods of attack. However, malicious actors are aware of the new prospects too and will probably attempt to use them for nefarious purposes. This survey paper aims at providing an overview of how artificial intelligence can be used in the context of cybersecurity in both offense and defense.

Web of Science, 123, art. no. 41

Multidisciplinary Digital Publishing Institute

DSpace at VSB Technical University of Ostrava

Evolutionary Computation, Optimization and Learning Algorithms for Data Science

Author: A Agrawal
A Imteaj
C Blum
D Karaboga
D Wunsch
D Zhang
E Hancer
F Harfouchi
F Zabihi
F Zhuang
FG Mohammadi
H Shi
H Wang
H Yoshida
I Rahman
ILS Russo
J Kennedy
J Pierezan
J Yang
JR Koza
K Ahmed
K Chen
K Socha
LJ Fogel
MA Abido
MH Amini
MJLF Cruyff
MM Kabir
N Altman
NP Patel
P Marrow
P Moscato
R Balamurugan
R Vanaja
S Jiang
S Mirjalili
SF Razavi
SL Gupta
SN Karpagam
T Bäck
U Khurana
V Rostami
W Shi
Wai Keen Vong
X Meng
X-B Meng
X-L Li
X-Y Liu
Y Cao
Y Chen
Y Xue
Y Zhang
Z-F Hao
Publication venue: FIU Digital Commons
Publication date: 01/08/2019

A large number of engineering, science and computational problems have yet to be solved in a computationally efficient way. One of the emerging challenges is how evolving technologies grow towards autonomy and intelligent decision making. This leads to the collection of large amounts of data from various sensing and measurement technologies, e.g., cameras, smart phones, health sensors, smart electricity meters, and environment sensors. Hence, it is imperative to develop efficient algorithms for generation, analysis, classification, and illustration of data. Meanwhile, data is structured purposefully through different representations, such as large-scale networks and graphs. We focus on data science as a crucial area, specifically on the curse of dimensionality (CoD), which arises from the large amount of generated/sensed/collected data. This motivates researchers to think about optimization and to apply nature-inspired algorithms, such as evolutionary algorithms (EAs), to solve optimization problems. Although these algorithms look non-deterministic, they are robust enough to reach an optimal solution. Researchers do not adopt evolutionary algorithms unless they face a problem that tends to get trapped in a locally optimal solution rather than reaching the globally optimal one. In this chapter, we first develop a clear and formal definition of the CoD problem, next we focus on feature extraction techniques and categories, and then we provide a general overview of meta-heuristic algorithms, their terminology, and desirable properties of evolutionary algorithms.

arXiv.org e-Print Archive

Crossref

DigitalCommons@Florida International University

Applications of Nature-Inspired Algorithms for Dimension Reduction: Enabling Efficient Data Analytics

Author: A Adeli
A Imteaj
A Zhang
AH Hamamoto
AI Hafez
C Yan
E Hancer
F Harfouchi
F Xie
FG Mohammadi
H Peng
H Rao
H Shi
H Wang
J Kaur
J Pierezan
K Ahmed
K Kira
L Ke
M Gong
M Kumari
M Tubishat
MH Amini
MM Kabir
Mohamed Abd Elaziz
MR Mozafar
N Kozodoi
P Moradi
Q Al-Tashi
Q-T Bui
R Hang
R Vanaja
RR Chhikara
S Arora
S Gupta
S Khan
S Roy
S Tabakhi
V Rostami
X-L Li
X-Y Liu
Y Cao
Y Dong
Y Pathak
Y Xue
Y Zhang
Publication venue: FIU Digital Commons
Publication date: 01/08/2019

In [1], we have explored the theoretical aspects of feature selection and evolutionary algorithms. In this chapter, we focus on optimization algorithms for enhancing the data analytic process, i.e., we propose to explore applications of nature-inspired algorithms in data science. Feature selection optimization is a hybrid approach that leverages feature selection techniques and evolutionary algorithms to optimize the selected features. Prior works solve this problem iteratively to converge to an optimal feature subset. Feature selection optimization is a domain-independent approach. Data scientists mainly attempt to find an advanced way to analyze data with high computational efficiency and low time complexity, leading to efficient data analytics. As the amount of generated/measured/sensed data from various sources increases, the effort required for analysis, manipulation and illustration of data grows exponentially. Due to the large-scale data sets, the curse of dimensionality (CoD) is one of the NP-hard problems in data science. Hence, several efforts have focused on leveraging evolutionary algorithms (EAs) to address the complex issues in large-scale data analytics problems. Dimension reduction, together with EAs, lends itself to solving CoD and to solving complex problems efficiently in terms of time complexity. In this chapter, we first provide a brief overview of previous studies that focused on solving CoD using the feature extraction optimization process. We then discuss practical examples of research studies that have successfully tackled application domains such as image processing, sentiment analysis, network traffic/anomaly analysis, credit score analysis and other benchmark function/data set analysis.

arXiv.org e-Print Archive

Crossref

DigitalCommons@Florida International University

Dynamic feature selection for clustering high dimensional data streams

Author: Fahy Conor
Yang Shengxiang
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 18/07/2019

Change in a data stream can occur at the concept level and at the feature level. Change at the feature level can occur if new, additional features appear in the stream or if the importance and relevance of a feature changes as the stream progresses. This type of change has not received as much attention as concept-level change. Furthermore, a lot of the methods proposed for clustering streams (density-based, graph-based, and grid-based) rely on some form of distance as a similarity metric and this is problematic in high-dimensional data where the curse of dimensionality renders distance measurements and any concept of “density” difficult. To address these two challenges we propose combining them and framing the problem as a feature selection problem, specifically a dynamic feature selection problem. We propose a dynamic feature mask for clustering high dimensional data streams. Redundant features are masked and clustering is performed along unmasked, relevant features. If a feature's perceived importance changes, the mask is updated accordingly; previously unimportant features are unmasked and features which lose relevance become masked. The proposed method is algorithm-independent and can be used with any of the existing density-based clustering algorithms which typically do not have a mechanism for dealing with feature drift and struggle with high-dimensional data. We evaluate the proposed method on four density-based clustering algorithms across four high-dimensional streams; two text streams and two image streams. In each case, the proposed dynamic feature mask improves clustering performance and reduces the processing time required by the underlying algorithm. Furthermore, change at the feature level can be observed and tracked.

De Montfort University Open Research Archive
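
The idea of a feature mask that is updated as the stream progresses, described in the abstract above, can be sketched roughly as follows. The abstract does not say how feature importance is measured, so this sketch assumes a simple stand-in: the variance of each feature over a sliding window, with low-variance features masked. The class name DynamicFeatureMask and the parameters window and min_variance are hypothetical and are not taken from the paper.

```python
from collections import deque
from statistics import pvariance


class DynamicFeatureMask:
    """Keep a boolean mask over features; True means the feature is unmasked (in use)."""

    def __init__(self, n_features, window=50, min_variance=0.01):
        self.window = deque(maxlen=window)   # most recent stream points
        self.min_variance = min_variance     # assumed importance threshold
        self.mask = [True] * n_features      # start with every feature unmasked

    def update(self, x):
        """Add a new point and recompute the mask from the current window."""
        self.window.append(list(x))
        if len(self.window) >= 2:
            columns = zip(*self.window)
            # A feature stays unmasked only while it varies enough in the window.
            self.mask = [pvariance(col) >= self.min_variance for col in columns]
        return self.mask

    def apply(self, x):
        """Project a point onto the unmasked features before clustering."""
        return [v for v, keep in zip(x, self.mask) if keep]
```

A clustering algorithm would then receive apply(x) instead of x, so distances are computed only along the currently unmasked features.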

Examining Swarm Intelligence-based Feature Selection for Multi-Label Classification

Author: Abdulazeez Adnan Mohsin
Ahmed Awder
Publication venue: 'Penerbit UTHM'
Publication date: 27/10/2021

Multi-label classification addresses problems in which more than one class label is assigned to each instance. Many real-world multi-label classification tasks are high-dimensional due to digital technologies, leading to reduced performance of traditional multi-label classifiers. Feature selection is a common and successful approach to tackling this problem by retaining relevant features and eliminating redundant ones to reduce dimensionality. Several feature selection methods have been successfully applied in multi-label learning. Most of those methods are wrapper methods that employ a multi-label classifier in their processes. They run a classifier in each step, which incurs a high computational cost, and thus they suffer from scalability issues. To deal with this issue, filter methods are introduced to evaluate the feature subsets using information-theoretic mechanisms instead of running classifiers. This paper aims to provide a comprehensive review of different methods of feature selection presented for the tasks of multi-label classification. To this end, in this review, we have investigated most of the well-known and state-of-the-art methods. We then provided the main characteristics of the existing multi-label feature selection techniques and compared them analytically.

Journals of Universiti Tun Hussein Onn Malaysia (UTHM)
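
The filter approach mentioned in the abstract above, scoring features with an information-theoretic measure instead of running a classifier, can be sketched as follows. This is one simple possibility and not a method taken from the review: each discrete feature is scored by its mutual information with every label column, and the scores are summed across labels. The function names mutual_information and rank_features are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information in bits between two discrete sequences of equal length."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def rank_features(X, Y):
    """Score each feature by its summed MI with all labels; no classifier is trained.

    X: list of rows of discrete feature values.
    Y: list of rows of binary label indicators (multi-label targets).
    Returns (feature indices sorted best first, raw scores).
    """
    n_features, n_labels = len(X[0]), len(Y[0])
    label_cols = [[row[k] for row in Y] for k in range(n_labels)]
    scores = []
    for j in range(n_features):
        col = [row[j] for row in X]
        scores.append(sum(mutual_information(col, lab) for lab in label_cols))
    ranking = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranking, scores
```

Summing per-label scores ignores dependencies between labels and redundancy between features, so it should be read as a baseline illustration of the filter idea only.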