Sparse neural networks with large learning diversity

Author: Berrou Claude
Gripon Vincent
Publication date: 21/02/2011

Coded recurrent neural networks with three levels of sparsity are introduced. The first level is related to the size of messages, which is much smaller than the number of available neurons. The second is provided by a particular coding rule, acting as a local constraint on the neural activity. The third is a characteristic of the low final connection density of the network after the learning phase. Though the proposed network is very simple, since it is based on binary neurons and binary connections, it is able to learn a large number of messages and recall them, even in the presence of strong erasures. The performance of the network is assessed as a classifier and as an associative memory.

arXiv.org e-Print Archive

HAL-Université de Bretagne Occidentale

HAL Descartes

FSL-BM: Fuzzy Supervised Learning with Binary Meta-Feature for Classification

Author: CP Chen
G Qin
J Cargile
J West
JA Evans
JC Bezdek
K Kowsari
L Bahl
M Russo
MJ Prabu
R Vilalta
R Wieland
RAR Ashfaq
S-S Choi
X Jiang
X Qiu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 15/11/2017

This paper introduces a novel real-time Fuzzy Supervised Learning with Binary Meta-Feature (FSL-BM) algorithm for big data classification tasks. The study of real-time algorithms addresses several major concerns, namely accuracy, memory consumption, the ability to stretch assumptions, and time complexity. Attaining a fast computational model that provides fuzzy logic and supervised learning is one of the main challenges in machine learning. In this research paper, we present the FSL-BM algorithm as an efficient solution for supervised learning with fuzzy logic processing, using a binary meta-feature representation together with Hamming distance and a hash function to relax assumptions. While many studies have focused on reducing time complexity and increasing accuracy during the last decade, the novel contribution of the proposed solution comes through the integration of Hamming distance, a hash function, binary meta-features, and binary classification to provide a real-time supervised method. The Hash Table (HT) component gives fast access to existing indices and, therefore, the generation of new indices in constant time complexity, which supersedes existing fuzzy supervised algorithms with better or comparable results. To summarize, the main contribution of this technique for real-time fuzzy supervised learning is to represent hypotheses through binary input as a meta-feature space and to create the Fuzzy Supervised Hash Table to train and validate the model. Comment: FICC201
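The abstract above combines binary meta-features, a hash table, and Hamming distance. The following is only a minimal sketch of how those three pieces can fit together, not the authors' FSL-BM algorithm. The class name `HashTableClassifier`, the helpers `to_key` and `hamming`, and the `max_distance` parameter are all hypothetical names introduced for illustration.

```python
from collections import Counter, defaultdict


def to_key(bits):
    """Pack a sequence of 0/1 features into a single integer key."""
    key = 0
    for b in bits:
        key = (key << 1) | (1 if b else 0)
    return key


def hamming(a, b):
    """Count the bit positions in which two integer keys differ."""
    return bin(a ^ b).count("1")


class HashTableClassifier:
    """Hypothetical illustration: label counts stored per binary key."""

    def __init__(self, max_distance=1):
        self.table = defaultdict(Counter)  # key -> Counter of labels
        self.max_distance = max_distance   # assumed fuzzy-match tolerance

    def fit(self, rows, labels):
        for bits, label in zip(rows, labels):
            self.table[to_key(bits)][label] += 1
        return self

    def predict(self, bits):
        key = to_key(bits)
        # Exact match: an average constant-time dictionary lookup.
        if key in self.table:
            return self.table[key].most_common(1)[0][0]
        # Fuzzy fallback: nearest stored key by Hamming distance.
        best = min(self.table, key=lambda k: hamming(k, key), default=None)
        if best is None or hamming(best, key) > self.max_distance:
            return None
        return self.table[best].most_common(1)[0][0]


clf = HashTableClassifier(max_distance=1).fit(
    [[1, 0, 1], [0, 0, 0]], ["a", "b"]
)
exact = clf.predict([1, 0, 1])   # key is stored, direct lookup
near = clf.predict([1, 1, 1])    # one bit away from the stored key 101
far = clf.predict([0, 1, 1])     # two bits away from both stored keys
```

In this sketch the fallback scans every stored key, so only the exact-match path is constant time; a real system would need an indexing scheme for the fuzzy case.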

arXiv.org e-Print Archive

Crossref

Dynamic Data Mining: Methodology and Algorithms

Author: Deng Xiong
Publication venue: Computing, Imperial College London
Publication date: 01/07/2011

Supervised data stream mining has become an important and challenging data mining task in modern organizations. The key challenges are threefold: (1) a possibly infinite number of streaming examples and time-critical analysis constraints; (2) concept drift; and (3) skewed data distributions. To address these three challenges, this thesis proposes the novel dynamic data mining (DDM) methodology by effectively applying supervised ensemble models to data stream mining. DDM can be loosely defined as categorization-organization-selection of supervised ensemble models. It is inspired by the idea that although the underlying concepts in a data stream are time-varying, their distinctions can be identified. Therefore, the models trained on the distinct concepts can be dynamically selected in order to classify incoming examples of similar concepts.

First, following the general paradigm of DDM, we examine the different concept-drifting stream mining scenarios and propose corresponding effective and efficient data mining algorithms.
• To address concept drift caused merely by changes of variable distributions, which we term pseudo concept drift, base models built on categorized streaming data are organized and selected in line with their corresponding variable distribution characteristics.
• To address concept drift caused by changes of variable and class joint distributions, which we term true concept drift, an effective data categorization scheme is introduced. A group of working models is dynamically organized and selected for reacting to the drifting concept.

Secondly, we introduce an integration stream mining framework, enabling the paradigm advocated by DDM to be widely applicable to other stream mining problems. With it, we are able to easily introduce six effective algorithms for mining data streams with skewed class distributions. In addition, we also introduce a new ensemble model approach for batch learning, following the same methodology. Both theoretical and empirical studies demonstrate its effectiveness. Future work will target improving the effectiveness and efficiency of the proposed algorithms. Meanwhile, we will explore the possibilities of using the integration framework to solve other open stream mining research problems.

Spiral - Imperial College Digital Repository

AI for Cybersecurity: Robust models for Authentication, Threat and Anomaly Detection

Author: [Olga Tushkanova et al.]
Bergadano Francesco
Giacinto Giorgio
Publication venue: place:Basel
Publication date: 01/01/2023

Archivio istituzionale della ricerca - Università di Cagliari

Dos and Don'ts of Machine Learning in Computer Security

Author: Arp Daniel
Cavallaro Lorenzo
Pendlebury Feargus
Pierazzi Fabio
Quiring Erwin
Rieck Konrad
Warnecke Alexander
Wressnegger Christian
Publication date: 30/11/2021

With the growing processing power of computing systems and the increasing availability of massive datasets, machine learning algorithms have led to major breakthroughs in many different areas. This development has influenced computer security, spawning a series of works on learning-based security systems, such as for malware detection, vulnerability discovery, and binary code analysis. Despite great potential, machine learning in security is prone to subtle pitfalls that undermine its performance and render learning-based systems potentially unsuitable for security tasks and practical deployment. In this paper, we look at this problem with critical eyes. First, we identify common pitfalls in the design, implementation, and evaluation of learning-based security systems. We conduct a study of 30 papers from top-tier security conferences within the past 10 years, confirming that these pitfalls are widespread in the current security literature. In an empirical analysis, we further demonstrate how individual pitfalls can lead to unrealistic performance and interpretations, obstructing the understanding of the security problem at hand. As a remedy, we propose actionable recommendations to support researchers in avoiding or mitigating the pitfalls where possible. Furthermore, we identify open problems when applying machine learning in security and provide directions for further research. Comment: to appear at USENIX Security Symposium 202

arXiv.org e-Print Archive

UCL Discovery

A Machine Learning Approach for Intrusion Detection

Author: Abohaikel Amir Sadiq
Tokheim Erik
Publication venue: 'University of Agder'
Publication date: 01/01/2020

Master's thesis in Information- and communication technology (IKT590). Securing networks and their confidentiality from intrusions is crucial, and for this reason, Intrusion Detection Systems have to be employed. The main goal of this thesis is to achieve a proper detection performance of a Network Intrusion Detection System (NIDS). In this thesis, we have examined the detection efficiency of machine learning algorithms such as Neural Network, Convolutional Neural Network, Random Forest and Long Short-Term Memory. We have constructed our models so that they can detect different types of attacks utilizing the CICIDS2017 dataset. We have worked on identifying 15 various attacks present in CICIDS2017, instead of merely identifying normal-abnormal traffic. We have also discussed why precisely this dataset should be used, and why one should classify by attack to enhance the detection. Previous works based on benchmark datasets such as NSL-KDD and KDD99 are discussed, along with how to address and solve their issues. The thesis also shows how the results are affected by using different machine learning algorithms. As the research will demonstrate, the Neural Network, Convolutional Neural Network, Random Forest and Long Short-Term Memory are evaluated by conducting cross-validation; the average scores across five folds for these models are 92.30%, 87.73%, 94.42% and 87.94%, respectively. Nevertheless, the confusion matrix was also a crucial measurement to evaluate the models, as we shall see. Keywords: Information security, NIDS, Machine Learning, Neural Network, Convolutional Neural Network, Random Forest, Long Short-Term Memory, CICIDS2017
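The abstract above reports an average score across five cross-validation folds for each model. As a rough sketch of how such an average can be computed (not the thesis's code, models, or data), the example below splits the samples into five contiguous folds without shuffling and averages the per-fold accuracy. The functions `k_fold_indices`, `cross_val_accuracy`, `fit_majority`, and `predict_majority`, and the toy labels, are all hypothetical.

```python
from collections import Counter
from statistics import mean


def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) for k contiguous, unshuffled folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def cross_val_accuracy(fit, predict, X, y, k=5):
    """Return (mean accuracy, per-fold accuracies) over k folds."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        hits = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return mean(scores), scores


# Toy stand-in model (assumption): always predict the majority training label.
def fit_majority(X_train, y_train):
    return Counter(y_train).most_common(1)[0][0]


def predict_majority(model, x):
    return model


X = list(range(10))                  # placeholder features
y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]   # placeholder labels, deliberately skewed
avg, per_fold = cross_val_accuracy(fit_majority, predict_majority, X, y, k=5)
```

With these skewed toy labels the last fold scores far below the others, which illustrates why a single averaged figure can hide per-class behavior and why the abstract also points to the confusion matrix.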

Agder University Research Archive