
    Bayesian hyper-parameter optimisation for malware detection

    Malware detection is a major security concern and has been the subject of a great deal of research and development. Machine learning is a natural technology for addressing malware detection, and many researchers have investigated its use. However, the performance of machine learning algorithms often depends significantly on parametric choices, so the question arises as to what parameter choices are optimal. In this paper, we investigate how best to tune the parameters of machine learning algorithms—a process generally known as hyper-parameter optimisation—in the context of malware detection. We examine the effects of some simple (model-free) ways of parameter tuning together with a state-of-the-art Bayesian model-building approach. Our work is carried out using Ember, a major published malware benchmark dataset of Windows Portable Executable (PE) metadata samples, and a smaller dataset from kaggle.com (also comprising Windows PE metadata). We demonstrate that optimal parameter choices may differ significantly from default choices and argue that hyper-parameter optimisation should be adopted as a ‘formal outer loop’ in the research and development of malware detection systems. We also argue that doing so is essential for the development of the discipline since it facilitates a fair comparison of competing machine learning algorithms applied to the malware detection problem.
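
    As an illustration of the ‘formal outer loop’ the abstract advocates, the sketch below wraps Bayesian hyper-parameter optimisation around a gradient-boosted malware classifier. It is a minimal example rather than the paper's experimental setup: Optuna (whose default TPE sampler is a Bayesian model-building method) and LightGBM are assumed, the search space is illustrative, and synthetic data stands in for the PE-metadata features.

        # Minimal sketch: Bayesian hyper-parameter optimisation as an outer
        # loop around a malware classifier. Assumes Optuna and LightGBM;
        # synthetic data stands in for PE-metadata feature vectors.
        import optuna
        from lightgbm import LGBMClassifier
        from sklearn.datasets import make_classification
        from sklearn.model_selection import cross_val_score

        X, y = make_classification(n_samples=2000, n_features=50, random_state=0)

        def objective(trial):
            # Illustrative search space; these ranges are not the paper's.
            params = {
                "num_leaves": trial.suggest_int("num_leaves", 16, 512),
                "learning_rate": trial.suggest_float("learning_rate", 1e-3, 0.3, log=True),
                "n_estimators": trial.suggest_int("n_estimators", 100, 1000),
                "min_child_samples": trial.suggest_int("min_child_samples", 5, 100),
            }
            model = LGBMClassifier(**params)
            # Mean 3-fold cross-validated accuracy is the objective to maximise.
            return cross_val_score(model, X, y, cv=3, scoring="accuracy").mean()

        study = optuna.create_study(direction="maximize")  # TPE sampler by default
        study.optimize(objective, n_trials=50)
        print(study.best_params)  # tuned choices often differ from the defaults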

    A Framework for Vector-Weighted Deep Neural Networks

    The vast majority of advances in deep neural network research operate on the basis of a real-valued weight space. Recent work in alternative spaces has challenged and complemented this idea; for instance, the use of complex- or binary-valued weights has yielded promising and fascinating results. We propose a framework for a novel weight space consisting of vector values which we christen VectorNet. We first develop the theoretical foundations of our proposed approach, including formalizing the requisite theory for forward and backpropagating values in a vector-weighted layer. We also introduce the concept of expansion and aggregation functions for conversion between real and vector values. These contributions enable the seamless integration of vector-weighted layers with conventional layers, resulting in network architectures exhibiting height in addition to width and depth, and consequently models which we might be inclined to call tall learning. As a means of evaluating its effect on model performance, we apply our framework on top of three neural network architectural families—the multilayer perceptron (MLP), convolutional neural network (CNN), and directed acyclic graph neural network (DAG-NN)—trained over multiple classic machine learning and image classification benchmarks. We also consider evolutionary algorithms for performing neural architecture search over the new hyperparameters introduced by our framework. Lastly, we solidify the case for the utility of our contributions by implementing our approach on real-world data in the domains of mental illness diagnosis and static malware detection, achieving state-of-the-art results in both. Our implementations are made publicly available to drive further investigation into the exciting potential of VectorNet.
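
    To make the vector-weighted idea concrete, here is a minimal NumPy sketch of a single such layer. It assumes a simple choice of expansion (replication across channels) and aggregation (channel mean); it illustrates the concept rather than the paper's exact formulation, and the names expand, aggregate, and vector_layer are hypothetical.

        # Minimal sketch of a vector-weighted layer: each weight is a
        # length-k vector, realised here as k parallel weight matrices.
        import numpy as np

        k = 4  # vector dimension: the network's "height"

        def expand(x):
            # Expansion: replicate a real-valued input across k channels.
            # Shape (n_in,) -> (k, n_in).
            return np.tile(x, (k, 1))

        def aggregate(z):
            # Aggregation: collapse the k channels back to real values,
            # here by averaging. Shape (k, n_out) -> (n_out,).
            return z.mean(axis=0)

        def vector_layer(x, W, b):
            # W: (k, n_out, n_in), one weight matrix per channel;
            # b: (k, n_out). The output is real-valued, so the layer
            # stacks seamlessly with conventional layers.
            zk = np.einsum("koi,ki->ko", W, expand(x)) + b
            return aggregate(np.tanh(zk))

        rng = np.random.default_rng(0)
        x = rng.normal(size=8)
        W = 0.1 * rng.normal(size=(k, 3, 8))
        b = 0.1 * rng.normal(size=(k, 3))
        print(vector_layer(x, W, b))  # real-valued output, shape (3,)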