
1,337 research outputs found

Recommended from our members

Accelerating Cyber-Breach Investigations through Novel use of Artificial Immune System Algorithms

Author: Donnachie Benjamin
Hopgood Adrian
Kennedy Ian
Verrall Jason
Wong Patrick
Publication date: 01/01/2022

The use of artificial immune systems for investigation of cyber-security breaches is presented. Manual reviews of disk images are impractical because of the size of the dataset. Machine-learning algorithms for detection of misuse require labelled training data, which are generally unavailable. They are also necessarily retrospective, so they are unlikely to detect new forms of intrusion. For those reasons, this article proposes the use of artificial immune systems for unsupervised anomaly detection. Specifically, a deterministic dendritic cell algorithm (dDCA) has been implemented that has successfully detected automated SQL injection attacks from sample disk images. In a comparison, it outperformed an unsupervised k-means clustering algorithm. However, many significant anomalies were not detected, so further work is required to refine the algorithm using more extensive datasets, and to encode complementary expert knowledge.
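To make the dDCA idea concrete, here is a minimal sketch in Python. It assumes the commonly published dDCA signal weights (csm = danger + safe, k = danger - 2 * safe); the abstract does not say how the article derives signals from disk images, so the event tuples, the thresholds and every name below are hypothetical.

```python
from collections import defaultdict


class Cell:
    """One dendritic cell with a fixed lifespan threshold (hypothetical helper)."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.reset()

    def reset(self):
        self.lifespan = self.threshold  # remaining budget of csm signal
        self.k = 0.0                    # accumulated context value
        self.antigens = []              # antigens sampled since the last reset


def ddca(events, thresholds):
    """Score antigens from a stream of (antigen, danger, safe) events.

    Returns {antigen: mean k over all presentations}. A positive score
    suggests an anomalous context, a negative score a normal one.
    """
    cells = [Cell(t) for t in thresholds]
    k_sum = defaultdict(float)
    count = defaultdict(int)

    def present(cell):
        # The cell "migrates": every sampled antigen is logged with the cell's k.
        for antigen in cell.antigens:
            k_sum[antigen] += cell.k
            count[antigen] += 1
        cell.reset()

    for i, (antigen, danger, safe) in enumerate(events):
        # Each antigen is sampled by exactly one cell (round robin).
        cells[i % len(cells)].antigens.append(antigen)
        csm = danger + safe        # costimulation: consumes lifespan
        k = danger - 2.0 * safe    # context: safe signal counts double
        for cell in cells:         # every cell sees every signal update
            cell.lifespan -= csm
            cell.k += k
            if cell.lifespan <= 0:
                present(cell)

    for cell in cells:             # flush cells that still hold antigens
        if cell.antigens:
            present(cell)

    return {antigen: k_sum[antigen] / count[antigen] for antigen in count}
```

Calling `ddca(events, thresholds=[2.0, 3.0])` on a stream where one antigen type co-occurs with danger signals and another with safe signals should give the first a positive score and the second a negative one. Using several different thresholds gives the cells different time windows over the same stream.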

Open Research Online (The Open University)

Neuromodulatory Supervised Learning

Author: Finnis James
Publication date: 01/01/2020

Aberystwyth Research Portal

TimeCluster: dimension reduction applied to temporal data for visual analytics

Author: Ali Mohammed
Jones Mark W.
Williams Mark
Xie Xianghua
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

With the increase of temporal data, there is a growing need for advanced solutions that help users understand such data, observe its changes over time, find repeated patterns, detect outliers, and effectively label data instances in long time-series data. Although these tasks are quite distinct, and are usually tackled separately, we present an interactive visual analytics system and approach that can address these issues in a single system. It enables users to visualize, understand and explore univariate or multivariate long time-series data in one image using a connected scatter plot. It supports interactive analysis and exploration for pattern discovery and outlier detection. Different dimensionality reduction techniques are used and compared in our system. Because of its power of extracting features, deep learning is used for multivariate time-series along with 2D reduction techniques for rapid and easy interpretation and interaction with large amounts of time-series data. We deploy our system with different time-series datasets and report two real-world case studies that are used to evaluate our system.
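The idea of showing a long time series as one connected scatter plot can be sketched as follows: cut the series into overlapping windows, then project each window to 2D. The abstract does not say which reduction techniques the system compares, so plain PCA (computed here by power iteration, with no external libraries) is an assumption, as are all function names and parameters.

```python
import math


def sliding_windows(series, width, stride=1):
    """Cut a univariate series into overlapping windows of equal width."""
    return [series[i:i + width] for i in range(0, len(series) - width + 1, stride)]


def _power_iteration(cov, previous, iters=300):
    """Leading eigenvector of cov, kept orthogonal to the vectors in `previous`."""
    d = len(cov)
    v = [1.0 / (i + 1) for i in range(d)]  # fixed, deterministic start vector
    for _ in range(iters):
        v = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        for p in previous:  # Gram-Schmidt step against earlier components
            dot = sum(a * b for a, b in zip(v, p))
            v = [a - dot * b for a, b in zip(v, p)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm < 1e-12:
            return [0.0] * d  # no variance left in any remaining direction
        v = [x / norm for x in v]
    return v


def pca_2d(rows):
    """Project equal-length rows onto their first two principal components."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(d)]
           for i in range(d)]
    components = []
    for _ in range(2):
        components.append(_power_iteration(cov, components))
    return [tuple(sum(c[j] * v[j] for j in range(d)) for v in components)
            for c in centered]
```

Plotting the points returned by `pca_2d(sliding_windows(series, width))` in order and joining consecutive points with a line gives a connected scatter plot. Windows with similar shape land close together, so repeated patterns tend to appear as overlapping loops and unusual windows as excursions away from them.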

University of South Wales Research Explorer

Cronfa at Swansea University

An overview of population-based algorithms for multi-objective optimisation

Author: Ioannis Giagkiozis
Robin C. Purshouse
Peter J. Fleming
Publication venue: 'Informa UK Limited'
Publication date: 20/08/2013

In this work we present an overview of the most prominent population-based algorithms and the methodologies used to extend them to multiple objective problems. Although they are not exact in the mathematical sense, population-based multi-objective optimisation techniques have long been recognised as immensely valuable and versatile for real-world applications. These techniques are usually employed when exact optimisation methods are not easily applicable or simply when, owing to sheer complexity, exact methods could be very costly. Another advantage is that, since a population of decision vectors is considered in each generation, these algorithms are implicitly parallelisable and can generate an approximation of the entire Pareto front at each iteration. A critique of their capabilities is also provided.
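The notion of approximating the Pareto front from a population can be illustrated with a minimal sketch that keeps only the non-dominated members of a population. Minimisation of every objective is assumed, and the function names are illustrative rather than taken from the paper.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimised).

    a dominates b when it is no worse in every objective and strictly
    better in at least one.
    """
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated(population):
    """Return the members of the population that no other member dominates."""
    return [p for p in population
            if not any(dominates(q, p) for q in population)]
```

Applied to the objective vectors of one generation, `non_dominated` yields that generation's approximation of the Pareto front. This quadratic-time filter is only the simplest form of the idea.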

Crossref

White Rose Research Online

Unsupervised Machine Learning Algorithms to Characterize Single-Cell Heterogeneity and Perturbation Response

Author: Burkhardt Daniel Bernard
Publication venue: EliScholar – A Digital Platform for Scholarly Publishing at Yale
Publication date: 01/04/2021

Recent advances in microfluidic technologies facilitate the measurement of gene expression, DNA accessibility, protein content, or genomic mutations at unprecedented scale. The challenges imposed by the scale of these datasets are further exacerbated by non-linearity in molecular effects, complex interdependencies between features, and a lack of understanding of both data generating processes and sources of technical and biological noise. As a result, analysis of modern single-cell data requires the development of specialized computational tools. One solution to these problems is the use of manifold learning, a sub-field of unsupervised machine learning that seeks to model data geometry using a simplifying assumption that the underlying system is continuous and locally Euclidean. In this dissertation, I show how manifold learning is naturally suited for single-cell analysis and introduce three related algorithms for characterization of single-cell heterogeneity and perturbation response. I first describe Vertex Frequency Clustering, an algorithm that identifies groups of cells with similar responses to an experimental perturbation by analyzing the spectral representation of condition labels expressed as signals over a cell similarity graph. Next, I introduce MELD, an algorithm that expands on these ideas to estimate the density of each experimental sample over the graph to quantify the effect of an experimental perturbation at single-cell resolution. Finally, I describe a neural network for archetypal analysis that represents the data as continuously distributed between a set of extrema. Each of these algorithms is demonstrated on a combination of real and synthetic datasets and is benchmarked against state-of-the-art algorithms.

Yale University

Big Data Security (Volume 3)

Publication venue: 'Walter de Gruyter GmbH'
Publication date: 10/02/2021

After a short description of the key concepts of big data, the book explores the secrecy and security threats posed especially by cloud-based data storage. It delivers conceptual frameworks and models along with case studies of recent technology.

Directory of Open Access Books (DOAB)

Spatio-temporal traffic anomaly detection for urban networks

Author: Zhu Lin
Publication venue: Civil and Environmental Engineering, Imperial College London
Publication date: 01/01/2020

Urban road networks are often affected by disruptions such as accidents and roadworks, giving rise to congestion and delays, which can, in turn, create a wide range of negative impacts on the economy, environment, safety and security. Accurate detection of the onset of traffic anomalies, specifically Recurrent Congestion (RC) and Nonrecurrent Congestion (NRC) in the traffic networks, is an important ITS function to facilitate proactive intervention measures to reduce the level of severity of congestion. A substantial body of literature is dedicated to models with varying levels of complexity that attempt to identify such anomalies. Given the complexity of the problem, however, very little effort has been dedicated to the development of methods that attempt to detect traffic anomalies using spatio-temporal features. Driven both by the recent advances in deep learning techniques and the development of Traffic Incident Management Systems (TIMS), the aim of this research is to develop novel traffic anomaly detection models that can incorporate both spatial and temporal traffic information to detect traffic anomalies at a network level. This thesis first reviews the state of the art in traffic anomaly detection techniques, including the existing methods and emerging machine learning and deep learning methods, before identifying the gaps in the current understanding of traffic anomaly and its detection. One of the problems in terms of adapting the deep learning models to traffic anomaly detection is the translation of time series traffic data from multiple locations to the format necessary for the deep learning model to learn the spatial and temporal features effectively.
To address this challenging problem and build a systematic traffic anomaly detection method at a network level, this thesis proposes a methodological framework consisting of (a) the translation layer (which is designed to translate the time series traffic data from multiple locations over the road network into a desired format with spatial and temporal features), (b) detection methods and (c) localisation. This methodological framework is subsequently tested for early RC detection and NRC detection. Three translation layers including connectivity matrix, geographical grid translation and spatial temporal translation are presented and evaluated for both RC and NRC detection. The early RC detection approach is a deep learning based method that combines Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM). The NRC detection, on the other hand, involves only the application of the CNN. The performance of the proposed approach is compared against other conventional congestion detection methods, using a comprehensive evaluation framework that includes metrics such as detection rates and false positive rates, and the sensitivity analysis of time windows as well as prediction horizons. The conventional congestion detection methods used for the comparison include Multilayer Perceptron, Random Forest and Gradient Boost Classifier, all of which are commonly used in the literature. Real-world traffic data from the City of Bath are used for the comparative analysis of RC, while traffic data in conjunction with incident data extracted from Central London are used for NRC detection. The results show that while the connectivity matrix may be capable of extracting features of a small network, the increased sparsity in the matrix in a large network reduces its effectiveness in feature learning compared to geographical grid translation. 
The results also indicate that the proposed deep learning method demonstrates superior detection accuracy compared to alternative methods and that it can detect recurrent congestion as early as one hour ahead with acceptable accuracy. The proposed method is capable of being implemented within a real-world ITS system making use of traffic sensor data, thereby providing a practically useful tool for road network managers to manage traffic proactively. In addition, the results demonstrate that a deep learning-based approach may improve the accuracy of incident detection and locate traffic anomalies precisely, especially in a large urban network. Finally, the framework is further tested for robustness in terms of network topology, sensor faults and missing data. The robustness analysis demonstrates that the proposed traffic anomaly detection approaches are transferable to different sizes of road networks, and that they are robust in the presence of sensor faults and missing data.
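As an illustration of what a geographical grid translation layer might look like, the sketch below bins sensor time series into a sequence of 2D frames that a CNN could consume. The abstract does not specify the thesis's actual translation, so the binning rule (mean of the sensors in a cell, 0.0 for empty cells) and all names and parameters here are assumptions.

```python
def cell_of(x, y, bounds, shape):
    """Map a coordinate to a (row, col) grid cell; points on the upper
    edges of the bounding box fall into the last row or column."""
    xmin, ymin, xmax, ymax = bounds
    rows, cols = shape
    col = min(int((x - xmin) / (xmax - xmin) * cols), cols - 1)
    row = min(int((y - ymin) / (ymax - ymin) * rows), rows - 1)
    return row, col


def grid_translate(readings, positions, bounds, shape):
    """Turn per-sensor time series into one grid frame per time step.

    readings:  {sensor_id: [value at t0, value at t1, ...]}, equal lengths
    positions: {sensor_id: (x, y)}
    bounds:    (xmin, ymin, xmax, ymax) of the study area
    shape:     (rows, cols) of the grid
    Returns frames[t][row][col]: mean reading of the sensors in that cell,
    or 0.0 where the cell contains no sensor.
    """
    rows, cols = shape
    steps = len(next(iter(readings.values())))
    frames = []
    for t in range(steps):
        total = [[0.0] * cols for _ in range(rows)]
        count = [[0] * cols for _ in range(rows)]
        for sensor_id, series in readings.items():
            r, c = cell_of(*positions[sensor_id], bounds, shape)
            total[r][c] += series[t]
            count[r][c] += 1
        frames.append([[total[r][c] / count[r][c] if count[r][c] else 0.0
                        for c in range(cols)] for r in range(rows)])
    return frames
```

Unlike a connectivity matrix, whose size grows with the square of the number of sensors, a grid of this kind keeps a fixed frame size regardless of how many sensors the network contains. Filling empty cells with 0.0 is a simplification, since it cannot distinguish "no sensor" from "zero flow".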

Spiral - Imperial College Digital Repository

Personality Identification from Social Media Using Deep Learning: A Review

Author: Anitha S. Pillai
Giuliana Guazzaroni
S. Bhavya
Publication date: 28/11/2019

Social media helps people scattered around the world share ideas and information, and thus helps create communities, groups, and virtual networks. Identification of personality is significant in many types of applications, such as detecting the mental state or character of a person, predicting job satisfaction and professional and personal relationship success, and building recommendation systems. Personality is also an important factor in determining individual variation in thoughts, feelings, and conduct systems. According to the Global social media research survey of 2018, there are approximately 3.196 billion social media users worldwide. The numbers are estimated to grow rapidly with the use of mobile smart devices and advances in technology. Support vector machine (SVM), Naive Bayes (NB), multilayer perceptron neural network, and convolutional neural network (CNN) are some of the machine learning techniques used for personality identification in the literature. This paper reviews recent studies that use machine learning approaches to predict the personality of online social media (OSM) users.

Crossref

Open Access Repository