2,114 research outputs found
Recommended from our members
MapReduce based RDF assisted distributed SVM for high throughput spam filtering
This thesis was submitted for the degree of Doctor of Philosophy and was awarded by Brunel UniversityElectronic mail has become cast and embedded in our everyday lives. Billions of legitimate emails are sent on a daily basis. The widely established underlying infrastructure, its widespread availability as well as its ease of use have all acted as catalysts to such pervasive proliferation. Unfortunately, the same can be alleged about unsolicited bulk email, or rather spam. Various methods, as well as enabling architectures are available to try to mitigate spam permeation. In this respect, this dissertation compliments existing survey work in this area by contributing an extensive literature review of traditional and emerging spam filtering approaches. Techniques, approaches and architectures employed for spam filtering are appraised, critically assessing respective strengths and weaknesses.
Velocity, volume and variety are key characteristics of the spam challenge. MapReduce (M/R) has become increasingly popular as an Internet scale, data intensive processing platform. In the context of machine learning based spam filter training, support vector machine (SVM) based techniques have been proven effective. SVM training is however a computationally intensive process. In this dissertation, a M/R based distributed SVM algorithm for scalable spam filter training, designated MRSMO, is presented. By distributing and processing subsets of the training data across multiple participating computing nodes, the distributed SVM reduces spam filter training time significantly. To mitigate the accuracy degradation introduced by the adopted approach, a Resource Description Framework (RDF) based feedback loop is evaluated. Experimental results demonstrate that this improves the accuracy levels of the distributed SVM beyond the original sequential counterpart.
Effectively exploiting large scale, ‘Cloud’ based, heterogeneous processing capabilities for M/R in what can be considered a non-deterministic environment requires the consideration of a number of perspectives. In this work, gSched, a Hadoop M/R based, heterogeneous aware task to node matching and allocation scheme is designed. Using MRSMO as a baseline, experimental evaluation indicates that gSched improves on the performance of the out-of-the box Hadoop counterpart in a typical Cloud based infrastructure.
The focal contribution to knowledge is a scalable, heterogeneous infrastructure and machine learning based spam filtering scheme, able to capitalize on collaborative accuracy improvements through RDF based, end user feedback. MapReduce based RDF Assisted Distributed SVM for High Throughput Spam Filterin
Federated Learning for Medical Applications: A Taxonomy, Current Trends, Challenges, and Future Research Directions
With the advent of the IoT, AI, ML, and DL algorithms, the landscape of
data-driven medical applications has emerged as a promising avenue for
designing robust and scalable diagnostic and prognostic models from medical
data. This has gained a lot of attention from both academia and industry,
leading to significant improvements in healthcare quality. However, the
adoption of AI-driven medical applications still faces tough challenges,
including meeting security, privacy, and quality of service (QoS) standards.
Recent developments in \ac{FL} have made it possible to train complex
machine-learned models in a distributed manner and have become an active
research domain, particularly processing the medical data at the edge of the
network in a decentralized way to preserve privacy and address security
concerns. To this end, in this paper, we explore the present and future of FL
technology in medical applications where data sharing is a significant
challenge. We delve into the current research trends and their outcomes,
unravelling the complexities of designing reliable and scalable \ac{FL} models.
Our paper outlines the fundamental statistical issues in FL, tackles
device-related problems, addresses security challenges, and navigates the
complexity of privacy concerns, all while highlighting its transformative
potential in the medical field. Our study primarily focuses on medical
applications of \ac{FL}, particularly in the context of global cancer
diagnosis. We highlight the potential of FL to enable computer-aided diagnosis
tools that address this challenge with greater effectiveness than traditional
data-driven methods. We hope that this comprehensive review will serve as a
checkpoint for the field, summarizing the current state-of-the-art and
identifying open problems and future research directions.Comment: Accepted at IEEE Internet of Things Journa
Security and blockchain convergence with internet of multimedia things : current trends, research challenges and future directions
The Internet of Multimedia Things (IoMT) orchestration enables the integration of systems, software, cloud, and smart sensors into a single platform. The IoMT deals with scalar as well as multimedia data. In these networks, sensor-embedded devices and their data face numerous challenges when it comes to security. In this paper, a comprehensive review of the existing literature for IoMT is presented in the context of security and blockchain. The latest literature on all three aspects of security, i.e., authentication, privacy, and trust is provided to explore the challenges experienced by multimedia data. The convergence of blockchain and IoMT along with multimedia-enabled blockchain platforms are discussed for emerging applications. To highlight the significance of this survey, large-scale commercial projects focused on security and blockchain for multimedia applications are reviewed. The shortcomings of these projects are explored and suggestions for further improvement are provided. Based on the aforementioned discussion, we present our own case study for healthcare industry: a theoretical framework having security and blockchain as key enablers. The case study reflects the importance of security and blockchain in multimedia applications of healthcare sector. Finally, we discuss the convergence of emerging technologies with security, blockchain and IoMT to visualize the future of tomorrow's applications. © 2020 Elsevier Lt
A Survey on Enterprise Network Security: Asset Behavioral Monitoring and Distributed Attack Detection
Enterprise networks that host valuable assets and services are popular and
frequent targets of distributed network attacks. In order to cope with the
ever-increasing threats, industrial and research communities develop systems
and methods to monitor the behaviors of their assets and protect them from
critical attacks. In this paper, we systematically survey related research
articles and industrial systems to highlight the current status of this arms
race in enterprise network security. First, we discuss the taxonomy of
distributed network attacks on enterprise assets, including distributed
denial-of-service (DDoS) and reconnaissance attacks. Second, we review existing
methods in monitoring and classifying network behavior of enterprise hosts to
verify their benign activities and isolate potential anomalies. Third,
state-of-the-art detection methods for distributed network attacks sourced from
external attackers are elaborated, highlighting their merits and bottlenecks.
Fourth, as programmable networks and machine learning (ML) techniques are
increasingly becoming adopted by the community, their current applications in
network security are discussed. Finally, we highlight several research gaps on
enterprise network security to inspire future research.Comment: Journal paper submitted to Elseive
Maximizing Service Reliability in Distributed Computing Systems with Random Node Failures: Theory and Implementation
In distributed computing systems (DCSs) where server nodes can fail permanently with nonzero probability, the system performance can be assessed by means of the service reliability, defined as the probability of serving all the tasks queued in the DCS before all the nodes fail. This paper presents a rigorous probabilistic framework to analytically characterize the service reliability of a DCS in the presence of communication uncertainties and stochastic topological changes due to node deletions. The framework considers a system composed of heterogeneous nodes with stochastic service and failure times and a communication network imposing random tangible delays. The framework also permits arbitrarily specified, distributed load-balancing actions to be taken by the individual nodes in order to improve the service reliability. The presented analysis is based upon a novel use of the concept of stochastic regeneration, which is exploited to derive a system of difference-differential equations characterizing the service reliability. The theory is further utilized to optimize certain load-balancing policies for maximal service reliability; the optimization is carried out by means of an algorithm that scales linearly with the number of nodes in the system. The analytical model is validated using both Monte Carlo simulations and experimental data collected from a DCS testbed
- …