Search CORE

2,114 research outputs found

Recommended from our members

MapReduce based RDF assisted distributed SVM for high throughput spam filtering

Author: Caruana Godwin
Publication venue: Brunel University School of Engineering and Design PhD Theses
Publication date: 01/01/2013
Field of study

This thesis was submitted for the degree of Doctor of Philosophy and was awarded by Brunel UniversityElectronic mail has become cast and embedded in our everyday lives. Billions of legitimate emails are sent on a daily basis. The widely established underlying infrastructure, its widespread availability as well as its ease of use have all acted as catalysts to such pervasive proliferation. Unfortunately, the same can be alleged about unsolicited bulk email, or rather spam. Various methods, as well as enabling architectures are available to try to mitigate spam permeation. In this respect, this dissertation compliments existing survey work in this area by contributing an extensive literature review of traditional and emerging spam filtering approaches. Techniques, approaches and architectures employed for spam filtering are appraised, critically assessing respective strengths and weaknesses. Velocity, volume and variety are key characteristics of the spam challenge. MapReduce (M/R) has become increasingly popular as an Internet scale, data intensive processing platform. In the context of machine learning based spam filter training, support vector machine (SVM) based techniques have been proven effective. SVM training is however a computationally intensive process. In this dissertation, a M/R based distributed SVM algorithm for scalable spam filter training, designated MRSMO, is presented. By distributing and processing subsets of the training data across multiple participating computing nodes, the distributed SVM reduces spam filter training time significantly. To mitigate the accuracy degradation introduced by the adopted approach, a Resource Description Framework (RDF) based feedback loop is evaluated. Experimental results demonstrate that this improves the accuracy levels of the distributed SVM beyond the original sequential counterpart. Effectively exploiting large scale, ‘Cloud’ based, heterogeneous processing capabilities for M/R in what can be considered a non-deterministic environment requires the consideration of a number of perspectives. In this work, gSched, a Hadoop M/R based, heterogeneous aware task to node matching and allocation scheme is designed. Using MRSMO as a baseline, experimental evaluation indicates that gSched improves on the performance of the out-of-the box Hadoop counterpart in a typical Cloud based infrastructure. The focal contribution to knowledge is a scalable, heterogeneous infrastructure and machine learning based spam filtering scheme, able to capitalize on collaborative accuracy improvements through RDF based, end user feedback. MapReduce based RDF Assisted Distributed SVM for High Throughput Spam Filterin

Brunel University Research Archive

Legal approaches of the healthgrid technology

Author: Herveg Jean
Poullet Yves
Publication venue: s.n.
Publication date: 01/01/2004
Field of study

Federated Learning for Medical Applications: A Taxonomy, Current Trends, Challenges, and Future Research Directions

Author: Bagci Ulas
Hagos Desta Haileselassie
Håkegård Jan Erik
Jha Debesh
Rauniyar Ashish
Rawat Danda B.
Vlassov Vladimir
Publication venue
Publication date: 29/10/2023
Field of study

With the advent of the IoT, AI, ML, and DL algorithms, the landscape of data-driven medical applications has emerged as a promising avenue for designing robust and scalable diagnostic and prognostic models from medical data. This has gained a lot of attention from both academia and industry, leading to significant improvements in healthcare quality. However, the adoption of AI-driven medical applications still faces tough challenges, including meeting security, privacy, and quality of service (QoS) standards. Recent developments in \ac{FL} have made it possible to train complex machine-learned models in a distributed manner and have become an active research domain, particularly processing the medical data at the edge of the network in a decentralized way to preserve privacy and address security concerns. To this end, in this paper, we explore the present and future of FL technology in medical applications where data sharing is a significant challenge. We delve into the current research trends and their outcomes, unravelling the complexities of designing reliable and scalable \ac{FL} models. Our paper outlines the fundamental statistical issues in FL, tackles device-related problems, addresses security challenges, and navigates the complexity of privacy concerns, all while highlighting its transformative potential in the medical field. Our study primarily focuses on medical applications of \ac{FL}, particularly in the context of global cancer diagnosis. We highlight the potential of FL to enable computer-aided diagnosis tools that address this challenge with greater effectiveness than traditional data-driven methods. We hope that this comprehensive review will serve as a checkpoint for the field, summarizing the current state-of-the-art and identifying open problems and future research directions.Comment: Accepted at IEEE Internet of Things Journa

arXiv.org e-Print Archive

Security and blockchain convergence with internet of multimedia things : current trends, research challenges and future directions

Author: Alazab Mamoun
Cai Jinjin
Gao Xiang-Chuan
Jan Mian
Khan Fazlullah
Mastorakis Spyridon
Usman Muhammad
Watters Paul
Publication venue: 'Elsevier BV'
Publication date: 01/01/2021
Field of study

The Internet of Multimedia Things (IoMT) orchestration enables the integration of systems, software, cloud, and smart sensors into a single platform. The IoMT deals with scalar as well as multimedia data. In these networks, sensor-embedded devices and their data face numerous challenges when it comes to security. In this paper, a comprehensive review of the existing literature for IoMT is presented in the context of security and blockchain. The latest literature on all three aspects of security, i.e., authentication, privacy, and trust is provided to explore the challenges experienced by multimedia data. The convergence of blockchain and IoMT along with multimedia-enabled blockchain platforms are discussed for emerging applications. To highlight the significance of this survey, large-scale commercial projects focused on security and blockchain for multimedia applications are reviewed. The shortcomings of these projects are explored and suggestions for further improvement are provided. Based on the aforementioned discussion, we present our own case study for healthcare industry: a theoretical framework having security and blockchain as key enablers. The case study reflects the importance of security and blockchain in multimedia applications of healthcare sector. Finally, we discuss the convergence of emerging technologies with security, blockchain and IoMT to visualize the future of tomorrow's applications. © 2020 Elsevier Lt

Federation ResearchOnline

A Survey on Enterprise Network Security: Asset Behavioral Monitoring and Distributed Attack Detection

Author: Gharakheili Hassan Habibi
Lyu Minzhao
Sivaraman Vijay
Publication venue
Publication date: 29/06/2023
Field of study

Enterprise networks that host valuable assets and services are popular and frequent targets of distributed network attacks. In order to cope with the ever-increasing threats, industrial and research communities develop systems and methods to monitor the behaviors of their assets and protect them from critical attacks. In this paper, we systematically survey related research articles and industrial systems to highlight the current status of this arms race in enterprise network security. First, we discuss the taxonomy of distributed network attacks on enterprise assets, including distributed denial-of-service (DDoS) and reconnaissance attacks. Second, we review existing methods in monitoring and classifying network behavior of enterprise hosts to verify their benign activities and isolate potential anomalies. Third, state-of-the-art detection methods for distributed network attacks sourced from external attackers are elaborated, highlighting their merits and bottlenecks. Fourth, as programmable networks and machine learning (ML) techniques are increasingly becoming adopted by the community, their current applications in network security are discussed. Finally, we highlight several research gaps on enterprise network security to inspire future research.Comment: Journal paper submitted to Elseive

arXiv.org e-Print Archive

Maximizing Service Reliability in Distributed Computing Systems with Random Node Failures: Theory and Implementation

Author: Dhakal Sagar
Hayat Majeed M.
Pezoa Jorge E.
Publication venue: e-Publications@Marquette
Publication date: 01/10/2010
Field of study

In distributed computing systems (DCSs) where server nodes can fail permanently with nonzero probability, the system performance can be assessed by means of the service reliability, defined as the probability of serving all the tasks queued in the DCS before all the nodes fail. This paper presents a rigorous probabilistic framework to analytically characterize the service reliability of a DCS in the presence of communication uncertainties and stochastic topological changes due to node deletions. The framework considers a system composed of heterogeneous nodes with stochastic service and failure times and a communication network imposing random tangible delays. The framework also permits arbitrarily specified, distributed load-balancing actions to be taken by the individual nodes in order to improve the service reliability. The presented analysis is based upon a novel use of the concept of stochastic regeneration, which is exploited to derive a system of difference-differential equations characterizing the service reliability. The theory is further utilized to optimize certain load-balancing policies for maximal service reliability; the optimization is carried out by means of an algorithm that scales linearly with the number of nodes in the system. The analytical model is validated using both Monte Carlo simulations and experimental data collected from a DCS testbed