2,677 research outputs found

GUIDE FOR THE COLLECTION OF INSTRUSION DATA FOR MALWARE ANALYSIS AND DETECTION IN THE BUILD AND DEPLOYMENT PHASE

Author: Gassama Musa
Publication venue: The Repository at St. Cloud State
Publication date: 01/12/2022

During the COVID-19 pandemic, when most businesses were not equipped for remote work and cloud computing, we saw a significant surge in ransomware attacks. This study aims to utilize machine learning and artificial intelligence to prevent known and unknown malware threats from being exploited by threat actors when developers build and deploy applications to the cloud. The study used an experimental quantitative research design based on Aqua. The experiment's sample is a Docker image. Aqua checked the Docker image for malware, sensitive data, Critical/High vulnerabilities, misconfiguration, and OSS licenses. The data collection approach is experimental. Our analysis of the experiment demonstrated how unapproved images were prevented from running anywhere in our environment based on known vulnerabilities, embedded secrets, OSS licensing, dynamic threat analysis, and secure image configuration. In addition to the experiment, the forensic data collected in the build and deployment phase are exploitable vulnerabilities, Critical/High Vulnerability Score, Misconfiguration, Sensitive Data, and Root User (Super User). Since Aqua generates a detailed audit record for every event during risk assessment and runtime, we viewed two events on the Audit page for our experiment. One of the events caused an alert due to two failed controls (Vulnerability Score, Super User), and the other was a successful event, meaning that the image is secure to deploy in the production environment. The primary finding of our study is the forensic data associated with the two events on the Audit page in Aqua. In addition, Aqua validated our security controls and runtime policies based on the forensic data from both events on the Audit page. Finally, the study’s conclusions will reduce the likelihood that organizations fall victim to ransomware and will help limit the total damage caused by a malware attack.

St. Cloud State University
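
The gating behaviour described in the abstract above (an image is blocked when any control fails, and approved otherwise) can be illustrated with a minimal sketch. This is not Aqua's API: the `ScanResult` fields, the `failed_controls` function, the severity threshold, and the image names are all hypothetical, chosen only to mirror the controls named in the abstract.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical severity scale; the ordering is an assumption for illustration.
SEVERITY_RANK = {"none": 0, "low": 1, "medium": 2, "high": 3, "critical": 4}


@dataclass
class ScanResult:
    """Hypothetical summary of one image scan (not a real scanner's schema)."""
    image: str
    max_severity: str      # worst vulnerability severity found in the image
    malware_found: bool
    secrets_found: bool    # embedded secrets / sensitive data
    runs_as_root: bool     # container is configured to run as super user


def failed_controls(scan: ScanResult, max_allowed: str = "medium") -> List[str]:
    """Return the names of the controls that the scanned image fails."""
    failures = []
    if SEVERITY_RANK[scan.max_severity] > SEVERITY_RANK[max_allowed]:
        failures.append("Vulnerability Score")
    if scan.malware_found:
        failures.append("Malware")
    if scan.secrets_found:
        failures.append("Sensitive Data")
    if scan.runs_as_root:
        failures.append("Super User")
    return failures


def is_approved(scan: ScanResult) -> bool:
    """An image may be deployed only if no control fails."""
    return not failed_controls(scan)


# Two made-up events, loosely analogous to the two audit events in the abstract.
blocked = ScanResult("app:dev", "critical", False, False, True)
clean = ScanResult("app:release", "low", False, False, False)
print(failed_controls(blocked))
print(is_approved(clean))
```

Returning the list of failed controls, rather than a bare boolean, keeps the reason for a block available for an audit record.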

Real-world Machine Learning Systems: A survey from a Data-Oriented Architecture Perspective

Author: Cabrera Christian
Lawrence Neil D.
Paleyes Andrei
Thodoroff Pierre
Publication date: 09/10/2023

Machine Learning models are being deployed as parts of real-world systems with the upsurge of interest in artificial intelligence. The design, implementation, and maintenance of such systems are challenged by real-world environments that produce larger amounts of heterogeneous data and users requiring increasingly faster responses with efficient resource consumption. These requirements push prevalent software architectures to the limit when deploying ML-based systems. Data-oriented Architecture (DOA) is an emerging concept that equips systems better for integrating ML models. DOA extends current architectures to create data-driven, loosely coupled, decentralised, open systems. Even though papers on deployed ML-based systems do not mention DOA, their authors made design decisions that implicitly follow DOA. The reasons why, how, and the extent to which DOA is adopted in these systems are unclear. Implicit design decisions limit practitioners' knowledge of how to use DOA to design ML-based systems in the real world. This paper answers these questions by surveying real-world deployments of ML-based systems. The survey shows the design decisions of the systems and the requirements these satisfy. Based on the survey findings, we also formulate practical advice to facilitate the deployment of ML-based systems. Finally, we outline open challenges to deploying DOA-based systems that integrate ML models. Comment: Under review

arXiv.org e-Print Archive

A deep learning method for automatic SMS spam classification: Performance of learning algorithms on indigenous dataset

Author: Abayomi-Alli Adebayo
Abayomi-Alli Olusola
Misra Sanjay
Publication venue: Wiley
Publication date: 01/01/2022

SMS, one of the most popular and fast-growing GSM value-added services worldwide, has attracted unwanted SMS, also known as SMS spam. The effects of SMS spam are significant as it affects both the users and the service providers, causing a massive gap in trust between both parties. This article presents a deep learning model based on BiLSTM. Further, it compares our results with several state-of-the-art machine learning (ML) algorithms on two datasets: our newly collected dataset and the popular UCI SMS dataset. This study aims to evaluate the performance of diverse learning models and compare the results on the new expanded dataset (ExAIS_SMS) using the following metrics: true positive (TP), false positive (FP), F-measure, recall, precision, and overall accuracy. The average accuracy of the BiLSTM model was moderately better than that of some of the ML classifiers. The experimental results improved significantly over the ground truth results after effective fine-tuning of some of the parameters. The BiLSTM model attained an accuracy of 93.4% on the ExAIS_SMS dataset and 98.6% on the UCI dataset. Further comparison with the state-of-the-art ML classifiers gave accuracies of 89.64%, 91.11%, 88.24%, 75.76%, 80.24%, and 79.2% for Naive Bayes, BayesNet, SOM, decision tree, C4.5, and J48, respectively, on the ExAIS_SMS dataset. In conclusion, our proposed BiLSTM model showed significant improvement over traditional ML classifiers. To further validate the robustness of our model, we applied it to the UCI dataset, and our results showed optimal performance while classifying SMS spam messages based on some metrics: accuracy, precision, recall, and F-measure.

HIØ Brage

NORA - Norwegian Open Research Archives
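
For readers unfamiliar with the metrics named in the abstract above, the following is a minimal sketch of how precision, recall, F-measure, and accuracy are computed from confusion-matrix counts. The function name and the example counts are hypothetical and are not taken from the paper's experiments.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute standard binary-classification metrics from confusion counts.

    tp: spam messages correctly flagged as spam
    fp: legitimate messages wrongly flagged as spam
    tn: legitimate messages correctly passed
    fn: spam messages that were missed
    """
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "accuracy": accuracy,
    }


# Made-up counts for illustration only.
metrics = classification_metrics(tp=90, fp=10, tn=880, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

On an imbalanced spam dataset, accuracy alone can look high even when many spam messages are missed, which is why recall and F-measure are reported alongside it.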

Deep neural networks in the cloud: Review, applications, challenges and research directions

Author: Al-Zoubi Ala' M.
Yan Chan Kit
Publication venue: Elsevier
Publication date: 13/05/2023

Deep neural networks (DNNs) are currently being deployed as machine learning technology in a wide range of important real-world applications. DNNs consist of a huge number of parameters that require millions of floating-point operations (FLOPs) to be executed both in learning and prediction modes. A more effective method is to implement DNNs in a cloud computing system equipped with centralized servers and data storage sub-systems with high-speed and high-performance computing capabilities. This paper presents an up-to-date survey on current state-of-the-art deployed DNNs for cloud computing. Various DNN complexities associated with different architectures are presented and discussed alongside the necessities of using cloud computing. We also present an extensive overview of different cloud computing platforms for the deployment of DNNs and discuss them in detail. Moreover, DNN applications already deployed in cloud computing systems are reviewed to demonstrate the advantages of using cloud computing for DNNs. The paper emphasizes the challenges of deploying DNNs in cloud computing systems and provides guidance on enhancing current and new deployments. The EGIA project (KK-2022/00119); the Consolidated Research Group MATHMODE (IT1456-22)

Repositorio Institucional Universidad de Granada

Project Florida: Federated Learning Made Easy

Author: Chen Jialei
Diaz Daniel Madrigal
Manoel Andre
Sim Robert
Singal Nalin
Publication date: 21/07/2023

We present Project Florida, a system architecture and software development kit (SDK) enabling deployment of large-scale Federated Learning (FL) solutions across a heterogeneous device ecosystem. Federated learning is an approach to machine learning based on a strong data sovereignty principle, i.e., that privacy and security of data are best enabled by storing it at its origin, whether on end-user devices or in segregated cloud storage silos. Federated learning enables model training across devices and silos while the training data remains within its security boundary, by distributing a model snapshot to a client running inside the boundary, running client code to update the model, and then aggregating updated snapshots across many clients in a central orchestrator. Deploying an FL solution requires implementation of complex privacy and security mechanisms as well as scalable orchestration infrastructure. Scale and performance are paramount concerns, as the model training process benefits from full participation of many client devices, which may have a wide variety of performance characteristics. Project Florida aims to simplify the task of deploying cross-device FL solutions by providing cloud-hosted infrastructure and accompanying task management interfaces, as well as a multi-platform SDK supporting most major programming languages including C++, Java, and Python, enabling FL training across a wide range of operating system (OS) and hardware specifications. The architecture decouples service management from the FL workflow, enabling a cloud service provider to deliver FL-as-a-service (FLaaS) to ML engineers and application developers. We present an overview of Florida, including a description of the architecture, sample code, and illustrative experiments demonstrating system capabilities.

arXiv.org e-Print Archive
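
The training loop described in the abstract above (distribute a model snapshot, update it on each client, aggregate the updated snapshots centrally) can be sketched in a few lines. This is not the Project Florida SDK: it is a generic weighted-averaging scheme in the style of federated averaging, applied to a toy one-dimensional linear model, and every name, dataset, and hyperparameter below is an assumption made for illustration.

```python
from typing import List, Tuple

Weights = List[float]                 # [w, b] for the toy model y = w * x + b
Dataset = List[Tuple[float, float]]   # (x, y) pairs that never leave the client


def client_update(snapshot: Weights, data: Dataset,
                  lr: float = 0.1, epochs: int = 5) -> Tuple[Weights, int]:
    """Run a few local gradient-descent steps on the client's private data."""
    w, b = snapshot
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    # Only the updated weights and the sample count are returned, not the data.
    return [w, b], n


def aggregate(updates: List[Tuple[Weights, int]]) -> Weights:
    """Average client snapshots, weighting each by its number of samples."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(weights[i] * n for weights, n in updates) / total
            for i in range(dim)]


def federated_round(global_model: Weights, clients: List[Dataset]) -> Weights:
    """One round: distribute the snapshot, update locally, aggregate centrally."""
    updates = [client_update(list(global_model), data) for data in clients]
    return aggregate(updates)


# Two hypothetical clients whose private data follows y = 2x + 1.
clients = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0), (1.0, 3.0)],
]
model = [0.0, 0.0]
for _ in range(200):
    model = federated_round(model, clients)
print(model)
```

A production system would add the privacy, security, and orchestration mechanisms the abstract mentions; this sketch only shows the data flow.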

Thirty Years of Machine Learning: The Road to Pareto-Optimal Wireless Networks

Author: Chen Kwang-Cheng
Hanzo Lajos
Jiang Chunxiao
Ren Yong
Wang Jingjing
Zhang Haijun
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 13/01/2019

Future wireless networks have substantial potential to support a broad range of complex, compelling applications in both military and civilian fields, where users are able to enjoy high-rate, low-latency, low-cost and reliable information services. Achieving this ambitious goal requires new radio techniques for adaptive learning and intelligent decision making because of the complex heterogeneous nature of the network structures and wireless services. Machine learning (ML) algorithms have had great success in supporting big data analytics, efficient parameter estimation and interactive decision making. Hence, in this article, we review the thirty-year history of ML by elaborating on supervised learning, unsupervised learning, reinforcement learning and deep learning. Furthermore, we investigate their employment in the compelling applications of wireless networks, including heterogeneous networks (HetNets), cognitive radios (CR), Internet of things (IoT), machine to machine networks (M2M), and so on. This article aims to help readers clarify the motivation and methodology of the various ML algorithms, so that they can invoke them for hitherto unexplored services and scenarios of future wireless networks. Comment: 46 pages, 22 fig

arXiv.org e-Print Archive

Southampton (e-Prints Soton)

Smart Buildings

Author: Corchado Rodríguez Juan Manuel
Publication venue: Smart CRD
Publication date: 20/05/2021

This talk presents an efficient cyberphysical platform for the smart management of smart buildings (http://www.deepint.net). It is efficient because it facilitates the implementation of data acquisition and data management methods, as well as data representation and dashboard configuration. The platform allows for the use of any type of data source, ranging from the measurements of multi-functional IoT sensing devices to relational and non-relational databases. It is also smart because it incorporates a complete artificial intelligence suite for data analysis; it includes techniques for data classification, clustering, forecasting, optimization, visualization, etc. It is also compatible with the edge computing concept, allowing for the distribution of intelligence and the use of intelligent sensors. The concept of the smart building is evolving and adapting to new applications; the trend to create intelligent neighbourhoods, districts or territories is becoming increasingly popular, as opposed to the previous approach of managing an entire megacity. In this paper, the platform is presented, and its architecture and functionalities are described. Moreover, its operation has been validated in a case study at Salamanca - Ecocasa. This platform could enable smart buildings to develop adapted knowledge management systems, adapt them to new requirements, use multiple types of data, and execute efficient computational and artificial intelligence algorithms. The platform optimizes the decisions taken by human experts through explainable artificial intelligence models that obtain data from IoT sensors, databases, the Internet, etc. The global intelligence of the platform could potentially coordinate its decision-making processes with intelligent nodes installed in the edge, which would use the most advanced data processing techniques.

Gestion del Repositorio Documental de la Universidad de Salamanca