2,677 research outputs found
GUIDE FOR THE COLLECTION OF INTRUSION DATA FOR MALWARE ANALYSIS AND DETECTION IN THE BUILD AND DEPLOYMENT PHASE
During the COVID-19 pandemic, when most businesses were not equipped for remote work and cloud computing, we saw a significant surge in ransomware attacks. This study aims to use machine learning and artificial intelligence to prevent threat actors from exploiting known and unknown malware threats when developers build and deploy applications to the cloud. The study follows an experimental quantitative research design using Aqua. The experiment's sample is a Docker image. Aqua checked the Docker image for malware, sensitive data, Critical/High vulnerabilities, misconfiguration, and OSS licensing. The data collection approach is experimental. Our analysis of the experiment demonstrated how unapproved images were prevented from running anywhere in our environment based on known vulnerabilities, embedded secrets, OSS licensing, dynamic threat analysis, and secure image configuration. In addition, the forensic data collected in the build and deployment phase are exploitable vulnerability, Critical/High Vulnerability Score, Misconfiguration, Sensitive Data, and Root User (Super User). Since Aqua generates a detailed audit record for every event during risk assessment and runtime, we viewed two events on the Audit page for our experiment. One of the events caused an alert due to two failed controls (Vulnerability Score, Super User), and the other was a successful event, meaning that the image is secure to deploy in the production environment. The primary finding of our study is the forensic data associated with the two events on the Audit page in Aqua. In addition, Aqua validated our security controls and runtime policies based on the forensic data for both events on the Audit page. Finally, the study’s conclusions will reduce the likelihood that organizations fall victim to ransomware by mitigating and preventing the damage caused by a malware attack.
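The gating behaviour described in this abstract (an image is blocked when a control such as Vulnerability Score or Super User fails, and allowed otherwise) can be illustrated with a minimal Python sketch. This is not Aqua's API: `ScanResult`, `failed_controls`, and `audit_event` are hypothetical names, and the severity threshold is an assumption taken from the "Critical/High" wording above.

```python
from dataclasses import dataclass

# Hypothetical scan summary; field names are illustrative, not Aqua's schema.
@dataclass
class ScanResult:
    image: str
    max_severity: str = "none"   # highest vulnerability severity found
    runs_as_root: bool = False   # image configured to run as root (super user)
    has_malware: bool = False
    has_secrets: bool = False    # embedded sensitive data

# Assumption: Critical and High findings block deployment.
BLOCKING_SEVERITIES = {"critical", "high"}

def failed_controls(scan: ScanResult) -> list[str]:
    """Return the names of the controls this scan fails, in a fixed order."""
    failed = []
    if scan.max_severity.lower() in BLOCKING_SEVERITIES:
        failed.append("Vulnerability Score")
    if scan.runs_as_root:
        failed.append("Super User")
    if scan.has_malware:
        failed.append("Malware")
    if scan.has_secrets:
        failed.append("Sensitive Data")
    return failed

def audit_event(scan: ScanResult) -> dict:
    """Build an audit record: the image is allowed only if no control failed."""
    failed = failed_controls(scan)
    return {"image": scan.image, "allowed": not failed, "failed_controls": failed}
```

For example, `audit_event(ScanResult("app:dev", max_severity="High", runs_as_root=True))` yields a blocked record listing both failed controls, mirroring the alert event in the abstract, while a clean scan yields an allowed record.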
Real-world Machine Learning Systems: A survey from a Data-Oriented Architecture Perspective
Machine Learning models are being deployed as parts of real-world systems
with the upsurge of interest in artificial intelligence. The design,
implementation, and maintenance of such systems are challenged by real-world
environments that produce larger amounts of heterogeneous data and users
requiring increasingly faster responses with efficient resource consumption.
These requirements push prevalent software architectures to the limit when
deploying ML-based systems. Data-oriented Architecture (DOA) is an emerging
concept that equips systems better for integrating ML models. DOA extends
current architectures to create data-driven, loosely coupled, decentralised,
open systems. Even though papers on deployed ML-based systems do not mention
DOA, their authors made design decisions that implicitly follow DOA. The
reasons why, how, and the extent to which DOA is adopted in these systems are
unclear. Implicit design decisions limit the practitioners' knowledge of DOA to
design ML-based systems in the real world. This paper answers these questions
by surveying real-world deployments of ML-based systems. The survey shows the
design decisions of the systems and the requirements these satisfy. Based on
the survey findings, we also formulate practical advice to facilitate the
deployment of ML-based systems. Finally, we outline open challenges to
deploying DOA-based systems that integrate ML models. Comment: Under review.
A deep learning method for automatic SMS spam classification: Performance of learning algorithms on indigenous dataset
SMS, one of the most popular and fast-growing GSM value-added services worldwide, has attracted unwanted messages, also known as SMS spam. The effects of SMS spam are significant, as it affects both users and service providers and causes a massive gap in trust between the two parties. This article presents a deep learning model based on BiLSTM and compares its results with some state-of-the-art machine learning (ML) algorithms on two datasets: our newly collected dataset and the popular UCI SMS dataset. This study aims to evaluate the performance of diverse learning models and compare the results on the new expanded dataset (ExAIS_SMS) using the following metrics: true positives (TP), false positives (FP), F-measure, recall, precision, and overall accuracy. The BiLSTM model achieved moderately better average accuracy than some of the ML classifiers. The experimental results improved significantly over the ground truth results after effective fine-tuning of some of the parameters. The BiLSTM model attained an accuracy of 93.4% on the ExAIS_SMS dataset and 98.6% on the UCI dataset. In a further comparison on the ExAIS_SMS dataset, the state-of-the-art ML classifiers Naive Bayes, BayesNet, SOM, decision tree, C4.5, and J48 achieved accuracies of 89.64%, 91.11%, 88.24%, 75.76%, 80.24%, and 79.2%, respectively. In conclusion, our proposed BiLSTM model showed significant improvement over traditional ML classifiers. To further validate the robustness of our model, we applied it to the UCI dataset, and our results showed optimal performance in classifying SMS spam messages based on accuracy, precision, recall, and F-measure.
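The evaluation metrics named in this abstract follow from the confusion-matrix counts by their standard definitions. The sketch below shows those definitions in Python; the function name `classification_metrics` and the example counts are illustrative assumptions and are not taken from the paper.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute precision, recall, F-measure and accuracy from confusion counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives for the "spam" class. Ratios with a zero denominator
    are reported as 0.0.
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure (F1) is the harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "accuracy": accuracy,
    }
```

With made-up counts of 8 true positives, 2 false positives, 2 false negatives, and 88 true negatives, precision, recall, and F-measure are all about 0.8 while accuracy is 0.96, which shows why accuracy alone can flatter a classifier on imbalanced spam data.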
Deep neural networks in the cloud: Review, applications, challenges and research directions
Deep neural networks (DNNs) are currently being deployed as machine learning technology in a wide
range of important real-world applications. DNNs consist of a huge number of parameters that require
millions of floating-point operations (FLOPs) to be executed both in learning and prediction modes. A
more effective method is to implement DNNs in a cloud computing system equipped with centralized
servers and data storage sub-systems with high-speed and high-performance computing capabilities.
This paper presents an up-to-date survey on current state-of-the-art deployed DNNs for cloud computing.
Various DNN complexities associated with different architectures are presented and discussed alongside
the necessities of using cloud computing. We also present an extensive overview of different cloud
computing platforms for the deployment of DNNs and discuss them in detail. Moreover, DNN applications
already deployed in cloud computing systems are reviewed to demonstrate the advantages of using
cloud computing for DNNs. The paper emphasizes the challenges of deploying DNNs in cloud computing
systems and provides guidance on enhancing current and new deployments.
Funding: the EGIA project (KK-2022/00119) and the Consolidated Research Group MATHMODE (IT1456-22).
Project Florida: Federated Learning Made Easy
We present Project Florida, a system architecture and software development
kit (SDK) enabling deployment of large-scale Federated Learning (FL) solutions
across a heterogeneous device ecosystem. Federated learning is an approach to
machine learning based on a strong data sovereignty principle, i.e., that
privacy and security of data are best enabled by storing it at its origin,
whether on end-user devices or in segregated cloud storage silos. Federated
learning enables model training across devices and silos while the training
data remains within its security boundary, by distributing a model snapshot to
a client running inside the boundary, running client code to update the model,
and then aggregating updated snapshots across many clients in a central
orchestrator. Deploying an FL solution requires implementation of complex
privacy and security mechanisms as well as scalable orchestration
infrastructure. Scale and performance are paramount concerns, as the model
training process benefits from full participation of many client devices, which
may have a wide variety of performance characteristics. Project Florida aims to
simplify the task of deploying cross-device FL solutions by providing
cloud-hosted infrastructure and accompanying task management interfaces, as
well as a multi-platform SDK supporting most major programming languages
including C++, Java, and Python, enabling FL training across a wide range of
operating system (OS) and hardware specifications. The architecture decouples
service management from the FL workflow, enabling a cloud service provider to
deliver FL-as-a-service (FLaaS) to ML engineers and application developers. We
present an overview of Florida, including a description of the architecture,
sample code, and illustrative experiments demonstrating system capabilities.
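The training loop summarised in this abstract (distribute a model snapshot, update it on each client, aggregate the updated snapshots centrally) can be sketched with plain federated averaging. This is a minimal illustration, not Project Florida's SDK: the function names, the linear least-squares model, and the size-weighted average are all assumptions chosen to keep the example self-contained.

```python
def client_update(weights, data, lr=0.05):
    """One gradient step of least squares (y ~ w . x) on a client's local data.

    The raw (x, y) pairs never leave this function; only weights are returned.
    """
    grad = [0.0] * len(weights)
    for x, y in data:
        err = sum(w * xi for w, xi in zip(weights, x)) - y
        for i, xi in enumerate(x):
            grad[i] += 2 * err * xi / len(data)
    return [w - lr * g for w, g in zip(weights, grad)]

def aggregate(updates):
    """Average client snapshots, weighted by each client's example count.

    updates is a list of (weights, num_examples) pairs.
    """
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(w[i] * n for w, n in updates) / total for i in range(dim)]

def federated_round(global_weights, client_datasets, lr=0.05):
    """Send the global snapshot to every client, then aggregate the results."""
    updates = [
        (client_update(list(global_weights), data, lr), len(data))
        for data in client_datasets
    ]
    return aggregate(updates)
```

Repeating `federated_round` over clients whose data follow the same underlying relation drives the shared weights toward that relation, even though the orchestrator only ever sees weight vectors and example counts. A production system would add the privacy, security, and device-selection mechanisms the abstract mentions.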
Thirty Years of Machine Learning: The Road to Pareto-Optimal Wireless Networks
Future wireless networks have a substantial potential in terms of supporting
a broad range of complex compelling applications both in military and civilian
fields, where the users are able to enjoy high-rate, low-latency, low-cost and
reliable information services. Achieving this ambitious goal requires new radio
techniques for adaptive learning and intelligent decision making because of the
complex heterogeneous nature of the network structures and wireless services.
Machine learning (ML) algorithms have had great success in supporting big data
analytics, efficient parameter estimation and interactive decision making.
Hence, in this article, we review the thirty-year history of ML by elaborating
on supervised learning, unsupervised learning, reinforcement learning and deep
learning. Furthermore, we investigate their employment in the compelling
applications of wireless networks, including heterogeneous networks (HetNets),
cognitive radios (CR), Internet of things (IoT), machine to machine networks
(M2M), and so on. This article aims to assist readers in clarifying the
motivation and methodology of the various ML algorithms, so as to invoke them
for hitherto unexplored services as well as scenarios of future wireless
networks. Comment: 46 pages, 22 figures.
Smart Buildings
This talk presents an efficient cyberphysical platform for the smart management of smart buildings (http://www.deepint.net). It is efficient because it facilitates the implementation of data acquisition and data management methods, as well as data representation and dashboard configuration. The platform allows for the use of any type of data source, ranging from the measurements of multi-functional IoT sensing devices to relational and non-relational databases. It is also smart because it incorporates a complete artificial intelligence suite for data analysis; it includes techniques for data classification, clustering, forecasting, optimization, visualization, etc. It is also compatible with the edge computing concept, allowing for the distribution of intelligence and the use of intelligent sensors. The concept of the smart building is evolving and adapting to new applications; the trend to create intelligent neighbourhoods, districts or territories is becoming increasingly popular, as opposed to the previous approach of managing an entire megacity. In this paper, the platform is presented, and its architecture and functionalities are described. Moreover, its operation has been validated in a case study at Salamanca - Ecocasa. This platform could enable smart buildings to develop adapted knowledge management systems, adapt them to new requirements, use multiple types of data, and execute efficient computational and artificial intelligence algorithms. The platform optimizes the decisions taken by human experts through explainable artificial intelligence models that obtain data from IoT sensors, databases, the Internet, etc. The global intelligence of the platform could potentially coordinate its decision-making processes with intelligent nodes installed at the edge, which would use the most advanced data processing techniques.