21,984 research outputs found
StackInsights: Cognitive Learning for Hybrid Cloud Readiness
Hybrid cloud is an integrated cloud computing environment utilizing a mix of
public cloud, private cloud, and on-premise traditional IT infrastructures.
Workload awareness, defined as a detailed full range understanding of each
individual workload, is essential in implementing the hybrid cloud. While it is
critical to perform an accurate analysis to determine which workloads are
appropriate for on-premise deployment versus which workloads can be migrated to
a cloud off-premise, the assessment is mainly performed by rule or policy based
approaches. In this paper, we introduce StackInsights, a novel cognitive system
to automatically analyze and predict the cloud readiness of workloads for an
enterprise. Our system harnesses the critical metrics across the entire stack:
1) infrastructure metrics, 2) data relevance metrics, and 3) application
taxonomy, to identify workloads that have characteristics of a) low sensitivity
with respect to business security, criticality and compliance, and b) low
response time requirements and access patterns. Since the capture of the data
relevance metrics involves an intrusive and in-depth scanning of the content of
storage objects, a machine learning model is applied to perform the business
relevance classification by learning from the meta level metrics harnessed
across stack. In contrast to traditional methods, StackInsights significantly
reduces the total time for hybrid cloud readiness assessment by orders of
magnitude
Analysis and evaluation of SafeDroid v2.0, a framework for detecting malicious Android applications
Android smartphones have become a vital component of the daily routine of millions of people, running a plethora of applications available in the official and alternative marketplaces. Although there are many security mechanisms to scan and filter malicious applications, malware is still able to reach the devices of many end-users. In this paper, we introduce the SafeDroid v2.0 framework, that is a flexible, robust, and versatile open-source solution for statically analysing Android applications, based on machine learning techniques. The main goal of our work, besides the automated production of fully sufficient prediction and classification models in terms of maximum accuracy scores and minimum negative errors, is to offer an out-of-the-box framework that can be employed by the Android security researchers to efficiently experiment to find effective solutions: the SafeDroid v2.0 framework makes it possible to test many different combinations of machine learning classifiers, with a high degree of freedom and flexibility in the choice of features to consider, such as dataset balance and dataset selection. The framework also provides a server, for generating experiment reports, and an Android application, for the verification of the produced models in real-life scenarios. An extensive campaign of experiments is also presented to show how it is possible to efficiently find competitive solutions: the results of our experiments confirm that SafeDroid v2.0 can reach very good performances, even with highly unbalanced dataset inputs and always with a very limited overhead
Auto-tuning Distributed Stream Processing Systems using Reinforcement Learning
Fine tuning distributed systems is considered to be a craftsmanship, relying
on intuition and experience. This becomes even more challenging when the
systems need to react in near real time, as streaming engines have to do to
maintain pre-agreed service quality metrics. In this article, we present an
automated approach that builds on a combination of supervised and reinforcement
learning methods to recommend the most appropriate lever configurations based
on previous load. With this, streaming engines can be automatically tuned
without requiring a human to determine the right way and proper time to deploy
them. This opens the door to new configurations that are not being applied
today since the complexity of managing these systems has surpassed the
abilities of human experts. We show how reinforcement learning systems can find
substantially better configurations in less time than their human counterparts
and adapt to changing workloads
Representation Learning for Clustering: A Statistical Framework
We address the problem of communicating domain knowledge from a user to the
designer of a clustering algorithm. We propose a protocol in which the user
provides a clustering of a relatively small random sample of a data set. The
algorithm designer then uses that sample to come up with a data representation
under which -means clustering results in a clustering (of the full data set)
that is aligned with the user's clustering. We provide a formal statistical
model for analyzing the sample complexity of learning a clustering
representation with this paradigm. We then introduce a notion of capacity of a
class of possible representations, in the spirit of the VC-dimension, showing
that classes of representations that have finite such dimension can be
successfully learned with sample size error bounds, and end our discussion with
an analysis of that dimension for classes of representations induced by linear
embeddings.Comment: To be published in Proceedings of UAI 201
- …