24 research outputs found

    A Blockchain-based Decentralized Electronic Marketplace for Computing Resources

    We propose a framework for building a decentralized electronic marketplace for computing resources. The idea is that anyone with spare capacity can offer it on this marketplace, opening up the cloud computing market to smaller players and creating a more competitive environment than today's market of a few large providers. Trust is a crucial component in making an anonymized decentralized marketplace a reality. We develop protocols that enable participants to interact with each other in a fair way and show how these protocols can be implemented using smart contracts and blockchains. We discuss and evaluate our framework not only from a technical point of view, but also consider the wider context in terms of fair interactions and legal implications.
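
    To make the fair-interaction idea concrete, the sketch below models an escrow-style trade of computing resources as a simple state machine, written in Python for readability rather than in an actual contract language. The states, deposit rules, and timeout handling are illustrative assumptions, not the protocol proposed in the paper.

```python
# Illustrative escrow state machine for a compute-resource trade.
# All names and rules are assumptions for exposition; the paper's
# actual smart-contract protocol may differ.
import time
from dataclasses import dataclass
from enum import Enum, auto

class State(Enum):
    OPEN = auto()       # offer published, waiting for a buyer
    FUNDED = auto()     # buyer escrowed payment, provider escrowed a bond
    DELIVERED = auto()  # provider claims the result was delivered
    SETTLED = auto()    # payment released, provider bond returned
    REFUNDED = auto()   # buyer refunded, e.g., after a missed deadline

@dataclass
class ComputeEscrow:
    price: int          # payment the buyer must escrow
    bond: int           # stake the provider must escrow
    deadline: float     # unix time by which delivery must occur
    state: State = State.OPEN

    def fund(self, payment: int, stake: int) -> None:
        assert self.state is State.OPEN
        assert payment >= self.price and stake >= self.bond
        self.state = State.FUNDED

    def deliver(self, result_hash: str) -> None:
        # On a real chain, result_hash would commit to the output so a
        # later dispute could check it; here it is only recorded.
        assert self.state is State.FUNDED
        self.result_hash = result_hash
        self.state = State.DELIVERED

    def settle(self, buyer_accepts: bool) -> None:
        assert self.state is State.DELIVERED
        self.state = State.SETTLED if buyer_accepts else State.REFUNDED

    def timeout(self, now: float | None = None) -> None:
        # Anyone may trigger a refund once the deadline has passed.
        if (now or time.time()) > self.deadline and self.state is State.FUNDED:
            self.state = State.REFUNDED
```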

    Learning workload behaviour models from monitored time-series for resource estimation towards data center optimization

    In recent years there has been an extraordinary growth in the demand for Cloud Computing resources executed in Data Centers. Modern Data Centers are complex systems that need management. As distributed computing systems grow and workloads benefit from such computing environments, the management of these systems increases in complexity. The complexity of resource usage and power consumption in cloud-based applications makes it difficult to understand application behavior through expert examination. The difficulty increases when applications are seen as "black boxes", where only external monitoring can be retrieved. Furthermore, given the wide variety of scenarios and applications, automation is required. To deal with such complexity, Machine Learning methods become crucial to facilitate tasks that can be automatically learned from data. Firstly, this thesis proposes an unsupervised learning technique to learn high-level representations from workload traces. This technique provides a fast methodology to characterize workloads as sequences of abstract phases. The learned phase representation is validated on a variety of datasets and used in an auto-scaling task, where we show that it can be applied in a production environment, achieving better performance than other state-of-the-art techniques. Secondly, this thesis proposes a neural architecture, based on Sequence-to-Sequence models, that predicts the expected resource usage of applications sharing hardware resources. The proposed technique gives resource managers the ability to predict resource usage over time as well as the completion time of the running applications, and yields lower prediction error than other popular Machine Learning methods. Thirdly, this thesis proposes a technique for auto-tuning Big Data workloads from the available tunable parameters. The proposed technique gathers information from the logs of an application, generating a feature descriptor that captures relevant information about the application to be tuned. Using this information, we demonstrate that performance models can generalize up to 34% better than other state-of-the-art solutions. Moreover, the search time to find a suitable configuration can be drastically reduced, with up to a 12x speedup while achieving results of almost equal quality to modern solutions. These results show that modern learning algorithms, given the right feature information, provide powerful techniques for managing resource allocation for applications running in cloud environments. This thesis demonstrates that learning algorithms enable relevant optimizations in Data Center environments, where applications are externally monitored and careful resource management is paramount to using computing resources efficiently. We demonstrate this thesis in three areas that revolve around resource management in server environments.
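
    As an illustration of the phase-characterization idea, the sketch below maps fixed-size windows of a monitored metric to abstract phase labels by clustering them. The windowing scheme and the choice of k-means are assumptions made for exposition; the representation-learning technique in the thesis is more elaborate.

```python
# Sketch: turn a monitored time-series (e.g., CPU usage) into a sequence
# of abstract phase labels by clustering fixed-size windows.
# The window size and the use of k-means are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans

def phase_sequence(trace: np.ndarray, window: int = 30, n_phases: int = 5) -> np.ndarray:
    # Split the trace into non-overlapping windows (dropping the ragged tail);
    # each window becomes one point, and its cluster id is its phase label.
    n = len(trace) // window
    windows = trace[: n * window].reshape(n, window)
    return KMeans(n_clusters=n_phases, n_init=10).fit_predict(windows)

# Example: a synthetic trace alternating between an idle and a busy phase.
rng = np.random.default_rng(0)
trace = np.concatenate([rng.normal(m, 2.0, 300) for m in (10, 80, 10, 80)])
print(phase_sequence(trace, n_phases=2))  # e.g., [0 0 ... 1 1 ... 0 0 ... 1 1]
```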

    Technologies and Applications for Big Data Value

    This open access book explores cutting-edge solutions and best practices for big data and data-driven AI applications for the data-driven economy. It provides the reader with a basis for understanding how technical issues can be overcome to offer real-world solutions to major industrial areas. The book starts with an introductory chapter that provides an overview of the book by positioning the following chapters in terms of their contributions to technology frameworks which are key elements of the Big Data Value Public-Private Partnership and the upcoming Partnership on AI, Data and Robotics. The remainder of the book is arranged in two parts. The first part, “Technologies and Methods”, contains horizontal contributions of technologies and methods that enable data value chains to be applied in any sector. The second part, “Processes and Applications”, details experience reports and lessons from using big data and data-driven approaches in processes and applications. Its chapters are co-authored with industry experts and cover domains including health, law, finance, retail, manufacturing, mobility, and smart cities. Contributions emanate from the Big Data Value Public-Private Partnership and the Big Data Value Association, which have acted as the European data community's nucleus to bring together businesses and leading researchers to harness the value of data for the benefit of society, business, science, and industry. The book is of interest to two primary audiences: first, undergraduate and postgraduate students and researchers in various fields, including big data, data science, data engineering, and machine learning and AI; and second, practitioners and industry experts engaged in data-driven systems and software design and deployment projects who are interested in employing these advanced methods to address real-world problems.

    Internet of Things From Hype to Reality

    The Internet of Things (IoT) has gained significant mindshare and attention in academia and industry, especially over the past few years. The reasons behind this interest are the potential capabilities that IoT promises to offer. On the personal level, it paints a picture of a future world where all the things in our ambient environment are connected to the Internet and seamlessly communicate with each other to operate intelligently. The ultimate goal is to enable objects around us to efficiently sense our surroundings, inexpensively communicate, and ultimately create a better environment for us: one where everyday objects act based on what we need and like, without explicit instructions.

    Detection and Mitigation of Steganographic Malware

    A new attack trend concerns the use of some form of steganography and information hiding to make malware stealthier and able to elude many standard security mechanisms. This Thesis therefore addresses the detection and mitigation of this class of threats. In particular, it considers malware implementing covert communications within network traffic or cloaking malicious payloads within digital images. The first research contribution of this Thesis is in the detection of network covert channels. Unfortunately, the literature on the topic lacks real traffic traces or attack samples with which to perform precise tests or security assessments. Thus, a preparatory research activity was devoted to developing two ad-hoc tools. The first allows covert channels targeting the IPv6 protocol to be created by eavesdropping on flows, whereas the second allows secret data to be embedded within arbitrary traffic traces that can be replayed to perform investigations in realistic conditions. This Thesis then starts with a security assessment concerning the impact of hidden network communications in production-quality scenarios. Results were obtained by considering channels cloaking data in the most popular protocols (e.g., TLS, IPv4/v6, and ICMPv4/v6) and showed that de-facto standard intrusion detection systems and firewalls (i.e., Snort, Suricata, and Zeek) are unable to spot this class of hazards. Since malware can conceal information (e.g., commands and configuration files) in almost any protocol, traffic feature, or network element, configuring or adapting pre-existing security solutions may not be straightforward. Moreover, inspecting multiple protocols, fields, or conversations at the same time can lead to performance issues. Thus, a major effort was devoted to developing a suite based on the extended Berkeley Packet Filter (eBPF) to gain visibility over different network protocols and components and to efficiently collect various performance indicators or statistics using a single technology. This part of the research made it possible to spot network covert channels targeting the header of the IPv6 protocol or the inter-packet time of generic network conversations. In addition, the eBPF-based approach turned out to be very flexible and also revealed hidden data transfers between two processes co-located on the same host. Another important contribution of this part of the Thesis concerns the deployment of the suite in realistic scenarios and its comparison with similar tools. Specifically, a thorough performance evaluation demonstrated that eBPF can be used to inspect traffic and reveal the presence of covert communications even under high loads, e.g., it can sustain rates up to 3 Gbit/s with commodity hardware. To further address the problem of revealing network covert channels in realistic environments, this Thesis also investigates malware targeting traffic generated by Internet of Things devices. In this case, an incremental ensemble of autoencoders was considered to cope with the "unknown" location of the hidden data generated by a threat covertly exchanging commands with a remote attacker. The second research contribution of this Thesis is in the detection of malicious payloads hidden within digital images. In fact, the majority of real-world malware exploits hiding methods based on Least Significant Bit steganography and some of its variants, such as the Invoke-PSImage mechanism. Therefore, a relevant amount of research was devoted to detecting the presence of hidden data and classifying the payload (e.g., malicious PowerShell scripts or PHP fragments). To this aim, mechanisms leveraging Deep Neural Networks (DNNs) proved to be flexible and effective since they can learn from raw low-level data and can be updated or retrained to consider unseen payloads or images with different features. To take realistic threat models into account, this Thesis studies malware targeting different types of images (i.e., favicons and icons) and various payloads (e.g., URLs and Ethereum addresses, as well as webshells). The obtained results showed that DNNs can be considered a valid tool for spotting hidden content, since their detection accuracy remains above 90% even when facing "elusion" mechanisms such as basic obfuscation techniques or alternative encoding schemes. Lastly, when detection or classification is not possible (e.g., due to resource constraints), approaches enforcing "sanitization" can be applied. Thus, this Thesis also considers autoencoders able to disrupt hidden malicious content without degrading the quality of the image.
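
    To make the Least Significant Bit threat model concrete, the sketch below extracts the LSB plane of an image and applies a crude printable-byte heuristic to flag a possible hidden script. The thesis relies on trained DNNs rather than this heuristic, and the file name below is a placeholder.

```python
# Sketch: recover the least-significant-bit plane of an image and flag
# payloads that look like text (e.g., PowerShell or PHP). A stand-in for
# the DNN-based detectors discussed above, not the thesis's method.
from PIL import Image

def lsb_bytes(path: str) -> bytes:
    img = Image.open(path).convert("RGB")
    bits = [channel & 1 for pixel in img.getdata() for channel in pixel]
    # Pack bits into bytes (MSB first), one common LSB-embedding layout.
    return bytes(
        int("".join(map(str, bits[i : i + 8])), 2)
        for i in range(0, len(bits) - 7, 8)
    )

def looks_like_text(payload: bytes, probe: int = 512) -> bool:
    # Hidden scripts skew the byte histogram toward printable ASCII,
    # whereas a clean image yields a near-uniform LSB plane.
    sample = payload[:probe]
    return sum(32 <= b < 127 for b in sample) / max(len(sample), 1) > 0.8

payload = lsb_bytes("suspect.png")  # hypothetical input file
print("suspicious LSB payload" if looks_like_text(payload) else "likely clean")
```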

    Personal Data Management in the Internet of Things

    Due to a sharp decrease in hardware costs and shrinking form factors, networked sensors have become ubiquitous. Today, a variety of sensors are embedded into smartphones, tablets, and personal wearable devices, and are commonly installed in homes and buildings. Sensors are used to collect data about people in their proximity, referred to as users. The collection of such networked sensors is commonly referred to as the Internet of Things. Although sensor data enables a wide range of applications from security, to efficiency, to healthcare, this data can be used to reveal unwarranted private information about users. Thus it is imperative to preserve data privacy while providing users with a wide variety of applications to process their personal data. Unfortunately, most existing systems do not meet these goals. Users are either forced to release their data to third parties, such as application developers, thus giving up data privacy in exchange for using data-driven applications, or are limited to a fixed set of applications, such as those provided by the sensor manufacturer. To avoid this trade-off, users may choose to host their data and applications on their personal devices, but this requires them to maintain data backups and ensure application performance. What is needed, therefore, is a system that gives users flexibility in their choice of data-driven applications while preserving their data privacy, without burdening them with the need to back up their data or to provide computational resources for their applications. We propose a software architecture that leverages a user's personal virtual execution environment (VEE) to host data-driven applications. This dissertation describes the key software techniques and mechanisms necessary to enable this architecture. First, we provide a proof-of-concept implementation of our proposed architecture and demonstrate, as a case study, a privacy-preserving ecosystem of applications that process users' energy data. Second, we present a data management system (called Bolt) that provides applications with efficient storage and retrieval of time-series data, and guarantees the confidentiality and integrity of stored data. We then present a methodology to provision large numbers of personal VEEs on a single physical machine, and demonstrate its use with LinuX Containers (LXC). We conclude by outlining the design of an abstract framework that allows users to balance data privacy and application utility.
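
    As a rough illustration of the storage guarantees described above, the sketch below encrypts each time-series record and chains HMAC tags so that tampering, reordering, or deletion is detectable on read-back. The class and its layout are assumptions for exposition and do not reflect Bolt's actual design.

```python
# Toy encrypted time-series log: confidentiality via Fernet, integrity
# via an HMAC hash chain. Not Bolt's design; names are assumptions.
import hashlib, hmac, json, time
from cryptography.fernet import Fernet

class TinyTimeSeriesStore:
    def __init__(self, enc_key: bytes, mac_key: bytes):
        self.fernet = Fernet(enc_key)   # encrypts each record
        self.mac_key = mac_key          # keys the integrity chain
        self.chunks = []                # list of (ciphertext, tag) pairs
        self.prev_tag = b"genesis"

    def append(self, timestamp: float, value: float) -> None:
        record = json.dumps({"t": timestamp, "v": value}).encode()
        ciphertext = self.fernet.encrypt(record)
        # Each tag covers the previous tag, so removing or reordering
        # chunks breaks the chain and is caught on read-back.
        tag = hmac.new(self.mac_key, self.prev_tag + ciphertext,
                       hashlib.sha256).digest()
        self.chunks.append((ciphertext, tag))
        self.prev_tag = tag

    def read_all(self) -> list:
        prev, out = b"genesis", []
        for ciphertext, tag in self.chunks:
            expect = hmac.new(self.mac_key, prev + ciphertext,
                              hashlib.sha256).digest()
            if not hmac.compare_digest(tag, expect):
                raise ValueError("integrity check failed")
            out.append(json.loads(self.fernet.decrypt(ciphertext)))
            prev = tag
        return out

store = TinyTimeSeriesStore(Fernet.generate_key(), b"demo-mac-key")
store.append(time.time(), 42.0)
print(store.read_all())
```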

    Automated tools and techniques for distributed Grid Software: Development of the testbed infrastructure

    Grid technology is becoming more and more important as the new paradigm for sharing computational resources across different organizations in a secure way. The great power of this solution requires the definition of a generic stack of services and protocols, and this is the scope of the different Grid initiatives. As a result of international collaborations for its development, the Open Grid Forum created the Open Grid Services Architecture (OGSA), which aims to define the common set of services that will enable interoperability across the different implementations. This master thesis has been developed in this framework, as part of the two European-funded projects ETICS and OMII-Europe. The main objective is to contribute to the design and maintenance of large distributed development projects with automated tools that enable the implementation of Software Engineering techniques aimed at achieving an acceptable level of quality in the release process. Specifically, this thesis develops the testbed concept as a virtual production-like scenario in which to perform compliance tests. As a proof of concept, the OGSA Basic Execution Service has been chosen in order to implement and execute conformance tests within the ETICS automated testbed framework.
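
    As a hypothetical illustration of how such a testbed can drive conformance testing, the sketch below runs named checks against a service endpoint and prints a pass/fail summary. The endpoint URL and the single check are placeholders; this is not the ETICS framework or the OGSA Basic Execution Service test suite.

```python
# Minimal conformance-test harness: each check probes one required
# behaviour of an endpoint; the harness reports a pass/fail summary.
# The URL and checks below are illustrative placeholders.
import urllib.request
from typing import Callable

def check_endpoint_reachable(url: str) -> bool:
    try:
        urllib.request.urlopen(url, timeout=5)
        return True
    except OSError:
        return False

def run_conformance(url: str, checks: dict[str, Callable[[str], bool]]) -> bool:
    results = {name: fn(url) for name, fn in checks.items()}
    for name, ok in results.items():
        print(f"[{'PASS' if ok else 'FAIL'}] {name}")
    return all(results.values())

run_conformance("http://bes.example.org/service",  # placeholder endpoint
                {"endpoint reachable": check_endpoint_reachable})
```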

    Konzepte für Datensicherheit und Datenschutz in mobilen Anwendungen

    Smart devices, and smartphones in particular, play an increasingly important role in our lives. Thanks to continuously growing battery life, these devices can be carried and used almost without interruption. In addition, ever-cheaper mobile data plans and rising data rates give users a permanent connection to the Internet. Smart devices are thus no longer pure communication tools but information sources as well. Moreover, a multitude of third-party applications is available for these devices. Thanks to their built-in sensors, they can run location-based applications, health applications, or Industry 4.0 applications, to name just a few. Such applications, however, present not only great benefits but also immense risks: the sensors can capture a wide variety of context data and support fairly precise inferences about the user. Particular attention should therefore be paid to data security, and especially to data privacy, on these devices. Looking at the existing data security and privacy components of today's dominant mobile platforms, however, none of the platforms satisfactorily meets the special requirements of a mobile data security and privacy system. For this reason, this work focuses on the design and implementation of novel data security and privacy concepts for mobile applications. To this end, the following five research contributions are made: [RC1] Existing data security and privacy concepts are analyzed to identify their weaknesses. [RC2] A context-sensitive permission model is created. [RC3] The permission model is conceptually embedded in a flexible privacy system and subsequently implemented. [RC4] The privacy system is extended into a holistic security system. [RC5] The resulting holistic security system is evaluated. To achieve these research goals, the Privacy Policy Model (PPM) is introduced, an entirely new model for formulating fine-grained permission rules that allow users to deactivate individual functional units of an application as needed, thereby restricting the application's access rights. In addition, users can reduce the accuracy of the data made available to an application. The PPM is implemented in the Privacy Policy Platform (PMP), a permission system that not only enforces privacy policies but also fulfills several of the protection goals of data security. Several implementation strategies for the PMP are discussed and their advantages and disadvantages weighed against each other. To guarantee data security in addition to data privacy, the PMP is extended with the Secure Data Container (SDC), with which sensitive data can be stored securely and exchanged between applications. The applicability of the PMP and the SDC is demonstrated with practical examples from four different domains (location-based applications, health applications, Industry 4.0 applications, and applications for the Internet of Things). This analysis shows that the combination of PMP and SDC not only fulfills all protection goals relevant to this work, which follow the ISO/IEC 27000:2009 standard, but is also highly performant: using the PMP and the SDC, the battery consumption of applications can be halved.
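
    To give a flavor of such fine-grained, context-sensitive rules, the sketch below grants access to a resource only while a context predicate holds and can degrade data accuracy instead of denying access outright. All names are illustrative assumptions, not the PMP's actual API.

```python
# Illustrative context-sensitive permission rule in the spirit of the
# PPM: grant while a condition holds, optionally degrading accuracy.
# Every name here is an assumption made for exposition.
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class PermissionRule:
    resource: str                        # e.g., "fine_location"
    condition: Callable[[dict], bool]    # context predicate
    degrade: Optional[Callable] = None   # optional accuracy reduction

def resolve(rules, resource, context, value):
    for rule in rules:
        if rule.resource == resource and rule.condition(context):
            return rule.degrade(value) if rule.degrade else value
    return None  # no matching rule: access denied

# Example: an app may read location only during working hours, and even
# then only rounded to roughly 1 km instead of exact GPS coordinates.
rules = [PermissionRule(
    "fine_location",
    condition=lambda ctx: 9 <= ctx["hour"] < 17,
    degrade=lambda loc: tuple(round(c, 2) for c in loc),
)]
print(resolve(rules, "fine_location", {"hour": 10}, (48.77558, 9.17702)))  # degraded fix
print(resolve(rules, "fine_location", {"hour": 22}, (48.77558, 9.17702)))  # None
```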