579 research outputs found

    A modern approach for Threat Modelling in agile environments: redesigning the process in a SaaS company

    Get PDF
    Dealing with security aspects has become one of the priorities for companies operating in every sector. In the software industry building security requires being proactive and preventive by incorporating requirements right from the ideation and design of the product. Threat modelling has been consistently proven as one of the most effective and rewarding security activities in doing that, being able to uncover threats and vulnerabilities before they are even introduced into the codebase. Numerous approaches to conduct such exercise have been proposed over time, however, most of them can not be adopted in intricate corporate environments with multiple development teams. This is clear by analysing the case of Company Z, which introduced a well-documented process in 2019 but scalability, governance and knowledge issues blocked a widespread adoption. The main goal of the Thesis was to overcome these problems by designing a novel threat modelling approach, able to fit the company’s Agile environment and capable of closing the current gaps. As a result, a complete description of the redefined workflow and a structured set of suggestions was proposed. The solution is flexible enough to be adopted in multiple different contexts while meeting the requirements of Company Z. Achieving this result was possible only by analysing the industry’s best practices and solutions, understanding the current process, identifying the pain points, and gathering feedback from stakeholders. The solution proposed includes, alongside the new threat modelling process, a comprehensive method for evaluating and verifying the effectiveness of the proposed solution

    Provenance in Collaborative Data Sharing

    Get PDF
    This dissertation focuses on recording, maintaining and exploiting provenance information in Collaborative Data Sharing Systems (CDSS). These are systems that support data sharing across loosely-coupled, heterogeneous collections of relational databases related by declarative schema mappings. A fundamental challenge in a CDSS is to support the capability of update exchange --- which publishes a participant\u27s updates and then translates others\u27 updates to the participant\u27s local schema and imports them --- while tolerating disagreement between them and recording the provenance of exchanged data, i.e., information about the sources and mappings involved in their propagation. This provenance information can be useful during update exchange, e.g., to evaluate provenance-based trust policies. It can also be exploited after update exchange, to answer a variety of user queries, about the quality, uncertainty or authority of the data, for applications such as trust assessment, ranking for keyword search over databases, or query answering in probabilistic databases. To address these challenges, in this dissertation we develop a novel model of provenance graphs that is informative enough to satisfy the needs of CDSS users and captures the semantics of query answering on various forms of annotated relations. We extend techniques from data integration, data exchange, incremental view maintenance and view update to define the formal semantics of unidirectional and bidirectional update exchange. We develop algorithms to perform update exchange incrementally while maintaining provenance information. We present strategies for implementing our techniques over an RDBMS and experimentally demonstrate their viability in the Orchestra prototype system. We define ProQL, a query language for provenance graphs that can be used by CDSS users to combine data querying with provenance testing as well as to compute annotations for their data, based on their provenance, that are useful for a variety of applications. Finally, we develop a prototype implementation ProQL over an RDBMS and indexing techniques to speed up provenance querying, evaluate experimentally the performance of provenance querying and the benefits of our indexing techniques

    Doctor of Philosophy

    Get PDF
    dissertationAs the base of the software stack, system-level software is expected to provide ecient and scalable storage, communication, security and resource management functionalities. However, there are many computationally expensive functionalities at the system level, such as encryption, packet inspection, and error correction. All of these require substantial computing power. What's more, today's application workloads have entered gigabyte and terabyte scales, which demand even more computing power. To solve the rapidly increased computing power demand at the system level, this dissertation proposes using parallel graphics pro- cessing units (GPUs) in system software. GPUs excel at parallel computing, and also have a much faster development trend in parallel performance than central processing units (CPUs). However, system-level software has been originally designed to be latency-oriented. GPUs are designed for long-running computation and large-scale data processing, which are throughput-oriented. Such mismatch makes it dicult to t the system-level software with the GPUs. This dissertation presents generic principles of system-level GPU computing developed during the process of creating our two general frameworks for integrating GPU computing in storage and network packet processing. The principles are generic design techniques and abstractions to deal with common system-level GPU computing challenges. Those principles have been evaluated in concrete cases including storage and network packet processing applications that have been augmented with GPU computing. The signicant performance improvement found in the evaluation shows the eectiveness and eciency of the proposed techniques and abstractions. This dissertation also presents a literature survey of the relatively young system-level GPU computing area, to introduce the state of the art in both applications and techniques, and also their future potentials

    Quality of Service Aware Data Stream Processing for Highly Dynamic and Scalable Applications

    Get PDF
    Huge amounts of georeferenced data streams are arriving daily to data stream management systems that are deployed for serving highly scalable and dynamic applications. There are innumerable ways at which those loads can be exploited to gain deep insights in various domains. Decision makers require an interactive visualization of such data in the form of maps and dashboards for decision making and strategic planning. Data streams normally exhibit fluctuation and oscillation in arrival rates and skewness. Those are the two predominant factors that greatly impact the overall quality of service. This requires data stream management systems to be attuned to those factors in addition to the spatial shape of the data that may exaggerate the negative impact of those factors. Current systems do not natively support services with quality guarantees for dynamic scenarios, leaving the handling of those logistics to the user which is challenging and cumbersome. Three workloads are predominant for any data stream, batch processing, scalable storage and stream processing. In this thesis, we have designed a quality of service aware system, SpatialDSMS, that constitutes several subsystems that are covering those loads and any mixed load that results from intermixing them. Most importantly, we natively have incorporated quality of service optimizations for processing avalanches of geo-referenced data streams in highly dynamic application scenarios. This has been achieved transparently on top of the codebases of emerging de facto standard best-in-class representatives, thus relieving the overburdened shoulders of the users in the presentation layer from having to reason about those services. Instead, users express their queries with quality goals and our system optimizers compiles that down into query plans with an embedded quality guarantee and leaves logistic handling to the underlying layers. We have developed standard compliant prototypes for all the subsystems that constitutes SpatialDSMS

    Parallel genetic algorithms in the cloud

    Get PDF
    2015 - 2016Genetic Algorithms (GAs) are a metaheuristic search technique belonging to the class of Evolutionary Algorithms (EAs). They have been proven to be effective in addressing several problems in many fields but also suffer from scalability issues that may not let them find a valid application for real world problems. Thus, the aim of providing highly scalable GA-based solutions, together with the reduced costs of parallel architectures, motivate the research on Parallel Genetic Algorithms (PGAs). Cloud computing may be a valid option for parallelisation, since there is no need of owning the physical hardware, which can be purchased from cloud providers, for the desired time, quantity and quality. There are different employable cloud technologies and approaches for this purpose, but they all introduce communication overhead. Thus, one might wonder if, and possibly when, specific approaches, environments and models show better performance than sequential versions in terms of execution time and resource usage. This thesis investigates if and when GAs can scale in the cloud using specific approaches. Firstly, Hadoop MapReduce is exploited designing and developinganopensourceframework,i.e.,elephant56, thatreducestheeffortin developing and speed up GAs using three parallel models. The performance of theframeworkisthenevaluatedthroughanempiricalstudy. Secondly, software containers and message queues are employed to develop, deploy and execute PGAs in the cloud and the devised system is evaluated with an empirical study on a commercial cloud provider. Finally, cloud technologies are also exploredfortheparallelisationofotherEAs,designinganddevelopingcCube,a collaborativemicroservicesarchitectureformachinelearningproblems. [edited by author]I Genetic Algorithms (GAs) sono una metaeuristica di ricerca appartenenti alla classe degli Evolutionary Algorithms (EAs). Si sono dimostrati efficaci nel risolvere tanti problemi in svariati campi. Tuttavia, le difficoltà nello scalare spesso evitano che i GAs possano trovare una collocazione efficace per la risoluzione di problemi del mondo reale. Quindi, l’obiettivo di fornire soluzioni basate altamente scalabili, assieme alla riduzione dei costi di architetture parallele, motivano la ricerca sui Parallel Genetic Algorithms (PGAs). Il cloud computing potrebbe essere una valida opzione per la parallelizzazione, dato che non c’è necessità di possedere hardware fisico che può, invece, essere acquistato dai cloud provider, per il tempo desiderato, quantità e qualità. Esistono differenti tecnologie e approcci cloud impiegabili a tal proposito ma, tutti, introducono overhead di computazione. Quindi, ci si può chiedere se, e possibilmente quando, approcci specifici, ambienti e modelli mostrino migliori performance rispetto alle versioni sequenziali, in termini di tempo di esecuzione e uso di risorse. Questa tesi indaga se, e quando, i GAs possono scalare nel cloud utilizzando approcci specifici. Prima di tutto, Hadoop MapReduce è sfruttato per modellare e sviluppare un framework open source, i.e., elephant56, che riduce l’effort nello sviluppo e velocizza i GAs usando tre diversi modelli paralleli. Le performance del framework sono poi valutate attraverso uno studio empirico. Successivamente, i software container e le message queue sono impiegati per sviluppare, distribuire e eseguire PGAs e il sistema ideato valutato, attraverso uno studio empirico, su un cloud provider commerciale. Infine, le tecnologie cloud sono esplorate per la parallelizzazione di altri EAs, ideando e sviluppando cCube, un’architettura a microservizi collaborativa per risolvere problemi di machine learning. [a cura dell'autore]XV n.s

    Application-aware optimization of Artificial Intelligence for deployment on resource constrained devices

    Get PDF
    Artificial intelligence (AI) is changing people's everyday life. AI techniques such as Deep Neural Networks (DNN) rely on heavy computational models, which are in principle designed to be executed on powerful HW platforms, such as desktop or server environments. However, the increasing need to apply such solutions in people's everyday life has encouraged the research for methods to allow their deployment on embedded, portable and stand-alone devices, such as mobile phones, which exhibit relatively low memory and computational resources. Such methods targets both the development of lightweight AI algorithms and their acceleration through dedicated HW. This thesis focuses on the development of lightweight AI solutions, with attention to deep neural networks, to facilitate their deployment on resource constrained devices. Focusing on the computer vision field, we show how putting together the self learning ability of deep neural networks with application-specific knowledge, in the form of feature engineering, it is possible to dramatically reduce the total memory and computational burden, thus allowing the deployment on edge devices. The proposed approach aims to be complementary to already existing application-independent network compression solutions. In this work three main DNN optimization goals have been considered: increasing speed and accuracy, allowing training at the edge, and allowing execution on a microcontroller. For each of these we deployed the resulting algorithm to the target embedded device and measured its performance
    corecore