
    Data Management in Microservices: State of the Practice, Challenges, and Research Directions

    The industry is increasingly adopting microservice architectures to achieve scalability through functional decomposition, fault tolerance through the deployment of small and independent services, and polyglot persistence through the adoption of different database technologies specific to the needs of each service. Despite the accelerating industrial adoption and the extensive research on microservices, there is a lack of thorough investigation of the state of the practice and the major challenges faced by practitioners with regard to data management. To bridge this gap, this paper presents a detailed investigation of data management in microservices. Our exploratory study is based on the following methodology: we conducted a systematic literature review of articles reporting the adoption of microservices in industry, where more than 300 articles were filtered down to 11 representative studies; we analyzed a set of 9 popular open-source microservice-based applications, selected out of more than 20 open-source projects; furthermore, to strengthen our evidence, we conducted an online survey that we then used to cross-validate the findings of the previous steps with the perceptions and experiences of over 120 practitioners and researchers. Through this process, we were able to categorize the state of practice and reveal several principled challenges that cannot be solved by software engineering practices alone but rather need system-level support to alleviate the burden on practitioners. Based on these observations, we also identified a series of research directions to achieve this goal. Fundamentally, novel database systems and data management tools must be built that support isolation for microservices, including fault isolation, performance isolation, data ownership, and independent schema evolution across microservices, in order to address the needs of this growing architectural style.
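    To make the polyglot-persistence and data-ownership ideas above concrete, here is a minimal, hypothetical sketch (not taken from the paper): two toy services each own their own store, chosen to fit their workload, and other services can reach that data only through the owning service's API. The service names, stores, and schemas are illustrative assumptions.

```python
# Hypothetical sketch: each microservice owns its data store and schema.
import json
import pathlib
import sqlite3


class OrderService:
    """Transactional workload -> a relational store owned only by this service."""

    def __init__(self, db_path="orders.db"):
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS orders (id INTEGER PRIMARY KEY, item TEXT, qty INTEGER)"
        )

    def place_order(self, item, qty):
        # Local transaction only; there are no cross-service transactions.
        with self.conn:
            self.conn.execute("INSERT INTO orders (item, qty) VALUES (?, ?)", (item, qty))


class CatalogService:
    """Flexible, read-heavy data -> a document store owned only by this service."""

    def __init__(self, path="catalog.json"):
        self.path = pathlib.Path(path)
        if not self.path.exists():
            self.path.write_text("{}")

    def upsert_product(self, name, attributes):
        catalog = json.loads(self.path.read_text())
        catalog[name] = attributes  # this service's schema can evolve independently
        self.path.write_text(json.dumps(catalog))


orders, catalog = OrderService(), CatalogService()
catalog.upsert_product("widget", {"color": "red", "price": 9.99})
orders.place_order("widget", 3)
```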

    Repurposing Software Defenses with Specialized Hardware

    Computer security has largely been the domain of software for the last few decades. Although this approach has been moderately successful during this period, its problems have started becoming more apparent recently because of one primary reason: performance. Software solutions typically exact a significant toll in terms of program slowdown, especially when applied to large, complex software. In the past, when chips became exponentially faster, this growing burden could be accommodated almost for free. But as Moore’s law winds down, security-related slowdowns become more apparent and increasingly intolerable, and the defenses that cause them are subsequently abandoned. As a result, the community has started looking elsewhere for continued protection, as attacks continue to become progressively more sophisticated. One way to mitigate this problem is to complement these defenses with hardware. Despite lacking the semantic perspective of high-level software, specialized hardware is typically not only faster but also more energy-efficient. However, hardware vendors also have to factor in the cost of integrating security solutions from the perspective of effectiveness, longevity, and cost of development, while allaying customers’ concerns about performance. As a result, although numerous hardware solutions have been proposed in the past, the fact that so few of them have actually transitioned into practice implies that they were unable to strike an optimal balance among the above qualities. This dissertation proposes the thesis that it is possible to add hardware features that complement and improve program security, traditionally provided by software, without requiring extensive modifications to existing hardware microarchitecture. As such, it marries the collective concerns not only of users and software developers, who demand performant yet secure products, but also of hardware vendors, since implementation simplicity directly relates to a reduction in the time and cost of development and deployment. To support this thesis, the dissertation discusses two hardware security features aimed at securing program code and data separately, details their full-system implementations, and presents a study of a negative result in which the design was deemed practically infeasible given its high implementation complexity. Firstly, the dissertation discusses code protection by reviving instruction set randomization (ISR), an idea originally proposed for countering code injection and considered impractical in the face of modern attack vectors that reuse existing program code (also known as code reuse attacks). With Polyglot, we introduce ISR with strong AES encryption along with basic code randomization that disallows code decryption at runtime, thus countering most forms of state-of-the-art dynamic code reuse attacks, which read the code at runtime prior to building the code reuse payload. Through various optimizations and corner-case workarounds, we show how Polyglot enables code execution with minimal hardware changes while maintaining a small attack surface and incurring nominal overheads even when the code is strongly encrypted in the binary and in memory. Next, the dissertation presents REST, a hardware primitive that allows programs to mark memory regions invalid for regular memory accesses. This is achieved simply by storing a large, predetermined random value at those locations with a special store instruction and then detecting, at the data cache, incoming values that match the predetermined value.
Subsequently, we show how this primitive can be used to protect data from common forms of spatial and temporal memory safety attacks. Notably, because of the simplicity of the primitive, REST requires only trivial microarchitectural modifications, is therefore easy to implement, and exhibits negligible performance overheads. Additionally, we demonstrate how it is able to provide practical heap safety even for legacy binaries. For the above proposals, we also detail their hardware implementations on FPGAs and discuss how each fits within a complete multiprocess system. This serves to give the reader an idea of usage and deployment challenges on a broader scale, beyond just the technique's effectiveness within the context of a single program. Lastly, the dissertation discusses an alternative to the virtual address space that randomizes the sequence of addresses in a manner invisible even to the program, thus achieving transparent randomization of the entire address space at a very fine granularity. The biggest challenge is to achieve this with minimal microarchitectural changes while accommodating linear data structures in the program (e.g., arrays, structs), both of which are fundamentally based on a linear address space. Such a modified address space subsumes the benefits of most other spatial randomization schemes, with the additional benefit of ideally making traversal from one data structure to another impossible. Our study of this idea concludes that, although valuable, it is outmatched by current memory safety techniques, which are cheaper to implement and secure enough that there are no perceivable use cases for this model of address space safety.
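    The REST primitive described above lends itself to a small conceptual illustration. The following Python sketch only simulates the idea in software (the actual proposal is a hardware primitive with a special store instruction and data-cache matching); the memory model, token size, and function names are assumptions made for illustration.

```python
# Software-only simulation of the REST idea (the real design is a hardware
# primitive): a large predetermined random token marks memory as invalid,
# and any regular access that observes the token is reported.
import secrets

TOKEN = secrets.token_bytes(64)   # predetermined random value ("redzone" marker)
MEMORY = bytearray(4096)          # toy flat memory

def arm_redzone(addr):
    """Analogue of the special store: mark an aligned 64-byte region as invalid."""
    MEMORY[addr:addr + 64] = TOKEN

def checked_load(addr):
    """Analogue of a regular load with cache-side matching: trap on the token."""
    base = addr & ~63  # 64-byte-aligned region containing the access
    if bytes(MEMORY[base:base + 64]) == TOKEN:
        raise MemoryError(f"access to invalid (redzoned) address {addr:#x}")
    return MEMORY[addr]

# Guard a 128-byte "allocation" at offset 256 with redzones on both sides.
arm_redzone(192)   # redzone just before the allocation
arm_redzone(384)   # redzone just after the allocation

checked_load(300)  # in-bounds access succeeds
try:
    checked_load(390)  # an overflowing access lands in the redzone
except MemoryError as err:
    print("detected:", err)
```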

    SoK: Security of Microservice Applications: A Practitioners' Perspective on Challenges and Best Practices

    Cloud-based application deployment is becoming increasingly popular among businesses, thanks to the emergence of microservices. However, securing such architectures is a challenging task, since traditional security concepts cannot be directly applied to microservice architectures due to their distributed nature. The situation is exacerbated by the scattered nature of the guidelines and best practices advocated by practitioners and organizations in this field. In this research paper, we aim to shed light on the current microservice security discussions hidden within Grey Literature (GL) sources. In particular, we identify the challenges that arise when securing microservice architectures, as well as the solutions recommended by practitioners to address these issues. For this, we conducted a systematic GL study on the challenges and best practices of microservice security available on the Internet, with the goal of capturing relevant discussions in blogs, white papers, and standards. We collected 312 GL sources, of which 57 were rigorously classified and analyzed. This analysis on the one hand validated past academic literature studies in the area of microservice security, and on the other hand identified improvements to existing methodologies, pointing towards future research directions. Comment: Accepted at the 17th International Conference on Availability, Reliability and Security (ARES 2022).

    Data transformation as a means towards dynamic data storage and polyglot persistence

    Legacy applications have been built around the concept of storing their data in one relational data store. However, with the current differentiation in data store technologies as a consequence of the NoSQL paradigm, new and possibly more performant storage solutions are available to all applications. The concept of dynamic storage ensures that application data are always stored in the data store that is optimal at a given time, in order to increase application performance. Additionally, polyglot persistence aims to push this performance even further by storing each different data type of an application in the data store technology best suited for it. To bring legacy applications to dynamic storage and polyglot persistence, schema and data transformations between data store technologies are needed. This usually also implies application redesigns to support the new data stores. This paper proposes such a transformation approach through a canonical model. It is based on the Lambda architecture to ensure that no application downtime is needed during the transformation process; after the transformation, the application can continue to query in the original query language, thus requiring no application code changes.
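    As a rough illustration of transforming relational data through an intermediate representation into a document-shaped store, consider the sketch below. The canonical form used here (plain dictionaries tagged with an entity name) and the SQLite/JSON stand-ins are assumptions for illustration and do not reproduce the paper's canonical model or its Lambda-architecture pipeline.

```python
# Illustrative sketch: relational rows -> canonical entities -> document store.
import json
import sqlite3

def to_canonical(conn, table):
    """Lift relational rows into a neutral canonical form (assumed structure)."""
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [desc[0] for desc in cursor.description]
    return [{"entity": table, "attributes": dict(zip(columns, row))}
            for row in cursor.fetchall()]

def to_document_store(entities, path):
    """Materialize canonical entities into a document-shaped target (one JSON file)."""
    documents = [entity["attributes"] for entity in entities]
    with open(path, "w") as handle:
        json.dump(documents, handle, indent=2)

# Toy legacy source: a single relational table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, city TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)",
                 [(1, "Ada", "Ghent"), (2, "Grace", "Leuven")])

to_document_store(to_canonical(conn, "customers"), "customers.json")
```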

    Icarus: Towards a Multistore Database System

    Recent years have seen a vast diversification of the database market. In contrast to the "one-size-fits-all" paradigm according to which systems were designed in the past, today's database management systems (DBMSs) are tuned for particular workloads. This has led to DBMSs optimized for high-performance, high-throughput read/write workloads in online transaction processing (OLTP) and to systems optimized for complex analytical queries (OLAP). However, this approach reaches a limit when systems have to deal with mixed workloads that are neither pure OLAP nor pure OLTP. In such cases, polystores are increasingly gaining popularity. Rather than supporting one single database paradigm and addressing one particular workload, polystores encompass several DBMSs that store data in different schemas and allow requests to be routed, at a per-query level, to the most appropriate system. In this paper, we introduce the polystore Icarus. In our evaluation, based on a workload that combines OLTP and OLAP elements, we show that Icarus is able to speed up queries by up to a factor of 3 by properly routing queries to the best underlying DBMS.
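    The per-query routing idea can be illustrated with a small sketch. The routing heuristic below (a syntactic check for aggregation keywords) and the engine names are invented for illustration and are not Icarus's actual routing logic.

```python
# Illustrative per-query routing; the heuristic and engines are made up.
import re

class Engine:
    """Stand-in for an underlying DBMS."""
    def __init__(self, name):
        self.name = name

    def execute(self, sql):
        print(f"[{self.name}] {sql}")

class Polystore:
    def __init__(self, oltp_engine, olap_engine):
        self.oltp, self.olap = oltp_engine, olap_engine

    def route(self, sql):
        # Crude syntactic heuristic: aggregations go to the OLAP engine.
        analytical = re.search(r"\b(GROUP BY|AVG|SUM|COUNT)\b", sql, re.IGNORECASE)
        return self.olap if analytical else self.oltp

    def execute(self, sql):
        return self.route(sql).execute(sql)

store = Polystore(Engine("row store (OLTP)"), Engine("column store (OLAP)"))
store.execute("INSERT INTO orders VALUES (42, 'widget', 3)")       # routed to OLTP
store.execute("SELECT item, SUM(qty) FROM orders GROUP BY item")   # routed to OLAP
```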

    A model-based design flow for embedded vision applications on heterogeneous architectures

    The ability to gather information from images is straightforward for humans and is one of the principal inputs for understanding the external world. Computer vision (CV) is the process of extracting such knowledge from the visual domain in an algorithmic fashion. The computational power required to process this information is very high. Until recently, the only feasible way to meet non-functional requirements such as performance was to develop custom hardware, which is costly, time-consuming, and cannot be reused for general purposes. The recent introduction of low-power and low-cost heterogeneous embedded boards, in which CPUs are combined with heterogeneous accelerators such as GPUs, DSPs, and FPGAs, makes it possible to combine the hardware efficiency needed for non-functional requirements with the flexibility of software development. Embedded vision is the term used to identify the application of the aforementioned CV algorithms in the embedded field, which usually requires satisfying not only functional requirements but also non-functional requirements such as real-time performance, power, and energy efficiency. Rapid prototyping, early algorithm parametrization, testing, and validation of complex embedded video applications for such heterogeneous architectures is a very challenging task. This thesis presents a comprehensive framework that: 1) Is based on a model-based paradigm. Differently from standard state-of-the-art approaches, which require designers to manually implement the algorithm in some programming language, the proposed approach allows for rapid prototyping, algorithm validation, and parametrization in a model-based design environment (i.e., Matlab/Simulink). The framework relies on a multi-level design and verification flow by which the high-level model is semi-automatically refined towards the final automatic synthesis onto the target hardware device. 2) Relies on a polyglot parallel programming model. The proposed model combines different programming languages and environments, such as C/C++, OpenMP, PThreads, OpenVX, OpenCV, and CUDA, to best exploit different levels of parallelism while guaranteeing semi-automatic customization. 3) Optimizes the application performance and energy efficiency through a novel algorithm for mapping and scheduling the application tasks on the heterogeneous computing elements of the device. This algorithm, called exclusive earliest finish time (XEFT), takes into consideration the possible multiple implementations of tasks for different computing elements (e.g., a task primitive for the CPU and an equivalent parallel implementation for the GPU). It introduces and takes advantage of the notion of exclusive overlap between primitives to improve load balancing. This thesis is the result of three years of research activity, during which all the incremental steps made to compose the framework have been tested on real case studies.
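    A much-simplified sketch of earliest-finish-time scheduling over heterogeneous units, loosely inspired by the XEFT idea summarized above, is shown below; the task names, cost estimates, and the omission of the exclusive-overlap refinement (and of task dependencies) are all simplifying assumptions.

```python
# Simplified earliest-finish-time scheduling over heterogeneous units;
# task names, costs, and the omission of dependencies are assumptions.
tasks = {                       # per-task cost estimates on each unit (ms)
    "grayscale": {"CPU": 4.0, "GPU": 1.5},
    "blur":      {"CPU": 9.0, "GPU": 2.0},
    "features":  {"CPU": 6.0},  # no GPU implementation available for this task
}

ready_at = {"CPU": 0.0, "GPU": 0.0}   # time at which each unit becomes free
schedule = []

for task, costs in tasks.items():
    # Among the available implementations, pick the unit with the earliest finish time.
    unit = min(costs, key=lambda u: ready_at[u] + costs[u])
    start = ready_at[unit]
    ready_at[unit] = start + costs[unit]
    schedule.append((task, unit, start, ready_at[unit]))

for task, unit, start, end in schedule:
    print(f"{task:10s} on {unit}: {start:5.1f} -> {end:5.1f} ms")
```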

    Towards Next Generation Business Process Model Repositories – A Technical Perspective on Loading and Processing of Process Models

    Business process management repositories manage large collections of process models, numbering in the thousands. Additionally, they provide management functions for process models, such as mining, querying, merging, and variants management. However, most current business process management repositories are built on top of relational database management systems (RDBMSs), although this leads to performance issues. These issues result from the relational algebra, the mismatch between relational tables and object-oriented programming (the impedance mismatch), as well as the technological developments of the last 30 years, such as more and cheaper disk and memory space, clusters, and clouds. The goal of this paper is to present current paradigms for overcoming the performance problems inherent in RDBMSs. To this end, we fuse research on data modeling and database technologies with research on algorithm design and parallelization for the technology paradigms available today. Based on these research streams, we show how the performance of business process management repositories can be improved, both in terms of loading process models (e.g., from disk) and in terms of computing management techniques, resulting in even faster application of such techniques. Applications of the compiled paradigms are presented as examples to show their applicability.
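    As a loose illustration of two of the paradigms alluded to above, the sketch below stores each process model as a single document (so it loads without joins or object-relational mapping) and applies a toy management function to the collection in parallel; the file layout, model format, and function names are assumptions, not the paper's design.

```python
# Illustrative sketch: one document per process model, processed in parallel.
import json
from concurrent.futures import ProcessPoolExecutor

def load_model(path):
    """Load a whole process model in one read, with no object-relational mapping."""
    with open(path) as handle:
        return json.load(handle)

def count_activities(model):
    """A toy 'management function' applied to a single process model."""
    return model["name"], sum(1 for node in model["nodes"] if node["type"] == "activity")

def analyze(paths):
    models = [load_model(path) for path in paths]
    with ProcessPoolExecutor() as pool:   # apply the function to all models in parallel
        return list(pool.map(count_activities, models))

if __name__ == "__main__":
    # Write two toy model documents so the example is self-contained.
    for name in ("order_to_cash", "claim_handling"):
        with open(f"{name}.json", "w") as handle:
            json.dump({"name": name,
                       "nodes": [{"type": "activity"}, {"type": "gateway"},
                                 {"type": "activity"}]}, handle)
    print(analyze(["order_to_cash.json", "claim_handling.json"]))
```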