352 research outputs found
Secure storage systems for untrusted cloud environments
The cloud has become established for applications that need to be scalable and highly
available. However, moving data to data centers owned and operated by a third party,
i.e., the cloud provider, raises security concerns because a cloud provider could easily
access and manipulate the data or program flow, preventing the cloud from being
used for certain applications, like medical or financial.
Hardware vendors are addressing these concerns by developing Trusted Execution
Environments (TEEs) that make the CPU state and parts of memory inaccessible from
the host software. While TEEs protect the current execution state, they do not provide
security guarantees for data which does not fit nor reside in the protected memory
area, like network and persistent storage.
In this work, we aim to address TEEsâ limitations in three different ways, first we
provide the trust of TEEs to persistent storage, second we extend the trust to multiple
nodes in a network, and third we propose a compiler-based solution for accessing
heterogeneous memory regions. More specifically,
⢠SPEICHER extends the trust provided by TEEs to persistent storage. SPEICHER
implements a key-value interface. Its design is based on LSM data structures, but
extends them to provide confidentiality, integrity, and freshness for the stored
data. Thus, SPEICHER can prove to the client that the data has not been tampered
with by an attacker.
⢠AVOCADO is a distributed in-memory key-value store (KVS) that extends the
trust that TEEs provide across the network to multiple nodes, allowing KVSs to
scale beyond the boundaries of a single node. On each node, AVOCADO carefully
divides data between trusted memory and untrusted host memory, to maximize
the amount of data that can be stored on each node. AVOCADO leverages the
fact that we can model network attacks as crash-faults to trust other nodes with
a hardened ABD replication protocol.
⢠TOAST is based on the observation that modern high-performance systems
often use several different heterogeneous memory regions that are not easily
distinguishable by the programmer. The number of regions is increased by the
fact that TEEs divide memory into trusted and untrusted regions. TOAST is a
compiler-based approach to unify access to different heterogeneous memory
regions and provides programmability and portability. TOAST uses a
load/store interface to abstract most library interfaces for different memory
regions
Recommended from our members
Complaint Driven Training Data Debugging for Machine Learning Workflows
As the need for machine learning (ML) increases rapidly across all industry sectors, so has theinterest in building ML platforms that manage and automate parts of the ML life-cycle. This has enabled companies to use ML inference as a part of their downstream analytics or their applications. Unfortunately, debugging unexpected outcomes in the result of these ML workflows remains a necessary but difficult task of the ML life-cycle. The challenge of debugging ML workflows is that it requires reasoning about the correctness of the workflow logic, the datasets used for inference and training, the models, and interactions between them. Even if the workflow logic is correct, errors in the data used across the ML workflow can still lead to wrong outcomes. In short, developers are not just debugging the code, but also the data.
We advocate in favor of a complaint driven approach towards specifying and debugging data errors in ML workflows. The approach takes as input user specified complaints specified as constraints over the final or intermediate outputs of workflows that use trained ML models. The approach outputs explanations in the form of specific operator(s) or data subsets, and how they may be changed to address the constraint violations.
In this thesis we make the first steps towards our complaint driven approach to data debugging. As a stepping stone, we focus our attention on complaints specified on top of relational workflows that use ML model inference and whose errors are caused by errors in ML modelâs training data. To the best of our knowledge, we contribute the first debugging system for this task, which we call Rain. In response to a user complaint, Rain ranks the ML modelâs training examples based on their ability to address the userâs complaint if they were removed. Our experiments show that users can use Rain to debug training data errors by specifying complaints over aggregations of model predictions without having to specify the correct label for each individual prediction.
Unfortunately, Rainâs latency may be prohibitive for use in interactive applications like analytical dashboards or business intelligence tools where users are likely to observe errors and complain. To address Rainâs latency problem when scaling to large ML models and training sets, we propose Rain++. Rain++ pushes the majority of Rainâs computation offline ahead of user interaction, achieving orders of magnitude online latency improvements compared to Rain.
To go beyond Rainâs and Rain++âs approach that evaluates individual training example deletionsindependently we propose MetaRain, a framework for training classifiers that detect training data corruptions in response to user complaints. Thanks to the generality of MetaRain, users can adapt the classifiers chosen to the training corruptions and the complaints they seek to resolve. Our experiments indicate that making use of this ability results in improved debugging outcomes.
Last but not least, we study the problem of updating relational workflow results in response tochanges to the inference ML model used. This can be leveraged by current or future complaint driven debugging systems that repeatedly change the model and reevaluate the relational workflow. We propose FaDE, a compiler that generates efficient code for the workflow update problem by casting it as view maintenance under input tuple deletions. Our experiments indicate that the code generated by FaDE has orders of magnitude lower latency than existing view maintenance systems
Geographic information extraction from texts
A large volume of unstructured texts, containing valuable geographic information, is available online. This information â provided implicitly or explicitly â is useful not only for scientific studies (e.g., spatial humanities) but also for many practical applications (e.g., geographic information retrieval). Although large progress has been achieved in geographic information extraction from texts, there are still unsolved challenges and issues, ranging from methods, systems, and data, to applications and privacy. Therefore, this workshop will provide a timely opportunity to discuss the recent advances, new ideas, and concepts but also identify research gaps in geographic information extraction
SoK: Distributed Computing in ICN
Information-Centric Networking (ICN), with its data-oriented operation and
generally more powerful forwarding layer, provides an attractive platform for
distributed computing. This paper provides a systematic overview and
categorization of different distributed computing approaches in ICN
encompassing fundamental design principles, frameworks and orchestration,
protocols, enablers, and applications. We discuss current pain points in legacy
distributed computing, attractive ICN features, and how different systems use
them. This paper also provides a discussion of potential future work for
distributed computing in ICN.Comment: 10 pages, 3 figures, 1 table. Accepted by ACM ICN 202
Fundamentals
Volume 1 establishes the foundations of this new field. It goes through all the steps from data collection, their summary and clustering, to different aspects of resource-aware learning, i.e., hardware, memory, energy, and communication awareness. Machine learning methods are inspected with respect to resource requirements and how to enhance scalability on diverse computing architectures ranging from embedded systems to large computing clusters
Scalable and fault-tolerant data stream processing on multi-core architectures
With increasing data volumes and velocity, many applications are shifting from the classical âprocess-after-storeâ paradigm to a stream processing model: data is produced and consumed as continuous streams. Stream processing captures latency-sensitive applications as diverse as credit card fraud detection and high-frequency trading. These applications are expressed as queries of algebraic operations (e.g., aggregation) over the most recent data using windows, i.e., finite evolving views over the input streams. To guarantee correct results, streaming applications require precise window semantics (e.g., temporal ordering) for operations that maintain state.
While high processing throughput and low latency are performance desiderata for stateful streaming applications, achieving both poses challenges. Computing the state of overlapping windows causes redundant aggregation operations: incremental execution (i.e., reusing previous results) reduces latency but prevents parallelization; at the same time, parallelizing window execution for stateful operations with precise semantics demands ordering guarantees and state access coordination. Finally, streams and state must be recovered to produce consistent and repeatable results in the event of failures.
Given the rise of shared-memory multi-core CPU architectures and high-speed networking, we argue that it is possible to address these challenges in a single node without compromising window semantics, performance, or fault-tolerance. In this thesis, we analyze, design, and implement stream processing engines (SPEs) that achieve high performance on multi-core architectures. To this end, we introduce new approaches for in-memory processing that address the previous challenges: (i) for overlapping windows, we provide a family of window aggregation techniques that enable computation sharing based on the algebraic properties of aggregation functions; (ii) for parallel window execution, we balance parallelism and incremental execution by developing abstractions for both and combining them to a novel design; and (iii) for reliable single-node execution, we enable strong fault-tolerance guarantees without sacrificing performance by reducing the required disk I/O bandwidth using a novel persistence model. We combine the above to implement an SPE that processes hundreds of millions of tuples per second with sub-second latencies. These results reveal the opportunity to reduce resource and maintenance footprint by replacing cluster-based SPEs with single-node deployments.Open Acces
A Survey on the Integration of NAND Flash Storage in the Design of File Systems and the Host Storage Software Stack
With the ever-increasing amount of data generate in the world, estimated to reach over 200 Zettabytes by 2025, pressure on efficient data storage systems is intensifying. The shift from HDD to flash-based SSD provides one of the most fundamental shifts in storage technology, increasing performance capabilities significantly. However, flash storage comes with different characteristics than prior HDD storage technology. Therefore, storage software was unsuitable for leveraging the capabilities of flash storage. As a result, a plethora of storage applications have been design to better integrate with flash storage and align with flash characteristics. In this literature study we evaluate the effect the introduction of flash storage has had on the design of file systems, which providing one of the most essential mechanisms for managing persistent storage. We analyze the mechanisms for effectively managing flash storage, managing overheads of introduced design requirements, and leverage the capabilities of flash storage. Numerous methods have been adopted in file systems, however prominently revolve around similar design decisions, adhering to the flash hardware constrains, and limiting software intervention. Future design of storage software remains prominent with the constant growth in flash-based storage devices and interfaces, providing an increasing possibility to enhance flash integration in the host storage software stack
A Survey on the Integration of NAND Flash Storage in the Design of File Systems and the Host Storage Software Stack
With the ever-increasing amount of data generate in the world, estimated to
reach over 200 Zettabytes by 2025, pressure on efficient data storage systems
is intensifying. The shift from HDD to flash-based SSD provides one of the most
fundamental shifts in storage technology, increasing performance capabilities
significantly. However, flash storage comes with different characteristics than
prior HDD storage technology. Therefore, storage software was unsuitable for
leveraging the capabilities of flash storage. As a result, a plethora of
storage applications have been design to better integrate with flash storage
and align with flash characteristics.
In this literature study we evaluate the effect the introduction of flash
storage has had on the design of file systems, which providing one of the most
essential mechanisms for managing persistent storage. We analyze the mechanisms
for effectively managing flash storage, managing overheads of introduced design
requirements, and leverage the capabilities of flash storage. Numerous methods
have been adopted in file systems, however prominently revolve around similar
design decisions, adhering to the flash hardware constrains, and limiting
software intervention. Future design of storage software remains prominent with
the constant growth in flash-based storage devices and interfaces, providing an
increasing possibility to enhance flash integration in the host storage
software stack
Design and Code Optimization for Systems with Next-generation Racetrack Memories
With the rise of computationally expensive application domains such as machine learning, genomics, and fluids simulation, the quest for performance and energy-efficient computing has gained unprecedented momentum. The significant increase in computing and memory devices in modern systems has resulted in an unsustainable surge in energy consumption, a substantial portion of which is attributed to the memory system. The scaling of conventional memory technologies and their suitability for the next-generation system is also questionable. This has led to the emergence and rise of nonvolatile memory ( NVM ) technologies. Today, in different development stages, several NVM technologies are competing for their rapid access to the market.
Racetrack memory ( RTM ) is one such nonvolatile memory technology that promises SRAM -comparable latency, reduced energy consumption, and unprecedented density compared to other technologies. However, racetrack memory ( RTM ) is sequential in nature, i.e., data in an RTM cell needs to be shifted to an access port before it can be accessed. These shift operations incur performance and energy penalties. An ideal RTM , requiring at most one shift per access, can easily outperform SRAM . However, in the worst-cast shifting scenario, RTM can be an order of magnitude slower than SRAM .
This thesis presents an overview of the RTM device physics, its evolution, strengths and challenges, and its application in the memory subsystem. We develop tools that allow the programmability and modeling of RTM -based systems. For shifts minimization, we propose a set of techniques including optimal, near-optimal, and evolutionary algorithms for efficient scalar and instruction placement in RTMs . For array accesses, we explore schedule and layout transformations that eliminate the longer overhead shifts in RTMs . We present an automatic compilation framework that analyzes static control flow programs and transforms the loop traversal order and memory layout to maximize accesses to consecutive RTM locations and minimize shifts. We develop a simulation framework called RTSim that models various RTM parameters and enables accurate architectural level simulation.
Finally, to demonstrate the RTM potential in non-Von-Neumann in-memory computing paradigms, we exploit its device attributes to implement logic and arithmetic operations. As a concrete use-case, we implement an entire hyperdimensional computing framework in RTM to accelerate the language recognition problem. Our evaluation shows considerable performance and energy improvements compared to conventional Von-Neumann models and state-of-the-art accelerators
- âŚ