
    Secure storage systems for untrusted cloud environments

    The cloud has become established for applications that need to be scalable and highly available. However, moving data to data centers owned and operated by a third party, i.e., the cloud provider, raises security concerns: the cloud provider could easily access or manipulate the data or the program flow, which prevents the cloud from being used for certain applications, such as medical or financial ones. Hardware vendors are addressing these concerns by developing Trusted Execution Environments (TEEs), which make the CPU state and parts of memory inaccessible to the host software. While TEEs protect the current execution state, they do not provide security guarantees for data that does not fit, or does not reside, in the protected memory area, such as network traffic and persistent storage. In this work, we address these limitations of TEEs in three ways: first, we extend the trust of TEEs to persistent storage; second, we extend that trust across the network to multiple nodes; and third, we propose a compiler-based solution for accessing heterogeneous memory regions. More specifically:
    • SPEICHER extends the trust provided by TEEs to persistent storage. SPEICHER implements a key-value interface. Its design is based on LSM data structures but extends them to provide confidentiality, integrity, and freshness for the stored data. Thus, SPEICHER can prove to the client that the data has not been tampered with by an attacker.
    • AVOCADO is a distributed in-memory key-value store (KVS) that extends the trust that TEEs provide across the network to multiple nodes, allowing KVSs to scale beyond the boundaries of a single node. On each node, AVOCADO carefully divides data between trusted memory and untrusted host memory to maximize the amount of data that can be stored on each node. AVOCADO leverages the fact that network attacks can be modeled as crash faults to trust other nodes with a hardened ABD replication protocol.
    • TOAST is based on the observation that modern high-performance systems often use several heterogeneous memory regions that are not easily distinguishable by the programmer. The number of regions is further increased by TEEs dividing memory into trusted and untrusted regions. TOAST is a compiler-based approach that unifies access to the different heterogeneous memory regions and provides programmability and portability. TOAST uses a load/store interface to abstract most library interfaces for the different memory regions.
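
    The freshness guarantee described for SPEICHER is commonly built from a MAC plus version counters kept inside the TEE. The following minimal Python sketch illustrates that idea under assumed names and layout (it is not SPEICHER's actual LSM-based design): every value placed on untrusted storage carries an HMAC over (key, value, version), and the trusted side remembers the latest version per key, so stale or replayed values are rejected.

        import hmac, hashlib

        class FreshKV:
            """Toy key-value store: values live on untrusted storage; only the MAC
            key and per-key version counters stay inside the trusted enclave."""

            def __init__(self, mac_key: bytes):
                self._mac_key = mac_key        # trusted: never leaves the enclave
                self._versions = {}            # trusted: latest version per key
                self._untrusted = {}           # untrusted: key -> (value, version, tag)

            def _tag(self, key: bytes, value: bytes, version: int) -> bytes:
                msg = key + b"|" + value + b"|" + version.to_bytes(8, "big")
                return hmac.new(self._mac_key, msg, hashlib.sha256).digest()

            def put(self, key: bytes, value: bytes) -> None:
                version = self._versions.get(key, 0) + 1
                self._versions[key] = version
                self._untrusted[key] = (value, version, self._tag(key, value, version))

            def get(self, key: bytes) -> bytes:
                value, version, tag = self._untrusted[key]        # attacker-controlled
                if version != self._versions.get(key):            # freshness check
                    raise ValueError("stale or replayed value")
                if not hmac.compare_digest(tag, self._tag(key, value, version)):
                    raise ValueError("integrity check failed")
                return value

        kv = FreshKV(mac_key=b"\x00" * 32)
        kv.put(b"patient:42", b"record-v1")
        kv.put(b"patient:42", b"record-v2")
        assert kv.get(b"patient:42") == b"record-v2"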

    Shufflecake: Plausible Deniability for Multiple Hidden Filesystems on Linux

    We present Shufflecake, a new plausible-deniability design that hides the existence of encrypted data on a storage medium, making it very difficult for an adversary to prove that such data exists. Shufflecake can be considered a "spiritual successor" of tools such as TrueCrypt and VeraCrypt, but vastly improved: it works natively on Linux, it supports any filesystem of choice, and it can manage multiple volumes per device, making deniability of the existence of hidden partitions truly plausible. Compared to ORAM-based solutions, Shufflecake is extremely fast and simpler, but it does not offer native protection against multi-snapshot adversaries. However, we discuss security extensions made possible by its architecture, and we present evidence that these extensions might be enough to thwart more powerful adversaries. We implemented Shufflecake as an in-kernel tool for Linux, adding useful features, and we benchmarked its performance, showing only a minor slowdown compared to a plain encrypted system. We believe Shufflecake is a useful tool for people whose freedom of expression is threatened by repressive authorities or dangerous criminal organizations, in particular whistleblowers, investigative journalists, and human-rights activists in oppressive regimes.
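
    Shufflecake's on-disk format is not given in the abstract; the sketch below only illustrates the general mechanism behind multi-volume plausible deniability, under assumed slot sizes, a toy XOR/HMAC "cipher", and hypothetical helper names: a device carries a fixed number of header slots, unlocking tries the supplied password against every slot, and unused slots are uniformly random, so a wrong password and a non-existent volume are indistinguishable.

        import os, hmac, hashlib, random

        SLOTS = 15                          # fixed number of header slots per device

        def _kdf(password: bytes, salt: bytes) -> bytes:
            return hashlib.scrypt(password, salt=salt, n=2**14, r=8, p=1, dklen=64)

        def _seal(password: bytes, meta: bytes) -> bytes:
            """Toy sealed slot: a scrypt-derived pad XOR-encrypts the 32-byte metadata
            and an HMAC authenticates it. Illustration only, not a real cipher choice."""
            salt = os.urandom(16)
            key = _kdf(password, salt)
            pad, mac_key = key[:32], key[32:]
            body = bytes(m ^ p for m, p in zip(meta.ljust(32, b"\x00"), pad))
            tag = hmac.new(mac_key, body, hashlib.sha256).digest()
            return salt + tag + body                      # 16 + 32 + 32 = 80 bytes

        def _open(password: bytes, slot: bytes) -> bytes | None:
            salt, tag, body = slot[:16], slot[16:48], slot[48:]
            key = _kdf(password, salt)
            pad, mac_key = key[:32], key[32:]
            if not hmac.compare_digest(tag, hmac.new(mac_key, body, hashlib.sha256).digest()):
                return None                               # behaves like a filler slot
            return bytes(b ^ p for b, p in zip(body, pad)).rstrip(b"\x00")

        def make_headers(volumes: dict[bytes, bytes]) -> list[bytes]:
            """Real volumes land in random slots; the rest stay uniformly random."""
            headers = [os.urandom(80) for _ in range(SLOTS)]
            for password, idx in zip(volumes, random.sample(range(SLOTS), len(volumes))):
                headers[idx] = _seal(password, volumes[password])
            return headers

        def unlock(headers: list[bytes], password: bytes) -> bytes | None:
            for slot in headers:                          # try the password on every slot
                meta = _open(password, slot)
                if meta is not None:
                    return meta
            return None   # wrong password or no such volume: the two are indistinguishable

        headers = make_headers({b"decoy-pass": b"decoy volume", b"hidden-pass": b"hidden volume"})
        assert unlock(headers, b"hidden-pass") == b"hidden volume"
        assert unlock(headers, b"wrong-pass") is None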

    DORAM revisited: Maliciously secure RAM-MPC with logarithmic overhead

    Distributed Oblivious Random Access Memory (DORAM) is a secure multiparty protocol that allows a group of participants holding a secret-shared array to read and write to secret-shared locations within the array. The efficiency of a DORAM protocol is measured by the amount of communication and computation required per read/write query into the array. DORAM protocols are a necessary ingredient for executing Secure Multiparty Computation (MPC) in the RAM model. Although DORAM has been widely studied, all existing DORAM protocols have focused on the setting where the DORAM servers are semi-honest. Generic techniques for upgrading a semi-honest DORAM protocol to the malicious model typically increase the asymptotic communication complexity of the DORAM scheme. In this work, we present a 3-party DORAM protocol which requires $O((\kappa + D)\log N)$ communication and computation per query, for a database of size $N$ with $D$-bit values, where $\kappa$ is the security parameter. The hidden constants in our big-O notation are small. We show that our protocol is UC-secure in the presence of a malicious, static adversary. This matches the communication and computation complexity of the best semi-honest DORAM protocol, and it is the first malicious DORAM protocol with this complexity.
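
    The abstract states only the asymptotics; for contrast, the sketch below shows the classic linear-scan baseline that ORAM-style protocols improve on. The queried index is hidden by additively secret-sharing a one-hot selector between two servers, each of which must scan the whole array, so a query costs O(N) work instead of the $O((\kappa + D)\log N)$ achieved here. The toy keeps the array public to the servers and is semi-honest and two-party; in DORAM the array itself is also secret-shared, and the protocol above is 3-party and maliciously secure.

        import random

        P = 2**61 - 1                         # prime field for additive sharing

        def share_onehot(i: int, n: int) -> tuple[list[int], list[int]]:
            """Additively share the indicator vector e (e[i] = 1) across two servers."""
            a = [random.randrange(P) for _ in range(n)]
            b = [((1 if j == i else 0) - a[j]) % P for j in range(n)]
            return a, b

        def server_scan(db: list[int], e_share: list[int]) -> int:
            """Each server touches every entry, so its access pattern reveals nothing."""
            return sum(v * s for v, s in zip(db, e_share)) % P

        def oblivious_read(db: list[int], i: int) -> int:
            a, b = share_onehot(i, len(db))
            return (server_scan(db, a) + server_scan(db, b)) % P   # client recombines

        db = [7, 13, 42, 99]
        assert oblivious_read(db, 2) == 42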

    Towards Fast and Scalable Private Inference

    Privacy and security have rapidly emerged as first-order design constraints. Users now demand more protection over who can see their data (confidentiality) as well as how it is used (control). Here, existing cryptographic techniques for security fall short: they secure data when stored or communicated but must decrypt it for computation. Fortunately, a new paradigm of computing exists, which we refer to as privacy-preserving computation (PPC). Emerging PPC technologies can be leveraged for secure outsourced computation or to enable two parties to compute without revealing either party's secret data. Despite their phenomenal potential to revolutionize user protection in the digital age, their realization has been limited due to exorbitant computational, communication, and storage overheads. This paper reviews recent efforts on addressing various PPC overheads using private inference (PI) in neural networks as a motivating application. First, the problem and various technologies, including homomorphic encryption (HE), secret sharing (SS), garbled circuits (GCs), and oblivious transfer (OT), are introduced. Next, a characterization of their overheads when used to implement PI is covered. The characterization motivates the need for both GC and HE accelerators. Then two solutions are presented: HAAC for accelerating GCs and RPU for accelerating HE. To conclude, results are shown, together with a discussion of the future work needed to overcome the remaining overheads of PI. Comment: Appears in the 20th ACM International Conference on Computing Frontiers.
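
    As a concrete illustration of the secret-sharing (SS) ingredient, the sketch below evaluates one linear layer on additively shared activations: matrix-vector multiplication is linear, so each party computes on its own share locally, and the plaintext result appears only once the shares are recombined. This is a toy for the SS primitive alone (the field size, weights, and names are arbitrary assumptions); nonlinear layers are where GCs and OT, and hence accelerators like HAAC, come into play.

        import random

        P = 2**31 - 1                                    # toy prime field

        def share(x: int) -> tuple[int, int]:
            r = random.randrange(P)
            return r, (x - r) % P

        def reconstruct(a: int, b: int) -> int:
            return (a + b) % P

        def matvec(W: list[list[int]], x: list[int]) -> list[int]:
            return [sum(w * v for w, v in zip(row, x)) % P for row in W]

        W = [[2, 0, 1],
             [1, 3, 0]]                                  # layer weights (public here)
        x = [5, 7, 9]                                    # private activation vector

        x0, x1 = zip(*(share(v) for v in x))             # party 0 / party 1 shares
        y0 = matvec(W, list(x0))                         # each party works locally
        y1 = matvec(W, list(x1))

        y = [reconstruct(a, b) for a, b in zip(y0, y1)]
        assert y == matvec(W, x)                         # matches the plaintext result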

    Adaptive Microarchitectural Optimizations to Improve Performance and Security of Multi-Core Architectures

    With the current technological barriers, microarchitectural optimizations are increasingly important to ensure the performance scalability of computing systems. The shift to multi-core architectures increases the demands on the memory system and amplifies the role of microarchitectural optimizations in performance improvement. In a multi-core system, microarchitectural resources, such as the cache, are usually shared to maximize utilization, but sharing can also lead to contention and lower performance. This can be mitigated through partitioning of shared caches. However, microarchitectural optimizations, which were long assumed to be fundamentally secure, can be used in side-channel attacks to exfiltrate secrets such as cryptographic keys. Timing-based side channels exploit predictable timing variations caused by the interaction with microarchitectural optimizations during program execution. Going forward, there is a strong need to leverage microarchitectural optimizations for performance without compromising security. This thesis contributes three adaptive microarchitectural resource management optimizations to improve the security and/or performance of multi-core architectures, together with a systematization of knowledge of timing-based side-channel attacks.
    We observe that high-performance cache partitioning in a multi-core system must meet three requirements: (i) fine granularity of partitions, (ii) locality-aware placement, and (iii) frequent changes. These requirements lead to high overheads for current centralized partitioning solutions, especially as the number of cores in the system increases. To address this problem, we present an adaptive and scalable cache partitioning solution (DELTA) using a distributed and asynchronous allocation algorithm. Allocations occur through core-to-core challenges, in which applications with a larger performance benefit gain cache capacity. The solution is implementable in hardware, owing to its low computational complexity, and can scale to large core counts.
    According to our analysis, better performance can be achieved by coordinating multiple optimizations for different resources, e.g., off-chip bandwidth and cache, but doing so is challenging because of the increased number of possible allocations that need to be evaluated. Based on these observations, we present a solution (CBP) for coordinated management of three optimizations: cache partitioning, bandwidth partitioning, and prefetching. Efficient allocations, which account for the inter-resource interactions and trade-offs, are achieved using local resource managers to limit the solution space.
    The continuously growing number of side-channel attacks leveraging microarchitectural optimizations prompts us to review attacks and defenses in order to understand the vulnerabilities of different microarchitectural optimizations. We identify four root causes of timing-based side-channel attacks: determinism, sharing, access violation, and information flow. Our key insight is that eliminating any of the exploited root causes, in any of the attack steps, is enough to provide protection. Based on our framework, we present a systematization of the attacks and defenses on a wide range of microarchitectural optimizations, which highlights their key similarities.
    Shared caches are an attractive attack surface for side-channel attacks, while defenses need to be efficient since the cache is crucial for performance. To address this issue, we present an adaptive and scalable cache partitioning solution (SCALE) for protection against cache side-channel attacks. The solution leverages randomness and provides quantifiable, information-theoretic security guarantees using differential privacy. It closes the performance gap to a state-of-the-art non-secure allocation policy for a mix of secure and non-secure applications.
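
    SCALE is described above only at a high level; the toy sketch below (not the thesis' algorithm; the allocation rule, parameters, and names are assumptions) illustrates the general idea of randomized, differentially private cache partitioning: each application's measured cache demand is perturbed with Laplace noise before ways are handed out, so the resulting partition reveals only a bounded amount about any one application's behaviour.

        import random

        def laplace_noise(scale: float) -> float:
            """Laplace(0, scale) sampled as the difference of two exponentials."""
            return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)

        def noisy_partition(demands: dict[str, float], total_ways: int,
                            sensitivity: float = 1.0, epsilon: float = 1.0) -> dict[str, int]:
            """Perturb each app's measured cache demand with Laplace(sensitivity/epsilon)
            noise, then hand out ways roughly proportionally (at least one way each).
            Rounding may miss total_ways by a way or two; a real allocator would repair it."""
            noisy = {app: max(d + laplace_noise(sensitivity / epsilon), 0.01)
                     for app, d in demands.items()}
            total = sum(noisy.values())
            return {app: max(1, round(total_ways * v / total)) for app, v in noisy.items()}

        print(noisy_partition({"secure_app": 6.0, "victim": 3.0, "batch": 1.0}, total_ways=16))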

    ZeroLeak: Using LLMs for Scalable and Cost Effective Side-Channel Patching

    Security-critical software, e.g., OpenSSL, ships with numerous side-channel leakages left unpatched due to a lack of resources or experts. The situation will only worsen as the pace of code development accelerates, with developers relying on Large Language Models (LLMs) to automatically generate code. In this work, we explore the use of LLMs for generating patches for vulnerable code with microarchitectural side-channel leakages. For this, we investigate the generative abilities of powerful LLMs by carefully crafting prompts following a zero-shot learning approach. All generated code is dynamically analyzed by leakage detection tools, which can pinpoint information leakage at the instruction level, whether it stems from secret-dependent memory accesses, secret-dependent branches, or vulnerable Spectre gadgets. Carefully crafted prompts are used to generate candidate replacements for vulnerable code, which are then analyzed for correctness and for leakage resilience. From a cost/performance perspective, the GPT-4-based configuration costs only a few cents in API calls per vulnerability fixed. Our results show that LLM-based patching is far more cost-effective and thus provides a scalable solution. Finally, the framework we propose will improve over time, especially as vulnerability detection tools and LLMs mature.
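
    The abstract does not include a sample patch, but the canonical fix for a secret-dependent branch is branch-free, constant-time selection. The sketch below shows the pattern on 64-bit masks; it is illustrative only, since such patches are in practice written in C or assembly, and Python's arbitrary-precision integers give no timing guarantees.

        MASK64 = (1 << 64) - 1

        def leaky_select(secret_bit: int, a: int, b: int) -> int:
            # Vulnerable pattern: which branch executes depends on the secret bit.
            if secret_bit:
                return a
            return b

        def ct_select(secret_bit: int, a: int, b: int) -> int:
            # Branch-free replacement: mask is all ones when secret_bit == 1 and
            # all zeros when secret_bit == 0, so both inputs are always processed.
            mask = (-secret_bit) & MASK64
            return (a & mask) | (b & ~mask & MASK64)

        for bit in (0, 1):
            assert leaky_select(bit, 0xAAAA, 0x5555) == ct_select(bit, 0xAAAA, 0x5555)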

    Drones, Signals, and the Techno-Colonisation of Landscape

    This research project is a cross-disciplinary, creative practice-led investigation that interrogates the increasing military interest in the electromagnetic spectrum (EMS). The project’s central argument is that painted visualisations of normally invisible aspects of contemporary EMS-enabled warfare can reveal useful, novel, and speculative but informed perspectives that contribute to debates about war and technology. It pays particular attention to how visualising normally invisible signals reveals an insidious techno-colonisation of our extended environment, from Earth to orbiting satellites.

    Waks-On/Waks-Off: Fast Oblivious Offline/Online Shuffling and Sorting with Waksman Networks

    As more privacy-preserving solutions leverage trusted execution environments (TEEs) like Intel SGX, it becomes pertinent that these solutions can, by design, thwart the TEE side-channel attacks that research has brought to light. In particular, such solutions need to be fully oblivious to avoid leaking private information through memory or timing side channels. In this work, we present fast fully oblivious algorithms for shuffling and sorting data. Oblivious shuffling and sorting are two fundamental primitives that are frequently used for permuting data in privacy-preserving solutions. We present novel oblivious shuffling and sorting algorithms in the offline/online model such that the bulk of the computation can be done in an offline phase that is independent of the data to be permuted. The resulting online phase provides performance improvements over state-of-the-art oblivious shuffling and sorting algorithms both asymptotically ($O(\beta n\log n)$ vs. $O(\beta n \log^2 n)$) and concretely ($>5\times$ and $>3\times$ speedups), when permuting $n$ items each of size $\beta$. Our work revisits Waksman networks, and it uses the key observation that setting the control bits of a Waksman network for a uniformly random shuffle is independent of the data to be shuffled. However, setting the control bits of a Waksman network efficiently and fully obliviously poses a challenge, and we provide a novel algorithm to this end. The total costs (inclusive of offline computation) of our WaksShuffle shuffling algorithm and our WaksSort sorting algorithm are lower than those of all other fully oblivious shuffling and sorting algorithms when the items are at least moderately sized (i.e., $\beta > 1400$ B), and the performance gap only widens as the item sizes increase. Furthermore, WaksShuffle improves the online cost of oblivious shuffling by $>5\times$ for shuffling $2^{20}$ items of any size; similarly, WaksShuffle+QS, our other sorting algorithm, provides $>2.7\times$ speedups in the online cost of oblivious sorting.
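
    The offline/online split can be shown with a toy switching network. The sketch below stands in for the Waksman network with a quadratic-size bubble-sort network (the paper's contribution is precisely an efficient, fully oblivious way to set Waksman control bits, which this toy does not attempt): the control bits for a chosen permutation are fixed offline, independently of the data, and the online pass performs the same data-independent sequence of conditional swaps for every permutation. The swap is written branch-free, though a real implementation would use genuinely constant-time swaps in a lower-level language.

        import random

        def schedule(n: int) -> list[tuple[int, int]]:
            """Fixed, data-independent comparator sequence (bubble-sort network)."""
            return [(i, i + 1) for _ in range(n - 1) for i in range(n - 1)]

        def set_control_bits(dest: list[int]) -> list[int]:
            """Offline phase: route the network for a target permutation.
            dest[i] is the output slot of the item currently at position i."""
            tags = list(dest)
            bits = []
            for i, j in schedule(len(dest)):
                swap = 1 if tags[i] > tags[j] else 0
                if swap:
                    tags[i], tags[j] = tags[j], tags[i]
                bits.append(swap)
            return bits

        def apply_network(data: list[int], bits: list[int]) -> list[int]:
            """Online phase: every switch is touched and the swap is branch-free,
            so the same work is done for every permutation."""
            out = list(data)
            for (i, j), b in zip(schedule(len(data)), bits):
                x, y = out[i], out[j]
                out[i], out[j] = x + b * (y - x), y + b * (x - y)
            return out

        # Offline: pick a uniformly random shuffle and precompute its control bits.
        n = 8
        dest = random.sample(range(n), n)
        bits = set_control_bits(dest)

        # Online: apply the precomputed bits once the data arrives.
        data = [100 + i for i in range(n)]
        shuffled = apply_network(data, bits)
        assert all(shuffled[dest[i]] == data[i] for i in range(n))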

    Design Space Exploration of Sparsity-Aware Application-Specific Spiking Neural Network Accelerators

    Spiking Neural Networks (SNNs) offer a promising alternative to Artificial Neural Networks (ANNs) for deep learning applications, particularly in resource-constrained systems. This is largely due to their inherent sparsity, influenced by factors such as the input dataset, the length of the spike train, and the network topology. While a few prior works have demonstrated the advantages of incorporating sparsity into the hardware design, especially in terms of reducing energy consumption, the impact on hardware resources has not yet been explored. This is where design space exploration (DSE) becomes crucial, as it allows for the optimization of hardware performance by tailoring both the hardware and model parameters to suit specific application needs. However, DSE can be extremely challenging given the potentially large design space and the interplay of hardware architecture design choices and application-specific model parameters. In this paper, we propose a flexible hardware design that leverages the sparsity of SNNs to identify highly efficient, application-specific accelerator designs. We develop a high-level, cycle-accurate simulation framework for this hardware and demonstrate the framework's benefits in enabling detailed and fine-grained exploration of SNN design choices, such as the layer-wise logical-to-hardware ratio (LHR). Our experimental results show that our design can (i) achieve up to $76\%$ reduction in hardware resources and (ii) deliver a speed increase of up to $31.25\times$, while requiring $27\%$ fewer hardware resources compared to sparsity-oblivious designs. We further showcase the robustness of our framework by varying spike train lengths with different neuron population sizes to find the optimal trade-off points between accuracy and hardware latency.
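
    To make the role of sparsity concrete, the sketch below shows a generic event-driven update of one leaky integrate-and-fire layer: synaptic work is done only for input neurons that actually spiked, so the per-timestep cost scales with the number of spikes rather than with the layer's full input width. This is a toy model with assumed parameter names, not the paper's accelerator or its cost model.

        def sparse_lif_step(weights, spikes_in, membrane, threshold=1.0, leak=0.9):
            """One timestep of a leaky integrate-and-fire layer, processed sparsely.

            weights[i][j]: synapse from input neuron i to output neuron j
            spikes_in:     indices of input neurons that fired this timestep
            membrane:      per-output-neuron membrane potentials (mutated in place)
            Returns the list of output neurons that fire.
            """
            n_out = len(membrane)
            for j in range(n_out):                 # leak applies to every neuron
                membrane[j] *= leak
            for i in spikes_in:                    # but synaptic work is spike-driven:
                for j in range(n_out):             # cost ~ (#spikes x fan-out), not
                    membrane[j] += weights[i][j]   #        (#inputs x fan-out)
            spikes_out = [j for j in range(n_out) if membrane[j] >= threshold]
            for j in spikes_out:
                membrane[j] = 0.0                  # reset after firing
            return spikes_out

        weights = [[0.6, 0.0], [0.5, 0.7], [0.0, 0.4]]   # 3 inputs -> 2 outputs
        membrane = [0.0, 0.0]
        print(sparse_lif_step(weights, spikes_in=[0, 1], membrane=membrane))   # -> [0]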

    Towards trustworthy computing on untrustworthy hardware

    Historically, hardware was thought to be inherently secure and trusted due to its obscurity and the isolated nature of its design and manufacturing. In the last two decades, however, hardware trust and security have emerged as pressing issues. Modern hardware is surrounded by threats, manifested mainly in undesired modifications by untrusted parties in its supply chain, unauthorized and pirated selling, injected faults, and system- and microarchitecture-level attacks. These threats, if realized, can push hardware into abnormal and unexpected behaviour, causing real-life damage and significantly undermining our trust in the electronic and computing systems we use in our daily lives and in safety-critical applications. A large number of detective and preventive countermeasures have been proposed in the literature. Our knowledge of the potential consequences of real-life threats to hardware trust is nevertheless limited, given the small number of real-life reports and the plethora of ways in which hardware trust could be undermined. With this in mind, run-time monitoring of hardware combined with active mitigation of attacks, referred to as trustworthy computing on untrustworthy hardware, is proposed as the last line of defence. This last line of defence allows us to face the issue of live hardware mistrust rather than turning a blind eye to it or being helpless once it occurs.
    This thesis proposes three frameworks towards trustworthy computing on untrustworthy hardware. The presented frameworks are adaptable to different applications, independent of the design of the monitored elements, based on autonomous security elements, and computationally lightweight. The first framework is concerned with explicit violations and breaches of trust at run-time, with an untrustworthy on-chip communication interconnect presented as a potential offender; it is based on the guiding principles of component guarding, data tagging, and event verification. The second framework targets hardware elements with inherently variable and unpredictable operational latency and proposes a machine-learning-based characterization of these latencies to infer undesired latency extensions or denial-of-service attacks; it is implemented on a DDR3 DRAM after showing its vulnerability to obscured latency-extension attacks. The third framework studies the possibility of untrustworthy hardware elements being deployed in the analog front end, and the integrity issues that might consequently arise at the analog-digital boundary of systems-on-chip; it uses machine learning methods and the unique temporal and arithmetic features of signals at this boundary to monitor their integrity and assess their trust level.
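
    The second framework's latency characterization is described only at a high level; the sketch below shows one simple form such run-time monitoring could take, namely a statistical threshold on observed DRAM access latencies fitted during a trusted calibration phase. The model, threshold, and names are assumptions for illustration, not the thesis' actual machine-learning method.

        import statistics

        class LatencyMonitor:
            """Flag memory accesses whose latency deviates far from the calibrated norm."""

            def __init__(self, z_threshold: float = 4.0):
                self.z_threshold = z_threshold
                self.mean = None
                self.stdev = None

            def calibrate(self, trusted_latencies_ns: list[float]) -> None:
                # Calibration runs while the hardware is still presumed benign.
                self.mean = statistics.fmean(trusted_latencies_ns)
                self.stdev = statistics.stdev(trusted_latencies_ns)

            def is_suspicious(self, latency_ns: float) -> bool:
                z = (latency_ns - self.mean) / self.stdev
                return z > self.z_threshold        # only latency extensions matter here

        monitor = LatencyMonitor()
        monitor.calibrate([52.0, 55.5, 49.8, 57.3, 53.1, 51.6, 56.2, 50.4])
        print(monitor.is_suspicious(54.0))    # in-family latency   -> False
        print(monitor.is_suspicious(310.0))   # obscured extension  -> True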