353 research outputs found

    PAC Security: Automatic Privacy Measurement and Control of Data Processing

    We propose and study a new privacy definition, termed Probably Approximately Correct (PAC) Security. PAC security characterizes the information-theoretic hardness of recovering sensitive data given arbitrary information disclosure/leakage during or after any processing. Unlike the classic cryptographic definition and Differential Privacy (DP), which consider the adversarial (input-independent) worst case, PAC security is a simulatable metric that accommodates priors and quantifies the instance-based impossibility of inference. A fully automatic analysis and proof generation framework is proposed, where security parameters can be produced with arbitrarily high confidence via Monte-Carlo simulation for any black-box data processing oracle. This automation enables analysis of complicated data processing, where a worst-case proof in the classic privacy regime could be loose or even intractable. Furthermore, we show that the magnitude of (necessary) perturbation required in PAC security is not explicitly dependent on dimensionality, in contrast to the worst-case information-theoretic lower bound. We also include practical applications of PAC security with comparisons
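
    As a rough illustration of the Monte-Carlo idea described above (a sketch of our own, not the paper's actual framework), the Python snippet below empirically estimates how often a simple Bayes-style adversary recovers a one-bit secret from a noisy black-box release. The mechanism, prior, and adversary (`mechanism`, `adversary`, the Gaussian noise model) are all assumptions made for this example.

        # Minimal Monte-Carlo sketch (illustrative only): estimate how often an
        # adversary with a known prior recovers a 1-bit secret from a noisy
        # black-box mechanism. The mechanism and adversary below are assumptions
        # chosen for the example, not the paper's framework.
        import random

        def mechanism(secret_bit, noise=1.0):
            # Hypothetical black-box data processing oracle: releases the secret
            # plus Gaussian noise.
            return secret_bit + random.gauss(0.0, noise)

        def adversary(observation):
            # Bayes-optimal guess for a uniform prior over {0, 1} with the
            # Gaussian mechanism above: threshold the observation at 0.5.
            return 1 if observation > 0.5 else 0

        def estimate_recovery_rate(trials=100_000, noise=1.0):
            hits = 0
            for _ in range(trials):
                secret = random.randint(0, 1)      # draw the secret from the prior
                obs = mechanism(secret, noise)     # run the black-box oracle
                hits += (adversary(obs) == secret) # did inference succeed?
            return hits / trials

        if __name__ == "__main__":
            for noise in (0.5, 1.0, 2.0):
                rate = estimate_recovery_rate(noise=noise)
                print(f"noise={noise}: empirical recovery rate ~ {rate:.3f}")

    Running many such trials gives an empirical recovery rate whose distance from the prior-only guess rate (0.5 here) is one simple way to picture "instance-based impossibility of inference" for a given amount of perturbation.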

    Low-overhead hard real-time aware interconnect network router

    The increasing complexity of embedded systems is accelerating the use of multicore processors in these systems. This trend gives rise to new problems such as the sharing of on-chip network resources among hard real-time and normal best-effort data traffic. We propose a network-on-chip router that provides predictable and deterministic communication latency for hard real-time data traffic while maintaining high concurrency and throughput for best-effort/general-purpose traffic with minimal hardware overhead. The proposed router requires less area than non-interfering networks and provides better Quality of Service (QoS), in terms of predictability and determinism, to hard real-time traffic than priority-based routers. We present a deadlock-free algorithm for decoupled routing of the two types of traffic. We compare the area and power estimates of three different router architectures with various QoS schemes using the IBM 45-nm SOI CMOS technology cell library. Performance evaluations are done using three realistic benchmark applications: a hybrid electric vehicle application, a utility-grid-connected photovoltaic converter system, and a variable-speed induction motor drive application
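
    To make the latency-predictability idea concrete, here is a generic time-slot arbiter sketch for a single router output port (our own simplification, not the proposed router's microarchitecture): hard real-time flits are served in reserved slots, so their waiting time is bounded by the slot period, while best-effort flits use the remaining bandwidth. The class name, slot period, and queues are assumptions for the example.

        # Illustrative time-slot arbiter for one router output port: hard
        # real-time (HRT) flits get reserved slots, best-effort (BE) flits use
        # the rest. This is a generic sketch, not the paper's design.
        from collections import deque

        class SlotArbiter:
            def __init__(self, hrt_period=4):
                self.hrt_period = hrt_period   # every hrt_period-th cycle is reserved
                self.hrt_queue = deque()
                self.be_queue = deque()
                self.cycle = 0

            def enqueue(self, flit, hard_real_time):
                (self.hrt_queue if hard_real_time else self.be_queue).append(flit)

            def tick(self):
                # Grant the output port to one flit this cycle, or None if idle.
                self.cycle += 1
                if self.cycle % self.hrt_period == 0 and self.hrt_queue:
                    return self.hrt_queue.popleft()   # reserved slot: HRT wins
                if self.be_queue:
                    return self.be_queue.popleft()    # otherwise best-effort proceeds
                if self.hrt_queue:
                    return self.hrt_queue.popleft()   # work-conserving fallback
                return None

    In this sketch, the hard real-time flit at the head of its queue is granted within at most one slot period regardless of best-effort load, which is the kind of predictable, deterministic bound the abstract refers to.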

    Algorithms for scheduling task-based applications onto heterogeneous many-core architectures

    In this paper we present an Integer Linear Programming (ILP) formulation and two non-iterative heuristics for scheduling a task-based application onto a heterogeneous many-core architecture. Our ILP formulation can handle different application performance targets (e.g., low execution time, low memory miss rate) and different architectural features (e.g., cache sizes). For large problems, where the ILP convergence time may be too long, we propose a simple mapping algorithm that tries to spread tasks onto as many processing units as possible, and a more elaborate heuristic that shows good mapping performance when compared to the ILP formulation. We use two realistic power electronics applications to evaluate our mapping techniques on full RTL many-core systems consisting of eight different types of processor cores
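
    As a loose illustration of the "spread tasks onto as many processing units as possible" heuristic mentioned above (the task-cost model, core set, and function name are assumptions, not the paper's formulation), a greedy mapper can assign each task to the currently least-loaded core:

        # Minimal greedy spreading heuristic (illustrative): map each task to the
        # least-loaded core so work is spread across as many processing units as
        # possible. Task costs and the core count are placeholders.
        import heapq

        def spread_tasks(task_costs, num_cores):
            # task_costs: list of estimated execution costs, one per task
            # returns: mapping[task_index] = core_index
            heap = [(0.0, core) for core in range(num_cores)]   # (load, core id)
            heapq.heapify(heap)
            mapping = {}
            # Assign heavier tasks first so per-core loads stay balanced.
            for task in sorted(range(len(task_costs)), key=lambda t: -task_costs[t]):
                load, core = heapq.heappop(heap)
                mapping[task] = core
                heapq.heappush(heap, (load + task_costs[task], core))
            return mapping

        if __name__ == "__main__":
            print(spread_tasks([5.0, 3.0, 2.0, 2.0, 1.0], num_cores=3))

    Because all cores start with zero load, the first assignments land on distinct cores, so the mapping naturally uses as many processing units as there are tasks or cores, whichever is smaller.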

    Security challenges and opportunities in adaptive and reconfigurable hardware

    We present a novel approach to building hardware support for providing strong security guarantees for computations running in the cloud (shared hardware in massive data centers), while maintaining the high performance and low cost that make cloud computing attractive in the first place. We propose augmenting regular cloud servers with a Trusted Computation Base (TCB) that can securely perform high-performance computations. Our TCB achieves cost savings by spreading functionality across two paired chips. We show that making a Field-Programmable Gate Array (FPGA) a part of the TCB benefits security and performance, and we explore a new method for defending the computation inside the TCB against side-channel attacks. Sponsors: Northrop Grumman Corporation; Quanta Computer (Firm)

    Efficient Synchronous Byzantine Consensus

    We present new protocols for Byzantine state machine replication and Byzantine agreement in the synchronous and authenticated setting. The celebrated PBFT state machine replication protocol tolerates f Byzantine faults in an asynchronous setting using 3f+1 replicas, and has since been studied or deployed by numerous works. In this work, we improve the Byzantine fault tolerance threshold to n = 2f+1 by utilizing a relaxed synchrony assumption. We present a synchronous state machine replication protocol that commits a decision every 3 rounds in the common case. The key challenge is to ensure quorum intersection at one honest replica. Our solution is to rely on the synchrony assumption to form a post-commit quorum of size 2f+1, which intersects in f+1 replicas with any pre-commit quorum of size f+1. Our protocol also solves synchronous authenticated Byzantine agreement in expected 8 rounds. The best previous solution (Katz and Koo, 2006) requires expected 24 rounds. Our protocols may be applied to build Byzantine fault tolerant systems or improve cryptographic protocols such as cryptocurrencies when synchrony can be assumed
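
    The quorum-intersection arithmetic above follows from inclusion-exclusion: with n = 2f+1 replicas, any post-commit quorum of size 2f+1 and any pre-commit quorum of size f+1 overlap in at least (2f+1) + (f+1) - n = f+1 replicas, and since at most f replicas are Byzantine, at least one replica in the overlap is honest. The short check below (our own sanity check, not part of the paper) verifies this bound for small f:

        # Sanity check (illustrative) of the quorum intersection bound: with
        # n = 2f+1 replicas, a post-commit quorum of size 2f+1 and a pre-commit
        # quorum of size f+1 overlap in at least f+1 replicas, hence include at
        # least one honest replica when at most f are Byzantine.
        def min_intersection(n, a, b):
            # Worst-case overlap of two subsets of sizes a and b drawn from n elements.
            return max(0, a + b - n)

        for f in range(1, 6):
            n = 2 * f + 1
            overlap = min_intersection(n, 2 * f + 1, f + 1)
            assert overlap == f + 1 and overlap > f
            print(f"f={f}: n={n}, minimum quorum overlap={overlap} (> f faulty)")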

    A Case for Fine-Grain Adaptive Cache Coherence

    As transistor density continues to grow geometrically, processor manufacturers are already able to place a hundred cores on a chip (e.g., Tilera TILE-Gx 100), with massive multicore chips on the horizon. Programmers now need to invest more effort in designing software capable of exploiting multicore parallelism. The shared memory paradigm provides a convenient layer of abstraction to the programmer, but will current memory architectures scale to hundreds of cores? This paper directly addresses the question of how to enable scalable memory systems for future multicores. We develop a scalable, efficient shared memory architecture that enables seamless adaptation between private and logically shared caching at the fine granularity of cache lines. Our data-centric approach relies on in hardware runtime profiling of the locality of each cache line and only allows private caching for data blocks with high spatio-temporal locality. This allows us to better exploit on-chip cache capacity and enable low-latency memory access in large-scale multicores
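
    A minimal sketch of the data-centric idea described above, assuming a simple reuse-counter model (the class, thresholds, and decay policy are our own illustration, not the paper's hardware mechanism): track how often each cache line is reused and grant private caching only to lines whose observed locality exceeds a threshold, serving the rest from the logically shared cache.

        # Illustrative per-cache-line locality classifier: count recent reuses of
        # each line and allow private (local) caching only above a threshold,
        # otherwise treat the line as logically shared. Thresholds and
        # bookkeeping are assumptions for the sketch.
        from collections import defaultdict

        class LocalityClassifier:
            def __init__(self, reuse_threshold=4, decay_interval=1000):
                self.reuse_threshold = reuse_threshold
                self.decay_interval = decay_interval   # periodically age the counters
                self.reuse_counts = defaultdict(int)
                self.accesses = 0

            def on_access(self, line_addr):
                self.accesses += 1
                self.reuse_counts[line_addr] += 1
                if self.accesses % self.decay_interval == 0:
                    # Halve all counters so stale locality does not linger.
                    for addr in list(self.reuse_counts):
                        self.reuse_counts[addr] //= 2

            def cache_privately(self, line_addr):
                # True  -> keep the line in the core's private cache;
                # False -> access it in the logically shared cache.
                return self.reuse_counts[line_addr] >= self.reuse_threshold

    The periodic decay is one simple way to keep the decision adaptive, so a line that loses its spatio-temporal locality eventually falls back to shared caching.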