15,168 research outputs found

Deterministic, Stash-Free Write-Only ORAM

Author: Anderson Ross J.
Andrew
Aviv Adam J
Jia Yaoqi
Peters Timothy
Ren Ling
Stefanov Emil
Toft Tomas
Zahur Samee
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 07/09/2017

Write-Only Oblivious RAM (WoORAM) protocols provide privacy by encrypting the contents of data and also hiding the pattern of write operations over that data. WoORAMs provide better privacy than plain encryption and better performance than more general ORAM schemes (which hide both writing and reading access patterns), and the write-oblivious setting has been applied to important applications such as cloud storage synchronization and encrypted hidden volumes. In this paper, we introduce an entirely new technique for Write-Only ORAM, called DetWoORAM. Unlike previous solutions, DetWoORAM uses a deterministic, sequential writing pattern without the need for any "stashing" of blocks in local state when writes fail. Our protocol, while conceptually simple, provides substantial improvement over prior solutions, both asymptotically and experimentally. In particular, under typical settings DetWoORAM writes only 2 blocks (sequentially) to backend memory for each block written to the device, which is optimal. We have implemented our solution using the BUSE (block device in user-space) module and tested DetWoORAM against both an encryption-only baseline (dm-crypt) and prior, randomized WoORAM solutions, measuring only a 3x-14x slowdown compared to the encryption-only baseline and around a 6x-19x speedup compared to prior work.

arXiv.org e-Print Archive

Crossref

Cryptology ePrint Archive
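
The deterministic, sequential write pattern described in the DetWoORAM abstract above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the authors' construction: the class name `ToyDetWoORAM`, the equal-sized main and holding areas, and the plain in-memory position map are all hypothetical simplifications, and encryption is omitted entirely. It shows just one property: every logical write triggers exactly two physical writes whose locations depend only on a write counter, never on the logical address.

```python
class ToyDetWoORAM:
    """Toy model: physical write positions depend only on a counter.

    Assumptions (not from the paper): holding area as large as the main
    area, a plain dict as position map, and no encryption at all.
    """

    def __init__(self, n_blocks):
        self.n = n_blocks
        self.main = [b""] * n_blocks      # long-term home of each block
        self.holding = [b""] * n_blocks   # most recent writes, round-robin
        self.pos = {}                     # addr -> holding slot, if fresher
        self.t = 0                        # write counter
        self.trace = []                   # physical writes, for inspection

    def write(self, addr, data):
        # Step 1: refresh the next main-area block in sequential order,
        # pulling the freshest copy out of the holding area if there is one.
        a = self.t % self.n
        fresh = self.holding[self.pos.pop(a)] if a in self.pos else self.main[a]
        self.main[a] = fresh
        self.trace.append(("main", a))

        # Step 2: place the new data in the next holding slot. The refresh
        # above guarantees the slot's previous content is already in main.
        slot = self.t % self.n
        self.holding[slot] = data
        self.pos[addr] = slot
        self.trace.append(("holding", slot))
        self.t += 1

    def read(self, addr):
        if addr in self.pos:
            return self.holding[self.pos[addr]]
        return self.main[addr]


# Two very different logical workloads leave identical physical traces.
a, b = ToyDetWoORAM(4), ToyDetWoORAM(4)
for i in range(10):
    a.write(i % 3, bytes([i]))
    b.write(3, b"x")
assert a.trace == b.trace
```

Refreshing before writing to the holding area is the design choice that keeps the toy stash-free: a holding slot is only reused after a full sweep of the main area, so its old content has always been copied out by then.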

Recommended from our members

Behavioral synthesis from VHDL using structured modeling

Author: Gajski Daniel D.
Lis Joseph S.
Publication venue: eScholarship, University of California
Publication date: 01/01/1991

This dissertation describes work in behavioral synthesis involving the development of a VHDL Synthesis System (VSS), which accepts a VHDL behavioral input specification and performs technology-independent synthesis to generate a circuit netlist of generic components. The VHDL language is used for input and output descriptions. An intermediate representation which incorporates signal typing and component attributes simplifies compilation and facilitates design optimization. A Structured Modeling methodology has been developed to suggest standard VHDL modeling practices for synthesis. Structured modeling provides recommendations for the use of available VHDL description styles so that optimal designs will be synthesized. A design composed of generic components is synthesized from the input description through a process of Graph Compilation, Graph Criticism, and Design Compilation. Experiments were performed to demonstrate the effects of different modeling styles on the quality of the design produced by VSS. Several alternative VHDL models were examined for each benchmark, illustrating the improvements in design quality achieved when Structured Modeling guidelines were followed.

eScholarship - University of California

Optimization guide for programs compiled under IBM FORTRAN H (OPT=2)

Author: Dobyns A. H.
Marsh H. M.
Smith D. M.

Guidelines are given to provide the programmer with various techniques for optimizing programs when the FORTRAN IV H compiler is used with OPT=2. Subroutines and programs are described in the appendices along with a timing summary of all the examples given in the manual.

NASA Technical Reports Server

Communication Complexity and Secure Function Evaluation

Author: Naor Moni
Nissim Kobbi
Publication date: 01/01/2001

We suggest two new methodologies for the design of efficient secure protocols that differ with respect to their underlying computational models. In one methodology we utilize the communication complexity tree (or branching program) for f and transform it into a secure protocol. In other words, "any function f that can be computed using communication complexity c can be computed securely using communication complexity that is polynomial in c and a security parameter". The second methodology uses the circuit computing f, enhanced with look-up tables, as its underlying computational model. It is possible to simulate any RAM machine in this model with polylogarithmic blowup. Hence it is possible to start with a computation of f on a RAM machine and transform it into a secure protocol. We show many applications of these new methodologies, resulting in protocols efficient either in communication or in computation. In particular, we exemplify a protocol for the "millionaires problem", where two participants want to compare their values but reveal no other information. Our protocol is more efficient than previously known ones in either communication or computation.

arXiv.org e-Print Archive

CiteSeerX

Cryptology ePrint Archive

Spherical harmonic transform with GPUs

Author: Falcou Joel
Grigori Laura
Hupca Ioan O.
Stompor Radek
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 06/10/2010

We describe an algorithm for computing an inverse spherical harmonic transform suitable for graphics processing units (GPUs). We use CUDA and base our implementation on a Fortran90 routine included in a publicly available parallel package, S2HAT. We focus our attention on the two major sequential steps involved in the transform's computation, retaining the efficient parallel framework of the original code. We detail optimization techniques used to enhance the performance of the CUDA-based code and contrast them with those implemented in the Fortran90 version. We also present performance comparisons of a single CPU plus GPU unit with the S2HAT code running on either a single processor or 4 processors. In particular, we find that use of the latest generation of GPUs, such as NVIDIA GF100 (Fermi), can accelerate the spherical harmonic transforms by as much as 18 times with respect to S2HAT executed on one core, and by as much as 5.5 times with respect to S2HAT on 4 cores, with the overall performance being limited by the Fast Fourier transforms. The work presented here has been performed in the context of Cosmic Microwave Background simulations and analysis. However, we expect that the developed software will be of more general interest and applicability.

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

Taming Numbers and Durations in the Model Checking Integrated Planning System

Author: Edelkamp S.
Publication venue: 'AI Access Foundation'
Publication date: 30/06/2011

The Model Checking Integrated Planning System (MIPS) is a temporal least commitment heuristic search planner based on a flexible object-oriented workbench architecture. Its design clearly separates explicit and symbolic directed exploration algorithms from the set of on-line and off-line computed estimates and associated data structures. MIPS has shown distinguished performance in the last two international planning competitions. In the last event the description language was extended from pure propositional planning to include numerical state variables, action durations, and plan quality objective functions. Plans were no longer sequences of actions but time-stamped schedules. As a participant of the fully automated track of the competition, MIPS has proven to be a general system; in each track and every benchmark domain it efficiently computed plans of remarkable quality. This article introduces and analyzes the most important algorithmic novelties that were necessary to tackle the new layers of expressiveness in the benchmark problems and to achieve a high level of performance. The extensions include critical path analysis of sequentially generated plans to generate corresponding optimal parallel plans. The linear time algorithm to compute the parallel plan bypasses known NP hardness results for partial ordering by scheduling plans with respect to the set of actions and the imposed precedence relations. The efficiency of this algorithm also allows us to improve the exploration guidance: for each encountered planning state the corresponding approximate sequential plan is scheduled. One major strength of MIPS is its static analysis phase that grounds and simplifies parameterized predicates, functions and operators, that infers knowledge to minimize the state description length, and that detects domain object symmetries. The latter aspect is analyzed in detail. 
MIPS has been developed to serve as a complete and optimal state space planner, with admissible estimates, exploration engines and branching cuts. In the competition version, however, certain performance compromises had to be made, including floating point arithmetic, weighted heuristic search exploration according to an inadmissible estimate, and parameterized optimization.

arXiv.org e-Print Archive

Crossref
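
The critical path step mentioned in the MIPS abstract above (turning a sequentially generated plan into a parallel one, given precedence relations) can be sketched as follows. This is a minimal illustration, not the planner's actual code: the function name `schedule_parallel`, the action names, and the durations are hypothetical. It assumes the sequential plan already respects every precedence pair, so one pass over the plan suffices and the running time is linear in the number of actions plus precedence relations.

```python
from collections import defaultdict


def schedule_parallel(plan, durations, precedence):
    """Assign earliest start times to a sequential plan.

    plan       -- action names in their sequential order
    durations  -- dict: action name -> duration
    precedence -- iterable of (before, after) pairs; each 'before' must
                  appear earlier in 'plan' than its 'after' (assumption)
    """
    preds = defaultdict(list)
    for before, after in precedence:
        preds[after].append(before)

    start = {}
    for action in plan:
        # An action may start once all of its predecessors have finished.
        start[action] = max(
            (start[p] + durations[p] for p in preds[action]), default=0.0
        )

    makespan = max((start[a] + durations[a] for a in plan), default=0.0)
    return start, makespan


# Hypothetical four-action plan: loading and fueling can overlap.
plan = ["load", "fuel", "drive", "unload"]
durations = {"load": 2, "fuel": 3, "drive": 5, "unload": 2}
precedence = [("load", "drive"), ("fuel", "drive"), ("drive", "unload")]

start, makespan = schedule_parallel(plan, durations, precedence)
print(start, makespan)  # makespan is 10, versus 12 when run strictly in sequence
```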

AirIndex: Versatile Index Tuning Through Data and Storage

Author: Chockchowwat Supawit
Liu Wenjie
Park Yongjoo
Publication date: 20/07/2023

The end-to-end lookup latency of a hierarchical index -- such as a B-tree or a learned index -- is determined by its structure: the number of layers, the kinds of branching functions appearing in each layer, the amount of data we must fetch from layers, etc. Our primary observation is that by optimizing those structural parameters (or designs) specifically for a target system's I/O characteristics (e.g., latency, bandwidth), we can offer a faster lookup than designs that are not optimized. Can we develop a systematic method for finding those optimal design parameters? Ideally, the method must have the potential to generate almost any existing index or a novel combination of them for the fastest possible lookup. In this work, we present a new data- and I/O-aware index builder (called AirIndex) that can find high-speed hierarchical index designs in a principled way. Specifically, AirIndex minimizes an objective function expressing the end-to-end latency in terms of various designs -- the number of layers, types of layers, and more -- for given data and a storage profile, using a graph-based optimization method purpose-built to address the computational challenges arising from the inter-dependencies among index layers and the exponentially many candidate parameters in a large search space. Our empirical studies confirm that AirIndex can find optimal index designs, build optimal indexes within times comparable to existing methods, and deliver up to 4.1x faster lookup than a lightweight B-tree library (LMDB), 3.3x--46.3x faster than state-of-the-art learned indexes (RMI/CDFShop, PGM-Index, ALEX/APEX, PLEX), and 2.0x faster than Data Calculator's suggestion across various dataset and storage settings. Comment: 13 pages, 3 appendices, 19 figures, to appear at SIGMOD 202

arXiv.org e-Print Archive
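
The idea in the AirIndex abstract above, that the best index structure depends on the storage profile, can be illustrated with a deliberately simplified cost model. Everything here is an assumption made for illustration: the affine read cost (fixed latency plus bytes over bandwidth), the names `StorageProfile`, `lookup_latency`, and `best_design`, the two candidate designs, and the profile numbers. The real system uses a graph-based optimizer over a far larger search space; this sketch only compares two hand-written candidates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageProfile:
    latency_s: float               # fixed cost per read request
    bandwidth_bytes_per_s: float   # sustained transfer rate

    def read_time(self, nbytes):
        # Assumed affine model: one round trip plus transfer time.
        return self.latency_s + nbytes / self.bandwidth_bytes_per_s


def lookup_latency(fetch_sizes, profile):
    """End-to-end latency of one lookup that reads each layer in turn."""
    return sum(profile.read_time(n) for n in fetch_sizes)


def best_design(candidates, profile):
    """Return the name of the candidate with the lowest modeled latency."""
    return min(candidates, key=lambda name: lookup_latency(candidates[name], profile))


# Bytes fetched per step of a lookup (index layers, then the data page).
candidates = {
    "wide": [1_000_000, 4096],   # one large index layer
    "narrow": [4096] * 4,        # three small index layers
}

local_ssd = StorageProfile(latency_s=100e-6, bandwidth_bytes_per_s=1e9)
remote = StorageProfile(latency_s=0.05, bandwidth_bytes_per_s=1e8)

print(best_design(candidates, local_ssd))  # "narrow": round trips are cheap
print(best_design(candidates, remote))     # "wide": round trips dominate
```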

A Comparative Xeon and CBE Performance Analysis

Author: Fort Randy
Publication venue: SJSU ScholarWorks
Publication date: 01/01/2008

The Cell Broadband Engine is a high-performance multicore processor with superb performance on certain types of problems. However, it does not perform as well running other algorithms, particularly those with heavy branching. The Intel Xeon processor is a high-performance superscalar processor. It utilizes a high clock speed and deep pipelines to help it achieve superior performance. But deep pipelines can perform poorly with frequent memory accesses. This paper studies and attempts to quantify the types of program structures that are better suited to each architecture. It focuses on the issues of pipelines, memory access and branching on these two microprocessor architectures.

SJSU ScholarWorks