5 research outputs found

Parallel Out-of-Core Sorting: The Third Way

Author: Chaudhry Geeta
Publication venue: Dartmouth Digital Commons
Publication date: 12/03/2004

Sorting very large datasets is a key subroutine in almost any application that is built on top of a large database. Two ways to sort out-of-core data dominate the literature: merging-based algorithms and partitioning-based algorithms. Within these two paradigms, all the programs that sort out-of-core data on a cluster rely on assumptions about the input distribution. We propose a third way of out-of-core sorting: oblivious algorithms. In all, we have developed six programs that sort out-of-core data on a cluster. The first three programs, based completely on Leighton's columnsort algorithm, have a restriction on the maximum problem size that they can sort. The other three programs relax this restriction; two are based on our original algorithmic extensions to columnsort. We present experimental results to show that our algorithms perform well. To the best of our knowledge, the programs presented in this thesis are the first to sort out-of-core data on a cluster without making any simplifying assumptions about the distribution of the data to be sorted.

Dartmouth Digital Commons (Dartmouth College)

Prochlo: Strong Privacy for Analytics in the Crowd

Author: Abadi M.
Avent B.
Bellare M.
Bulck J. V.
Buse R. P. L.
Chen R.
Corrigan-Gibbs H.
Dang H.
Denning D. E. R.
Dinh T. T. A.
Dwork
Lee S.
Maniatis P.
Ohrimenko O.
Ravindranath L.
Roy I.
Saltzer J. H.
Viega J.
Wang T.
Warner
Zheng W.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 02/10/2017

The large-scale monitoring of computer users' software activities has become commonplace, e.g., for application telemetry, error reporting, or demographic profiling. This paper describes a principled systems architecture---Encode, Shuffle, Analyze (ESA)---for performing such monitoring with high utility while also protecting user privacy. The ESA design, and its Prochlo implementation, are informed by our practical experiences with an existing, large deployment of privacy-preserving software monitoring. (cont.; see the paper

arXiv.org e-Print Archive

University of Toronto Research Repository

Crossref

Department of Computer Science Activity 1998-2004

Author: Kotz David
Publication venue: Dartmouth Digital Commons
Publication date: 20/03/2005

This report summarizes much of the research and teaching activity of the Department of Computer Science at Dartmouth College between late 1998 and late 2004. The material for this report was collected as part of the final report for NSF Institutional Infrastructure award EIA-9802068, which funded equipment and technical staff during that six-year period. This equipment and staff supported essentially all of the department's research activity during that period.

Dartmouth Digital Commons (Dartmouth College)

LIPIcs, Volume 251, ITCS 2023, Complete Volume

Author: Tauman Kalai Yael
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 14th Innovations in Theoretical Computer Science Conference (ITCS 2023)
Publication date: 01/01/2023

LIPIcs, Volume 251, ITCS 2023, Complete Volume

Dagstuhl Research Online Publication Server

Relaxing the problem-size bound for out-of-core columnsort

Author: Elizabeth A. Hamon
Geeta Chaudhry
Thomas H. Cormen
Publication date: 01/01/2003

Previous implementations of out-of-core columnsort limit the problem size to $N \le \sqrt{(M/P)^3/2}$, where N is the number of records to sort, P is the number of processors, and M is the total number of records that the entire system can hold in its memory (so that M/P is the number of records that a single processor can hold in its memory). We implemented two variations to out-of-core columnsort that relax this restriction. Subblock columnsort is based on an algorithmic modification of the underlying columnsort algorithm, and it improves the problem-size bound to $N \le (M/P)^{5/3}/4^{2/3}$ but at the cost of additional disk I/O. M-columnsort changes the notion of the column size in columnsort, improving the maximum problem size to $N \le \sqrt{M^3/2}$ but at the cost of additional computation and communication. Experimental results on a Beowulf cluster show that both subblock columnsort and M-columnsort run well but that M-columnsort is faster. A further advantage of M-columnsort is that it handles a wider range of problem sizes than subblock columnsort. This research was supported in part by NSF Grant EIA-98-02068.
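
To get a feel for how the three problem-size bounds in this abstract compare, the following minimal Python sketch evaluates them for a given M and P. It assumes the bounds read as N ≤ sqrt((M/P)^3 / 2) for the earlier implementations, N ≤ (M/P)^(5/3) / 4^(2/3) for subblock columnsort, and N ≤ sqrt(M^3 / 2) for M-columnsort. The function names and the sample values of M and P are hypothetical and chosen only for illustration; they do not come from the paper.

```python
import math


def columnsort_bound(M, P):
    """Max records N for earlier out-of-core columnsort: sqrt((M/P)^3 / 2)."""
    r = M / P  # records that a single processor can hold in memory
    return math.sqrt(r ** 3 / 2)


def subblock_bound(M, P):
    """Max records N for subblock columnsort: (M/P)^(5/3) / 4^(2/3)."""
    r = M / P
    return r ** (5 / 3) / 4 ** (2 / 3)


def m_columnsort_bound(M):
    """Max records N for M-columnsort: sqrt(M^3 / 2); independent of P."""
    return math.sqrt(M ** 3 / 2)


if __name__ == "__main__":
    # Hypothetical cluster: 4 processors, 2^23 records of total memory.
    M, P = 2 ** 23, 4
    print("columnsort   :", columnsort_bound(M, P))
    print("subblock     :", subblock_bound(M, P))
    print("M-columnsort :", m_columnsort_bound(M))
```

For any fixed M and P with P > 1 and M/P reasonably large, the three values increase in the order listed, which matches the abstract's claim that both variations relax the original restriction and that M-columnsort handles the widest range of problem sizes.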

CiteSeerX

Dartmouth Digital Commons (Dartmouth College)