Noisy Sorting Without Searching: Data Oblivious Sorting with Comparison Errors
We provide and study several algorithms for sorting an array of n distinct comparable elements subject to probabilistic comparison errors. In this model, the comparison of two elements returns the wrong answer with a fixed probability, p_e < 1/2, and otherwise returns the correct answer. The dislocation of an element is the distance between its position in a given (current or output) array and its position in a sorted array. Various existing algorithms can sort or near-sort elements subject to probabilistic comparison errors, but these algorithms are not data oblivious because they all make heavy use of noisy binary searching. In this paper, we provide new methods for sorting with comparison errors that are data oblivious while avoiding the use of noisy binary search methods. In addition, we experimentally compare our algorithms with other sorting algorithms.
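The model above admits a simple data-oblivious illustration: a fixed compare-exchange schedule (here odd-even transposition sort, used only as an example and not one of the paper's algorithms) driven by a comparison oracle that errs with probability p_e, plus the dislocation measure. A minimal sketch under those assumptions:

```python
import random

def noisy_less(a, b, p_e=0.1):
    """Comparison oracle: returns the wrong answer with probability p_e < 1/2."""
    correct = a < b
    return (not correct) if random.random() < p_e else correct

def oblivious_noisy_sort(arr, p_e=0.1, rounds=None):
    """Odd-even transposition sort driven by the noisy oracle.
    The compare-exchange schedule is fixed in advance, so the access
    pattern is independent of the data (data oblivious)."""
    a = list(arr)
    n = len(a)
    rounds = n if rounds is None else rounds
    for r in range(rounds):
        for i in range(r % 2, n - 1, 2):
            if noisy_less(a[i + 1], a[i], p_e):
                a[i], a[i + 1] = a[i + 1], a[i]
    return a

def max_dislocation(a):
    """Largest distance between an element's position and its sorted position
    (assumes distinct elements, as in the paper's model)."""
    pos = {v: i for i, v in enumerate(sorted(a))}
    return max(abs(i - pos[v]) for i, v in enumerate(a))
```

With p_e = 0 the schedule sorts exactly in n rounds; with p_e > 0 one can measure how dislocation grows with the error rate.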
A Back-to-Basics Empirical Study of Priority Queues
The theory community has proposed several new heap variants in the recent
past which have remained largely untested experimentally. We take the field
back to the drawing board, with straightforward implementations of both classic
and novel structures using only standard, well-known optimizations. We study
the behavior of each structure on a variety of inputs, including artificial
workloads, workloads generated by running algorithms on real map data, and
workloads from a discrete event simulator used in recent systems networking
research. We provide observations about which characteristics are most
correlated to performance. For example, we find that the L1 cache miss rate
appears to be strongly correlated with wallclock time. We also provide
observations about how the input sequence affects the relative performance of
the different heap variants. For example, we show (both theoretically and in
practice) that certain random insertion-deletion sequences are degenerate and
can lead to misleading results. Overall, our findings suggest that while the
conventional wisdom holds in some cases, it is sorely mistaken in others.
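The degeneracy flagged above can be probed with a tiny harness: under an alternating push/pop-min workload with uniform random keys, the surviving keys in the heap drift upward over time, so newly inserted keys are increasingly popped right back out. A minimal sketch using Python's heapq as a stand-in baseline (not one of the implementations studied in the paper):

```python
import heapq
import random

def degenerate_fraction(n_prefill=10_000, n_ops=10_000, seed=1):
    """On an alternating push/pop-min workload with uniform keys, measure
    how often the popped minimum is the element pushed immediately before.
    A high fraction means the sequence exercises only the top of the heap."""
    random.seed(seed)
    heap = [random.random() for _ in range(n_prefill)]
    heapq.heapify(heap)
    hits = 0
    for _ in range(n_ops):
        x = random.random()
        heapq.heappush(heap, x)
        if heapq.heappop(heap) == x:
            hits += 1
    return hits / n_ops
```

Workloads with this property can make every heap variant look uniformly cheap, which is one way a synthetic benchmark can mislead.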
Efficient Parallel and Distributed Algorithms for GIS Polygon Overlay Processing
Polygon clipping is one of the complex operations in computational geometry. It is used in Geographic Information Systems (GIS), Computer Graphics, and VLSI CAD. For two polygons with n and m vertices, the number of intersections can be O(nm). In this dissertation, we present the first output-sensitive CREW PRAM algorithm, which can perform polygon clipping in O(log n) time using O(n + k + k′) processors, where n is the number of vertices, k is the number of intersections, and k′ is the number of additional temporary vertices introduced by the partitioning of polygons. The current best algorithm, by Karinthi, Srinivas, and Almasi, does not handle self-intersecting polygons, is not output-sensitive, and must employ O(n^2) processors to achieve O(log n) time. The second parallel algorithm is an output-sensitive PRAM algorithm based on the Greiner-Hormann algorithm with O(log n) time complexity using O(n + k) processors. This is cost-optimal when compared to the time complexity of the best-known sequential plane-sweep-based algorithm for polygon clipping. For self-intersecting polygons, the time complexity is O(((n + k) log n log log n)/p) using p processors.
In addition to these parallel algorithms, the other main contributions of this dissertation are 1) multi-core and many-core implementations for clipping a pair of polygons and 2) MPI-GIS and Hadoop Topology Suite for distributed polygon overlay using a cluster of nodes. An Nvidia GPU and CUDA are used for the many-core implementation. The MPI-based system achieves a 44X speedup while processing about 600K polygons from two real-world GIS shapefiles, 1) USA Detailed Water Bodies and 2) USA Block Group Boundaries, within 20 seconds on a 32-node (8 cores each) IBM iDataPlex cluster interconnected by InfiniBand.
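As a point of reference for the clipping operation itself, here is a sequential Sutherland-Hodgman sketch. It is a simpler method than the Greiner-Hormann algorithm used in the dissertation: it requires a convex clip polygon and inputs in general position, and it does not handle self-intersecting polygons.

```python
def clip_polygon(subject, clip):
    """Sutherland-Hodgman: clip `subject` against a convex `clip` polygon.
    Both are lists of (x, y) vertices in counter-clockwise order."""
    def inside(p, a, b):
        # p lies on or to the left of the directed edge a->b (CCW convex clip).
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0

    def intersect(p, q, a, b):
        # Intersection of segment pq with the infinite line through a, b.
        # Assumes the segment actually crosses the line (general position).
        x1, y1, x2, y2 = p[0], p[1], q[0], q[1]
        x3, y3, x4, y4 = a[0], a[1], b[0], b[1]
        den = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
        t = ((x1 - x3) * (y3 - y4) - (y1 - y3) * (x3 - x4)) / den
        return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))

    out = list(subject)
    for a, b in zip(clip, clip[1:] + clip[:1]):   # each clip edge in turn
        inp, out = out, []
        for p, q in zip(inp, inp[1:] + inp[:1]):  # each subject edge in turn
            if inside(q, a, b):
                if not inside(p, a, b):
                    out.append(intersect(p, q, a, b))
                out.append(q)
            elif inside(p, a, b):
                out.append(intersect(p, q, a, b))
    return out
```

The sequential loop over clip edges is what the parallel algorithms above avoid: they process all edge pairs concurrently, which is where the O(nm) worst-case intersection count matters.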
Everything Matters in Programmable Packet Scheduling
Programmable packet scheduling allows the deployment of scheduling algorithms
into existing switches without the need for hardware redesign. Scheduling
algorithms are programmed by tagging packets with ranks, indicating their
desired priority. Programmable schedulers then execute these algorithms by
serving packets in the order described by their ranks.
The ideal programmable scheduler is a Push-In First-Out (PIFO) queue, which
achieves perfect packet sorting by pushing packets into arbitrary positions in
the queue, while only draining packets from the head. Unfortunately,
implementing PIFO queues in hardware is challenging due to the need to
arbitrarily sort packets at line rate based on their ranks.
In recent years, various techniques have been proposed to approximate PIFO
behaviors using the available resources of existing data planes. While
promising, approaches to date approximate only one of the characteristic
behaviors of PIFO queues (i.e., either its scheduling behavior or its admission
control).
We propose PACKS, the first programmable scheduler that fully approximates
PIFO queues on all their behaviors. PACKS does so by smartly using a set of
strict-priority queues. It uses packet-rank information and queue-occupancy
levels at enqueue to decide whether to admit packets to the scheduler and how
to map admitted packets to the different queues.
We fully implement PACKS in P4 and evaluate it on real workloads. We show
that PACKS better approximates PIFO than state-of-the-art approaches and that
it scales. We also show that PACKS runs at line rate on existing hardware
(Intel Tofino).
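The enqueue-time mechanism can be illustrated with a toy model: map each packet's rank to one of several bounded strict-priority FIFOs and drop when the target queue is full. This is a simplified sketch of the general idea only; the rank bounds, fixed capacities, and drop rule here are illustrative, not PACKS's actual admission or mapping policy.

```python
from collections import deque

class SPQueueScheduler:
    """Toy rank-based scheduler over strict-priority FIFO queues."""

    def __init__(self, bounds, capacity):
        # bounds[i] = highest rank admitted to queue i (ascending order);
        # lower queue index = higher drain priority.
        self.bounds = bounds
        self.capacity = capacity
        self.queues = [deque() for _ in bounds]

    def enqueue(self, pkt, rank):
        """Admission and queue mapping are decided at enqueue,
        from the packet's rank and the queues' occupancy."""
        for i, bound in enumerate(self.bounds):
            if rank <= bound:
                if len(self.queues[i]) < self.capacity:
                    self.queues[i].append((rank, pkt))
                    return True
                return False  # drop: target queue is full
        return False  # drop: rank exceeds every bound

    def dequeue(self):
        """Strict priority: always drain the highest-priority nonempty queue."""
        for q in self.queues:
            if q:
                return q.popleft()
        return None
```

Because packets within one FIFO drain in arrival order, the approximation error relative to a true PIFO comes from rank inversions inside a single queue, which is what careful bound selection tries to minimize.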
Ocean color modeling: Parameterization and interpretation
The ocean color as observed near the water surface is determined mainly by dissolved and particulate substances, known as optically-active constituents, in the upper water column. The goal of ocean color modeling is to interpret an ocean color spectrum quantitatively to estimate the suite of optically-active constituents near the surface. In recent years, ocean color modeling efforts have centered on three major optically-active constituents: chlorophyll concentration, colored dissolved organic matter, and scattering particulates. Many challenges remain in this arena. This thesis addresses several critical issues in ocean color modeling.
In chapter one, an extensive literature survey on ocean color modeling is given. A general ocean color model is presented to identify critical candidate sources of uncertainty in modeling the ocean color. The goal of this thesis study is then defined, along with some specific objectives. Finally, a general overview of the dissertation is given, outlining how each of the follow-up chapters targets the relevant objectives.
In chapter two, a general approach is presented to quantify constituent concentration retrieval errors induced by uncertainties in the inherent optical property (IOP) submodels of a semi-analytical forward model. Chlorophyll concentrations are retrieved by inverting a forward model with nonlinear IOPs. The study demonstrates how uncertainties in individual IOP submodels influence the accuracy of the chlorophyll concentration retrieval at different chlorophyll concentration levels. The key finding of this study is that precise knowledge of the spectral shapes of the IOP submodels is critical for accurate chlorophyll retrieval, suggesting that improving retrieval accuracy requires precise spectral IOP measurements.
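The forward-model inversion can be illustrated with a toy one-band example: a hypothetical semi-analytical reflectance model that is nonlinear in chlorophyll, inverted by bisection. All coefficient names and values here are illustrative placeholders, not the IOP submodels used in the thesis.

```python
def forward_reflectance(chl, a_w=0.04, b_b=0.005, a_star=0.06, expo=0.65):
    """Toy one-band forward model (hypothetical coefficients):
    reflectance R ~ b_b / (a + b_b), with total absorption a
    nonlinear in chlorophyll concentration `chl` (mg/m^3)."""
    a = a_w + a_star * chl ** expo
    return b_b / (a + b_b)

def invert_chl(R_obs, lo=1e-3, hi=100.0, iters=60):
    """Retrieve chlorophyll by bisection; in this toy model R decreases
    monotonically with chl, so the root is bracketed by [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if forward_reflectance(mid) > R_obs:
            lo = mid  # modeled R still too high -> chlorophyll must be larger
        else:
            hi = mid
    return (lo + hi) / 2
```

Perturbing a coefficient such as `expo` before inversion shows how an error in an IOP submodel's spectral/concentration dependence propagates directly into the retrieved chlorophyll, which is the kind of sensitivity chapter two quantifies.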
In chapter three, three distinct inversion techniques, namely, nonlinear optimization (NLO), principal component analysis (PCA) and artificial neural network (ANN) are compared to assess their inversion performances to retrieve optically-active constituents for a complex nonlinear bio-optical system simulated by a semi-analytical ocean color model. A well-designed simulation scheme was implemented to simulate waters of different bio-optical complexity, and then the three inversion methods were applied to these simulated datasets for performance evaluation.
In chapter four, an approach is presented for optimally parameterizing an irradiance reflectance model on the basis of a bio-optical dataset collected at 45 stations in Tokyo Bay and nearby regions between 1982 and 1984. (Abstract shortened by UMI.)