8,310 research outputs found
Faster 64-bit universal hashing using carry-less multiplications
Intel and AMD support the Carry-less Multiplication (CLMUL) instruction set
in their x64 processors. We use CLMUL to implement an almost universal 64-bit
hash family (CLHASH). We compare this new family with what might be the fastest
almost universal family on x64 processors (VHASH). We find that CLHASH is at
least 60% faster. We also compare CLHASH with a popular hash function designed
for speed (Google's CityHash). We find that CLHASH is 40% faster than CityHash
on inputs larger than 64 bytes and just as fast otherwise
No elliptic islands for the universal area-preserving map
A renormalization approach has been used in \cite{EKW1} and \cite{EKW2} to
prove the existence of a \textit{universal area-preserving map}, a map with
hyperbolic orbits of all binary periods. The existence of a horseshoe, with
positive Hausdorff dimension, in its domain was demonstrated in \cite{GJ1}. In
this paper the coexistence problem is studied, and a computer-aided proof is
given that no elliptic islands with period less than 20 exist in the domain. It
is also shown that less than 1.5% of the measure of the domain consists of
elliptic islands. This is proven by showing that the measure of initial
conditions that escape to infinity is at least 98.5% of the measure of the
domain, and we conjecture that the escaping set has full measure. This is
highly unexpected, since generically it is believed that for conservative
systems hyperbolicity and ellipticity coexist
Expanded delta networks for very large parallel computers
In this paper we analyze a generalization of the traditional delta network, introduced by Patel [21], and dubbed Expanded Delta Network (EDN). These networks provide in general multiple paths that can be exploited to reduce contention in the network resulting in increased performance. The crossbar and traditional delta networks are limiting cases of this class of networks. However, the delta network does not provide the multiple paths that the more general expanded delta networks provide, and crossbars are to costly to use for large networks. The EDNs are analyzed with respect to their routing capabilities in the MIMD and SIMD models of computation.The concepts of capacity and clustering are also addressed. In massively parallel SIMD computers, it is the trend to put a larger number processors on a chip, but due to I/O constraints only a subset of the total number of processors may have access to the network. This is introduced as a Restricted Access Expanded Delta Network of which the MasPar MP-1 router network is an example
Parallel Working-Set Search Structures
In this paper we present two versions of a parallel working-set map on p
processors that supports searches, insertions and deletions. In both versions,
the total work of all operations when the map has size at least p is bounded by
the working-set bound, i.e., the cost of an item depends on how recently it was
accessed (for some linearization): accessing an item in the map with recency r
takes O(1+log r) work. In the simpler version each map operation has O((log
p)^2+log n) span (where n is the maximum size of the map). In the pipelined
version each map operation on an item with recency r has O((log p)^2+log r)
span. (Operations in parallel may have overlapping span; span is additive only
for operations in sequence.)
Both data structures are designed to be used by a dynamic multithreading
parallel program that at each step executes a unit-time instruction or makes a
data structure call. To achieve the stated bounds, the pipelined data structure
requires a weak-priority scheduler, which supports a limited form of 2-level
prioritization. At the end we explain how the results translate to practical
implementations using work-stealing schedulers.
To the best of our knowledge, this is the first parallel implementation of a
self-adjusting search structure where the cost of an operation adapts to the
access sequence. A corollary of the working-set bound is that it achieves work
static optimality: the total work is bounded by the access costs in an optimal
static search tree.Comment: Authors' version of a paper accepted to SPAA 201
Free and regular mixed-model sequences by a linear program-assisted hybrid algorithm GRASP-LP
A linear program-assisted hybrid algorithm (GRASP-LP) is presented to solve a mixed-model sequencing problem in an assembly line. The issue of the problem is to obtain manufacturing sequences of product models with the minimum work overload, allowing the free interruption of operations at workstations and preserving the production mix. The implemented GRASP-LP is compared with other procedures through a case study linked with the Nissan’ Engine Plant from Barcelona.Peer ReviewedPostprint (author's final draft
Recommended from our members
Computing infrastructure issues in distributed communications systems : a survey of operating system transport system architectures
The performance of distributed applications (such as file transfer, remote login, tele-conferencing, full-motion video, and scientific visualization) is influenced by several factors that interact in complex ways. In particular, application performance is significantly affected both by communication infrastructure factors and computing infrastructure factors. Several communication infrastructure factors include channel speed, bit-error rate, and congestion at intermediate switching nodes. Computing infrastructure factors include (among other things) both protocol processing activities (such as connection management, flow control, error detection, and retransmission) and general operating system factors (such as memory latency, CPU speed, interrupt and context switching overhead, process architecture, and message buffering). Due to a several orders of magnitude increase in network channel speed and an increase in application diversity, performance bottlenecks are shifting from the network factors to the transport system factors.This paper defines an abstraction called an "Operating System Transport System Architecture" (OSTSA) that is used to classify the major components and services in the computing infrastructure. End-to-end network protocols such as TCP, TP4, VMTP, XTP, and Delta-t typically run on general-purpose computers, where they utilize various operating system resources such as processors, virtual memory, and network controllers. The OSTSA provides services that integrate these resources to support distributed applications running on local and wide area networks.A taxonomy is presented to evaluate OSTSAs in terms of their support for protocol processing activities. We use this taxonomy to compare and contrast five general-purpose commercial and experimental operating systems including System V UNIX, BSD UNIX, the x-kernel, Choices, and Xinu
- …