3,475 research outputs found
Introduction to a system for implementing neural net connections on SIMD architectures
Neural networks have attracted much interest recently, and using parallel architectures to simulate neural networks is a natural and necessary application. The SIMD model of parallel computation is chosen, because systems of this type can be built with large numbers of processing elements. However, such systems are not naturally suited to generalized communication. A method is proposed that allows an implementation of neural network connections on massively parallel SIMD architectures. The key to this system is an algorithm permitting the formation of arbitrary connections between the neurons. A feature is the ability to add new connections quickly. It also has error recovery ability and is robust over a variety of network topologies. Simulations of the general connection system, and its implementation on the Connection Machine, indicate that the time and space requirements are proportional to the product of the average number of connections per neuron and the diameter of the interconnection network
A system for routing arbitrary directed graphs on SIMD architectures
There are many problems which can be described in terms of directed graphs that contain a large number of vertices where simple computations occur using data from connecting vertices. A method is given for parallelizing such problems on an SIMD machine model that is bit-serial and uses only nearest neighbor connections for communication. Each vertex of the graph will be assigned to a processor in the machine. Algorithms are given that will be used to implement movement of data along the arcs of the graph. This architecture and algorithms define a system that is relatively simple to build and can do graph processing. All arcs can be transversed in parallel in time O(T), where T is empirically proportional to the diameter of the interconnection network times the average degree of the graph. Modifying or adding a new arc takes the same time as parallel traversal
An Enhanced Multiway Sorting Network Based on n-Sorters
Merging-based sorting networks are an important family of sorting networks.
Most merge sorting networks are based on 2-way or multi-way merging algorithms
using 2-sorters as basic building blocks. An alternative is to use n-sorters,
instead of 2-sorters, as the basic building blocks so as to greatly reduce the
number of sorters as well as the latency. Based on a modified Leighton's
columnsort algorithm, an n-way merging algorithm, referred to as SS-Mk, that
uses n-sorters as basic building blocks was proposed. In this work, we first
propose a new multiway merging algorithm with n-sorters as basic building
blocks that merges n sorted lists of m values each in 1 + ceil(m/2) stages (n
<= m). Based on our merging algorithm, we also propose a sorting algorithm,
which requires O(N log2 N) basic sorters to sort N inputs. While the asymptotic
complexity (in terms of the required number of sorters) of our sorting
algorithm is the same as the SS-Mk, for wide ranges of N, our algorithm
requires fewer sorters than the SS-Mk. Finally, we consider a binary sorting
network, where the basic sorter is implemented in threshold logic and scales
linearly with the number of inputs, and compare the complexity in terms of the
required number of gates. For wide ranges of N, our algorithm requires fewer
gates than the SS-Mk.Comment: 13 pages, 14 figure
Smooth heaps and a dual view of self-adjusting data structures
We present a new connection between self-adjusting binary search trees (BSTs)
and heaps, two fundamental, extensively studied, and practically relevant
families of data structures. Roughly speaking, we map an arbitrary heap
algorithm within a natural model, to a corresponding BST algorithm with the
same cost on a dual sequence of operations (i.e. the same sequence with the
roles of time and key-space switched). This is the first general transformation
between the two families of data structures.
There is a rich theory of dynamic optimality for BSTs (i.e. the theory of
competitiveness between BST algorithms). The lack of an analogous theory for
heaps has been noted in the literature. Through our connection, we transfer all
instance-specific lower bounds known for BSTs to a general model of heaps,
initiating a theory of dynamic optimality for heaps.
On the algorithmic side, we obtain a new, simple and efficient heap
algorithm, which we call the smooth heap. We show the smooth heap to be the
heap-counterpart of Greedy, the BST algorithm with the strongest proven and
conjectured properties from the literature, widely believed to be
instance-optimal. Assuming the optimality of Greedy, the smooth heap is also
optimal within our model of heap algorithms. As corollaries of results known
for Greedy, we obtain instance-specific upper bounds for the smooth heap, with
applications in adaptive sorting.
Intriguingly, the smooth heap, although derived from a non-practical BST
algorithm, is simple and easy to implement (e.g. it stores no auxiliary data
besides the keys and tree pointers). It can be seen as a variation on the
popular pairing heap data structure, extending it with a "power-of-two-choices"
type of heuristic.Comment: Presented at STOC 2018, light revision, additional figure
Deterministic 1-k routing on meshes with applications to worm-hole routing
In - routing each of the processing units of an mesh connected computer initially holds packet which must be routed such that any processor is the destination of at most packets. This problem reflects practical desire for routing better than the popular routing of permutations. - routing also has implications for hot-potato worm-hole routing, which is of great importance for real world systems. We present a near-optimal deterministic algorithm running in \sqrt{k} \cdot n / 2 + \go{n} steps. We give a second algorithm with slightly worse routing time but working queue size three. Applying this algorithm considerably reduces the routing time of hot-potato worm-hole routing. Non-trivial extensions are given to the general - routing problem and for routing on higher dimensional meshes. Finally we show that - routing can be performed in \go{k \cdot n} steps with working queue size four. Hereby the hot-potato worm-hole routing problem can be solved in \go{k^{3/2} \cdot n} steps
- …