11,641 research outputs found

Executing matrix multiply on a process oriented data flow machine

Author: Bic Lubomir
Nagel Mark D.
Roy John M.A.
Publication venue: eScholarship, University of California
Publication date: 01/01/1990

The Process-Oriented Dataflow System (PODS) is an execution model that combines the von Neumann and dataflow models of computation to gain the benefits of each. Central to PODS is the concept of array distribution and its effects on partitioning and mapping of processes. In PODS, arrays are partitioned by simply assigning consecutive elements to each processing element (PE) equally. Since PODS uses single assignment, there will be only one producer of each element. This producing PE owns that element and will perform the necessary computations to assign it. Using this approach, the filling loop is distributed across the PEs. This simple partitioning and mapping scheme provides excellent results for executing scientific code on MIMD machines. In this way PODS allows MIMD machines to exploit vector and data parallelism easily while still providing the flexibility of MIMD over SIMD for multi-user systems. In this paper, the classic matrix multiply algorithm, with 1024 data points, is executed on a PODS simulator and the results are presented and discussed. Matrix multiply is a good example because it has several interesting properties: there are multiple code-blocks; a new array must be dynamically allocated and distributed; there is a loop-carried dependency in the innermost loop; the two input arrays have different access patterns; and the sizes of the input arrays are not known at compile time. Matrix multiply also forms the basis for many important scientific algorithms such as LU decomposition, convolution, and the Fast Fourier Transform. The results show that PODS is comparable to both Iannucci's Hybrid Architecture and MIT's TTDA in terms of overhead and instruction power. They also show that PODS easily distributes the work load evenly across the PEs. The key result is that PODS can scale matrix multiply in a near linear fashion until there is little or no work to be performed for each PE. Then overhead and message passing become a major component of the execution time. With larger problems (e.g., ≥16k data points) this limit would be reached at around 256 PEs.
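The partitioning rule described in this abstract (consecutive elements assigned equally to each PE, with the owning PE computing the elements it owns) can be illustrated with a small sequential sketch. This is not the PODS simulator or its code; the function names `owner` and `multiply_distributed`, the row-major flattening of the result array, and the ceiling-division block size are all assumptions made for illustration.

```python
def owner(index, n_elements, n_pes):
    """Return the PE owning flat element `index` under an equal block split.

    Assumption: blocks of consecutive elements, sized by ceiling division.
    """
    block = -(-n_elements // n_pes)  # ceiling division
    return index // block


def multiply_distributed(a, b, n_pes):
    """Compute c = a x b, letting each simulated PE fill only the elements it owns.

    Returns the result matrix and the number of elements each PE produced.
    The PEs are simulated one after another; nothing runs in parallel here.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    total = rows * cols
    c = [[None] * cols for _ in range(rows)]
    work = [0] * n_pes
    for pe in range(n_pes):
        for flat in range(total):
            if owner(flat, total, n_pes) != pe:
                continue  # another PE produces this element
            i, j = divmod(flat, cols)  # row-major flattening (assumed)
            acc = 0
            for k in range(inner):  # loop-carried dependency on acc
                acc += a[i][k] * b[k][j]
            c[i][j] = acc  # written exactly once (single assignment)
            work[pe] += 1
    return c, work
```

Calling `multiply_distributed(a, b, n_pes)` with increasing `n_pes` shows the per-PE element counts in `work` shrinking toward zero, which is the regime where the abstract reports that overhead and message passing start to dominate.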

eScholarship - University of California

Code Generation for Efficient Query Processing in Managed Runtimes

Author: Bierman Gavin M.
Nagel Fabian
Viglas Stratis D.
Publication date: 01/01/2014

In this paper we examine opportunities arising from the convergence of two trends in data management: in-memory database systems (IMDBs), which have received renewed attention following the availability of affordable, very large main memory systems; and language-integrated query, which transparently integrates database queries with programming languages (thus addressing the famous ‘impedance mismatch’ problem). Language-integrated query not only gives application developers a more convenient way to query external data sources like IMDBs, but also to use the same querying language to query an application’s in-memory collections. The latter offers further transparency to developers as the query language and all data is represented in the data model of the host programming language. However, compared to IMDBs, this additional freedom comes at a higher cost for query evaluation. Our vision is to improve in-memory query processing of application objects by introducing database technologies to managed runtimes. We focus on querying and we leverage query compilation to improve query processing on application objects. We explore different query compilation strategies and study how they improve the performance of query processing over application data. We take C# as the host programming language as it supports language-integrated query through the LINQ framework. Our techniques deliver significant performance improvements over the default LINQ implementation. Our work makes important first steps towards a future where data processing applications will commonly run on machines that can store their entire datasets in-memory, and will be written in a single programming language employing language-integrated query and IMDB-inspired runtimes to provide transparent and highly efficient querying.
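The general idea of query compilation mentioned in this abstract can be sketched by contrasting a query evaluated through generic, composable operators with the same query turned into a single specialized loop. The sketch below is in Python rather than C#, does not use LINQ, and is not one of the paper's compilation strategies; `run_interpreted`, `compile_query`, and the use of source-text fragments for the predicate and projection are all hypothetical choices made for illustration.

```python
def run_interpreted(rows, predicate, projection):
    """Evaluate a filter-then-project query through generic operators.

    Each row passes through separate filter and map stages, with one
    callback invocation per stage.
    """
    return list(map(projection, filter(predicate, rows)))


def compile_query(predicate_src, projection_src):
    """Generate and compile one fused loop for a filter-then-project query.

    `predicate_src` and `projection_src` are Python expressions over the
    variable `r` (an assumption of this sketch). They are inlined into the
    generated function, so no per-row callbacks remain.
    Only trusted expression strings should be passed, since exec runs them.
    """
    source = (
        "def query(rows):\n"
        "    out = []\n"
        "    for r in rows:\n"
        f"        if {predicate_src}:\n"
        f"            out.append({projection_src})\n"
        "    return out\n"
    )
    namespace = {}
    exec(source, namespace)  # compile the generated source once
    return namespace["query"]
```

Both paths should return the same rows for the same query; the compiled function is built once and can then be reused across many inputs. Whether it is actually faster depends on the runtime and the data, and this sketch makes no claim about that.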

CiteSeerX

Crossref

Edinburgh Research Explorer

Automatic data/program partitioning using the single assignment principle

Author: Bic Lubomir
Nagel Mark D.
Roy John M.A.
Publication venue: eScholarship, University of California
Publication date: 01/01/1989

Loosely-coupled MIMD architectures do not suffer from memory contention; hence large numbers of processors may be utilized. The main problem, however, is how to partition data and programs in order to exploit the available parallelism. In this paper we show that efficient schemes for automatic data/program partitioning and synchronization may be employed if single assignment is used. Using simulations of program loops common to scientific computations (the Livermore Loops), we demonstrate that only a small fraction of data accesses are remote and thus the degradation in network performance due to multiprocessing is minimal.

eScholarship - University of California

Fundamentals of Traffic Flow

Author: A. D. May
C. Wagner
D. Helbing
Dirk Helbing
K. Nagel
M. Y. Choi
R. Kühne
S. Yukawa
T. Musha
X. Zhang
Publication venue: American Physical Society (APS)
Publication date: 01/01/1997

From single vehicle data a number of new empirical results concerning the density-dependence of the velocity distribution and its moments as well as the characteristics of their temporal fluctuations have been determined. These are utilized for the specification of some fundamental relations of traffic flow and compared with existing traffic theories. Comment: For related work see http://www.theo2.physik.uni-stuttgart.de/helbing.htm

arXiv.org e-Print Archive

CiteSeerX

Crossref

Viscous to Inertial Crossover in Liquid Drop Coalescence

Author: Burton Justin C.
Nagel Sidney R.
Paulsen Joseph D.
Publication venue: American Physical Society (APS)
Publication date: 15/03/2011

Using an electrical method and high-speed imaging we probe drop coalescence down to 10 ns after the drops touch. By varying the liquid viscosity over two decades, we conclude that at sufficiently low approach velocity where deformation is not present, the drops coalesce with an unexpectedly late crossover time between a regime dominated by viscous and one dominated by inertial effects. We argue that the late crossover, not accounted for in the theory, can be explained by an appropriate choice of length-scales present in the flow geometry. Comment: 4 pages, 4 figures

arXiv.org e-Print Archive

Crossref

Simulation for human factors research. A central question: Fidelity

Author: Nagel D.
Publication date: 01/03/1985

Generalized outlines are presented for simulation in human factors research. Recent trends in aeronautical simulation are given. Some criteria for effective training devices are also given. Full system/full mission simulation in aviation and in space human factors research is presented.

NASA Technical Reports Server

Open boundary conditions in stochastic transport processes with pair-factorized steady states

Author: Janke Wolfhard
Labavic D.
Meyer-Ortmanns Hildegard
Nagel Hannes
Publication venue: Elsevier BV
Publication date: 31/12/2014

Using numerical methods we discuss the effects of open boundary conditions on condensation phenomena in the zero-range process (ZRP) and transport processes with pair-factorized steady states (PFSS), an extended model of the ZRP with nearest-neighbor interaction. For the zero-range process we compare to analytical results in the literature with respect to criticality and condensation. For the extended model we find a similar phase structure, but observe supercritical phases with droplet formation for strong boundary drives. Comment: conference contribution for the 27th Annual CSP Workshop on "Recent Developments in Computer Simulation Studies in Condensed Matter Physics", CSP 2014; 5 pages, 5 figures

arXiv.org e-Print Archive

Elsevier - Publisher Connector

Multiple transient memories in sheared suspensions: robustness, structure, and routes to plasticity

Author: Keim Nathan C.
Nagel Sidney R.
Paulsen Joseph D.
Publication venue: American Physical Society (APS)
Publication date: 01/01/2013

Multiple transient memories, originally discovered in charge-density-wave conductors, are a remarkable and initially counterintuitive example of how a system can store information about its driving. In this class of memories, a system can learn multiple driving inputs, nearly all of which are eventually forgotten despite their continual input. If sufficient noise is present, the system regains plasticity so that it can continue to learn new memories indefinitely. Recently, Keim & Nagel showed how multiple transient memories could be generalized to a generic driven disordered system with noise, giving as an example simulations of a simple model of a sheared non-Brownian suspension. Here, we further explore simulation models of suspensions under cyclic shear, focussing on three main themes: robustness, structure, and overdriving. We show that multiple transient memories are a robust feature independent of many details of the model. The steady-state spatial distribution of the particles is sensitive to the driving algorithm; nonetheless, the memory formation is independent of such a change in particle correlations. Finally, we demonstrate that overdriving provides another means for controlling memory formation and retention.

arXiv.org e-Print Archive

Crossref

Syracuse University Research Facility and Collaborative Environment

Two-lane traffic rules for cellular automata: A systematic approach

Author: B. Eisenblätter
B. Kerner
D. Chowdhury
D. Helbing
D. Ktitarev
Dietrich E. Wolf
J. Krug
J. Treiterer
K. Nagel
Kai Nagel
M. Bando
M. Cremer
M. Rickert
M. Sasvari
P. G. Gipps
P. Simon
P. Wagner
Patrice Simon
Peter Wagner
S. Krauss
T. Nagatani
Publication venue: American Physical Society (APS)
Publication date: 05/11/1997

Microscopic modeling of multi-lane traffic is usually done by applying heuristic lane changing rules, and often with unsatisfying results. Recently, a cellular automaton model for two-lane traffic was able to overcome some of these problems and to produce a correct density inversion at densities somewhat below the maximum flow density. In this paper, we summarize different approaches to lane changing and their results, and propose a general scheme, according to which realistic lane changing rules can be developed. We test this scheme by applying it to several different lane changing rules, which, in spite of their differences, generate similar and realistic results. We thus conclude that, for producing realistic results, the logical structure of the lane changing rules, as proposed here, is at least as important as the microscopic details of the rules.

arXiv.org e-Print Archive

Crossref

UNT Digital Library