One Dimensional Optimization on Multiprocessor Systems
This paper presents a straightforward approach to determining how best to utilize an MIMD multiprocessor in the solution of one dimensional optimization problems involving continuous unimodal functions and nongradient search techniques. A methodology is presented which allows one to consider a variety of speedup functions which may occur in parallel function and systems evaluation. It is shown how the best of two parallel optimization strategies can be determined for a given accuracy, number of processors and speedup function
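The trade-off described in this abstract can be illustrated with a minimal sketch. It is not the paper's methodology: it assumes a simple equal-interval strategy in which `p` processors each evaluate one interior point per round, and it ignores the speedup function entirely. The names `parallel_section_search` and `golden_section_reductions` are hypothetical.

```python
import math

def parallel_section_search(f, a, b, p, tol):
    """Bracket the minimum of a unimodal f on [a, b] using p evaluations per round."""
    if p < 2:
        raise ValueError("need at least two evaluation points per round")
    rounds = 0
    while b - a > tol:
        h = (b - a) / (p + 1)
        xs = [a + h * (i + 1) for i in range(p)]   # p equally spaced interior points
        ys = [f(x) for x in xs]                    # on an MIMD machine: one point per processor
        k = min(range(p), key=ys.__getitem__)      # index of the best sample
        a, b = xs[k] - h, xs[k] + h                # unimodality keeps the minimum in this bracket
        rounds += 1
    return (a + b) / 2, rounds

def golden_section_reductions(length, tol):
    """Sequential interval reductions golden-section search needs to reach tol."""
    ratio = (math.sqrt(5) - 1) / 2                 # fraction of the interval kept per evaluation
    return max(0, math.ceil(math.log(tol / length) / math.log(ratio)))
```

Each parallel round shrinks the bracket by a factor of 2/(p+1), so with p = 3 the bracket halves per round, whereas the sequential golden-section search keeps about 0.618 of the interval per evaluation. Which strategy wins for a given accuracy depends on how much a parallel round actually costs, which is the role the speedup function plays in the abstract.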
Energy-Efficient Scheduling for Homogeneous Multiprocessor Systems
We present a number of novel algorithms, based on mathematical optimization
formulations, in order to solve a homogeneous multiprocessor scheduling
problem, while minimizing the total energy consumption. In particular, for a
system with a discrete speed set, we propose solving a tractable linear
program. Our formulations are based on a fluid model and a global scheduling
scheme, i.e. tasks are allowed to migrate between processors. The new methods
are compared with three global energy/feasibility optimal workload allocation
formulations. Simulation results illustrate that our methods achieve both
feasibility and energy optimality and outperform existing methods for
constrained deadline tasksets. Specifically, the results provided by our
algorithm can achieve up to an 80% saving compared to an algorithm without a
frequency scaling scheme and up to 70% saving compared to a constant frequency
scaling scheme for some simulated tasksets. Another benefit is that our
algorithms can solve the scheduling problem in one step instead of using a
recursive scheme. Moreover, our formulations can solve a more general class of
scheduling problems, i.e. any periodic real-time taskset with arbitrary
deadline. Lastly, our algorithms can be applied to both online and offline
scheduling schemes.
Comment: Corrected typos: definition of J_i in Section 2.1; (3b)-(3c);
definition of \Phi_A and \Phi_D in paragraph after (6b). Previous equations
were correct only for special case of p_i=d_
Radiation safety based on the sky shine effect in reactor
During reactor operation, neutrons and gamma rays are the dominant forms of radiation.
For protection, lead and concrete shields are built around the reactor. However, radiation
can still penetrate the water shielding inside the reactor pool. This leads to
sky shine, a physical phenomenon in which radiation from a nuclear source is transmitted
panoramically and extends into the environment. The effect of this phenomenon is caused by
radiation falling out into the surrounding area, which increases the radiation dose. High
doses of exposure can cause stochastic or deterministic effects in a person. Therefore,
this study was conducted to measure the radiation dose from the sky shine effect scattered
around the reactor at different distances and different heights above the reactor platform. In this
paper, the radiation dose from the sky shine effect was measured using the
experimental metho
Performance Analysis of a Novel GPU Computation-to-core Mapping Scheme for Robust Facet Image Modeling
Though the GPGPU concept is well-known
in image processing, much more work remains to be done
to fully exploit GPUs as an alternative computation
engine. This paper investigates the computation-to-core
mapping strategies to probe the efficiency and scalability
of the robust facet image modeling algorithm on GPUs.
Our fine-grained computation-to-core mapping scheme
shows a significant performance gain over the standard
pixel-wise mapping scheme. With in-depth performance
comparisons across the two different mapping schemes,
we analyze the impact of the level of parallelism on
the GPU computation and suggest two principles for
optimizing future image processing applications on the
GPU platform
Optimization of Discrete-parameter Multiprocessor Systems using a Novel Ergodic Interpolation Technique
Modern multi-core systems have a large number of design parameters, most of
which are discrete-valued, and this number is likely to keep increasing as chip
complexity rises. Further, the accurate evaluation of a potential design choice
is computationally expensive because it requires detailed cycle-accurate system
simulation. If the discrete parameter space can be embedded into a larger
continuous parameter space, then continuous space techniques can, in principle,
be applied to the system optimization problem. Such continuous space techniques
often scale well with the number of parameters.
We propose a novel technique for embedding the discrete parameter space into
an extended continuous space so that continuous space techniques can be applied
to the embedded problem using cycle accurate simulation for evaluating the
objective function. This embedding is implemented using simulation-based
ergodic interpolation, which, unlike spatial interpolation, produces the
interpolated value within a single simulation run irrespective of the number of
parameters. We have implemented this interpolation scheme in a cycle-based
system simulator. In a characterization study, we observe that the interpolated
performance curves are continuous, piece-wise smooth, and have low statistical
error. We use the ergodic interpolation-based approach to solve a large
multi-core design optimization problem with 31 design parameters. Our results
indicate that continuous space optimization using ergodic interpolation-based
embedding can be a viable approach for large multi-core design optimization
problems.
Comment: A short version of this paper will be published in the proceedings of
IEEE MASCOTS 2015 conferenc
Redundancy management for efficient fault recovery in NASA's distributed computing system
The management of redundancy in computer systems was studied and guidelines were provided for the development of NASA's fault-tolerant distributed systems. Fault recovery and reconfiguration mechanisms were examined. A theoretical foundation was laid for redundancy management by efficient reconfiguration methods and algorithmic diversity. Algorithms were developed to optimize the resources for embedding of computational graphs of tasks in the system architecture and reconfiguration of these tasks after a failure has occurred. The computational structure represented by a path and the complete binary tree was considered and the mesh and hypercube architectures were targeted for their embeddings. The innovative concept of Hybrid Algorithm Technique was introduced. This new technique provides a mechanism for obtaining fault tolerance while exhibiting improved performance
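As an illustration of what embedding a path into a hypercube means, the sketch below uses the classical reflected Gray code, which places consecutive path nodes on neighbouring hypercube nodes. This is a textbook construction and an assumption here, not necessarily the algorithm developed in the report; it does not model reconfiguration after a failure. The function names are hypothetical.

```python
def gray_code_path(dim):
    """Map path node i to hypercube node i ^ (i >> 1) (reflected Gray code)."""
    return [i ^ (i >> 1) for i in range(1 << dim)]

def is_dilation_one(labels):
    """True when consecutive path nodes land on hypercube neighbours (labels differ in one bit)."""
    return all(bin(a ^ b).count("1") == 1 for a, b in zip(labels, labels[1:]))

labels = gray_code_path(3)
print(labels)                   # hypercube node assigned to each path node
print(is_dilation_one(labels))  # every path edge maps onto a single hypercube link
```

Because every path edge maps onto one hypercube link, neighbouring tasks in the path communicate over a single hop.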
Fine-sorting One-dimensional Particle-In-Cell Algorithm with Monte-Carlo Collisions on a Graphics Processing Unit
Particle-in-cell (PIC) simulations with Monte-Carlo collisions are used in
plasma science to explore a variety of kinetic effects. One major problem is
the long run-time of such simulations. Even on modern computer systems, PIC
codes take a considerable amount of time for convergence. Most of the
computations can be massively parallelized, since particles behave
independently of each other within one time step. Current graphics processing
units (GPUs) offer an attractive means for execution of the parallelized code.
In this contribution we show a one-dimensional PIC code running on Nvidia GPUs
using the CUDA environment. A distinctive feature of the code is that the size of
the cells that the code uses to sort the particles with respect to their
coordinates is comparable to the size of the grid cells used for discretization of
the electric field. Hence, we call the corresponding algorithm "fine-sorting".
Implementation details and optimization of the code are discussed and the
speed-up compared to classical CPU approaches is computed
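The idea of sorting particles into cells by coordinate can be sketched on the CPU with a counting sort. This is a plain Python illustration, not the paper's CUDA implementation; it assumes the sorting cells coincide with the field grid cells and that all positions lie inside the domain. The name `sort_particles_by_cell` and its parameters are hypothetical.

```python
def sort_particles_by_cell(positions, x_min, dx, n_cells):
    """Reorder particle positions so that particles in the same cell are contiguous."""
    counts = [0] * n_cells
    cells = []
    for x in positions:
        c = min(int((x - x_min) / dx), n_cells - 1)  # cell index, clamped at the right edge
        cells.append(c)
        counts[c] += 1

    # Exclusive prefix sum: offsets[c] is where cell c starts in the sorted array.
    offsets = [0] * (n_cells + 1)
    for c in range(n_cells):
        offsets[c + 1] = offsets[c] + counts[c]

    sorted_positions = [0.0] * len(positions)
    cursor = offsets[:-1]                            # next free slot per cell (slice is a copy)
    for x, c in zip(positions, cells):
        sorted_positions[cursor[c]] = x
        cursor[c] += 1
    return sorted_positions, offsets
```

After sorting, the particles of cell `c` occupy `sorted_positions[offsets[c]:offsets[c + 1]]`, so per-cell work such as charge deposition or collision pairing can read a contiguous slice.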
Modeling and synthesis of multicomputer interconnection networks
The type of interconnection network employed has a profound effect on the performance of multicomputer and multiprocessor designs. Adequate models are needed to aid in the design and development of interconnection networks. A novel modeling approach using statistical and optimization techniques is described. This method represents an attempt to compare diverse interconnection network designs in a way that not only identifies the best of the existing designs but also suggests other, perhaps hybrid, networks that may offer better performance. Stepwise linear regression is used to develop a polynomial surface representation of performance in a (k+1)-dimensional space with a total of k quantitative and qualitative independent variables describing graph-theoretic characteristics such as size, average degree, diameter, radius, girth, node-connectivity, edge-connectivity, minimum dominating set size, and maximum number of prime node and edge cutsets. The dependent variables used to measure performance are average message delay and the ratio of message completion rate to network connection cost. Response Surface Methodology (RSM) optimizes a response variable from a polynomial function of several independent variables. A steepest ascent path may also be used to approach optimum points
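The steepest ascent step mentioned in this abstract can be sketched on a toy response surface. The quadratic below and its coefficients are entirely hypothetical stand-ins for a fitted polynomial; they are not taken from the paper, and the regression step that would produce such a surface is not shown.

```python
def response(x1, x2):
    """Hypothetical fitted second-order response surface in two variables."""
    return 10 + 2 * x1 + 3 * x2 - x1 ** 2 - 0.5 * x2 ** 2

def gradient(x1, x2):
    """Analytic gradient of the hypothetical surface above."""
    return 2 - 2 * x1, 3 - x2

def steepest_ascent(x1, x2, step=0.1, iters=500):
    """Follow the gradient uphill in fixed-size steps from the starting point."""
    for _ in range(iters):
        g1, g2 = gradient(x1, x2)
        x1 += step * g1
        x2 += step * g2
    return x1, x2

best = steepest_ascent(0.0, 0.0)
print(best, response(*best))
```

For this particular surface the path converges to the stationary point where both gradient components vanish, x1 = 1 and x2 = 3. On a real fitted surface the step size and stopping rule would need tuning.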