The distributed ASCI supercomputer project

The Distributed ASCI Supercomputer (DAS) is a homogeneous wide-area distributed system consisting of four cluster computers at different locations. DAS has been used for research on communication software, parallel languages and programming systems, schedulers, parallel applications, and distributed applications. The paper gives a preview of the most interesting research results obtained so far in the DAS project.

VU Research Portal

Pure OAI Repository

International Migration, Integration and Social Cohesion online publications

High performance computer simulated bronchoscopy with interactive navigation.

Author: Ping-Fu Fung
Publication date: 01/01/1998

by Ping-Fu Fung. Thesis (M.Phil.), Chinese University of Hong Kong, 1998. Includes bibliographical references (leaves 98-102). Abstract also in Chinese.

Contents:
Abstract
Acknowledgements
Chapter 1: Introduction
  1.1 Medical Visualization System
    1.1.1 Data Acquisition
    1.1.2 Computer-aided Medical Visualization
    1.1.3 Existing Systems
  1.2 Research Goal
    1.2.1 System Architecture
  1.3 Organization of this Thesis
Chapter 2: Volume Visualization
  2.1 Sampling Grid and Volume Representation
  2.2 Priori Work in Volume Rendering
    2.2.1 Surface VS Direct
    2.2.2 Image-order VS Object-order
    2.2.3 Orthogonal VS Perspective
    2.2.4 Hardware Acceleration VS Software Acceleration
  2.3 Chapter Summary
Chapter 3: IsoRegion Leaping Technique for Perspective Volume Rendering
  3.1 Compositing Projection in Direct Volume Rendering
  3.2 IsoRegion Leaping Acceleration
    3.2.1 IsoRegion Definition
    3.2.2 IsoRegion Construction
    3.2.3 IsoRegion Step Table
    3.2.4 Ray Traversal Scheme
  3.3 Experiment Result
  3.4 Improvement
  3.5 Chapter Summary
Chapter 4: Parallel Volume Rendering by Distributed Processing
  4.1 Multi-platform Loosely-coupled Parallel Environment Shell
  4.2 Distributed Rendering Pipeline (DRP)
    4.2.1 Network Architecture of a Loosely-Coupled System
    4.2.2 Data and Task Partitioning
    4.2.3 Communication Pattern and Analysis
  4.3 Load Balancing
  4.4 Heterogeneous Rendering
  4.5 Chapter Summary
Chapter 5: User Interface
  5.1 System Design
  5.2 3D Pen Input Device
  5.3 Visualization Environment Integration
  5.4 User Interaction: Interactive Navigation
    5.4.1 Camera Model
    5.4.2 Zooming
    5.4.3 Image View
    5.4.4 User Control
  5.5 Chapter Summary
Chapter 6: Conclusion
  6.1 Final Summary
  6.2 Deficiency and Improvement
  6.3 Future Research Aspect
Appendix
  A: Common Error in Pre-multiplying Color and Opacity
  B: Binary Factorization of the Sample Composition Equation

CUHK Digital Repository

Parallel simulation techniques for telecommunication network modelling

Author: Hind Alan
Publication date: 01/01/1994

In this thesis, we consider the application of parallel simulation to the performance modelling of telecommunication networks. A largely automated approach was first explored using a parallelizing compiler to speed up the simulation of simple models of circuit-switched networks. This yielded reasonable results for relatively little effort compared with other approaches. However, more complex simulation models of packet- and cell-based telecommunication networks, requiring the use of discrete event techniques, need an alternative approach. A critical review of parallel discrete event simulation indicated that a distributed model components approach using conservative or optimistic synchronization would be worth exploring. Experiments were therefore conducted using simulation models of queuing networks and Asynchronous Transfer Mode (ATM) networks to explore the speed-up achievable with this approach. Specifically, it is shown that these techniques can be used successfully to speed up the execution of useful telecommunication network simulations. A detailed investigation has demonstrated that conservative synchronization performs very well for applications with good lookahead properties and sufficient message traffic density and, given such properties, will significantly outperform optimistic synchronization. Optimistic synchronization, however, gives reasonable speed-up for models with a wider range of such properties and can be optimized for speed-up and memory usage at run time. Thus, it is confirmed as being more generally applicable, particularly as model development is somewhat easier than for conservative synchronization. This has to be balanced against the more difficult task of developing and debugging an optimistic synchronization kernel and the application models.
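
To make the role of lookahead in conservative synchronization concrete, the following is a minimal sketch in the style of the Chandy-Misra-Bryant null-message scheme. This is an assumption for illustration: the abstract does not say which conservative algorithm the thesis uses, and the function names and data layout here are hypothetical. Each logical process keeps one clock per input channel (the timestamp of the last message received on it); events up to the smallest channel clock are safe to process, and the lookahead determines how far ahead the process can promise its neighbours that it will stay silent.

```python
def safe_time(channel_clocks):
    """Lower bound on the timestamp of any message that can still arrive.

    channel_clocks maps each input channel to the timestamp of the last
    message received on it; messages on one channel arrive in timestamp order.
    """
    return min(channel_clocks.values())


def processable(pending, channel_clocks):
    """Return the pending event timestamps that are safe to process, in order."""
    bound = safe_time(channel_clocks)
    return sorted(t for t in pending if t <= bound)


def null_message_time(channel_clocks, lookahead):
    """Timestamp to advertise to downstream processes in a null message.

    The process promises not to send anything earlier than this time.
    """
    return safe_time(channel_clocks) + lookahead
```

With channel clocks of 5.0 and 3.0, only events stamped at or before 3.0 may run. A larger lookahead lets neighbours advance further per null message, which is consistent with the abstract's observation that conservative synchronization depends on good lookahead properties.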

Durham e-Theses

Master/worker parallel discrete event simulation

Author: Park Alfred John
Publication venue: Georgia Institute of Technology
Publication date: 16/12/2008

The execution of parallel discrete event simulation across metacomputing infrastructures is examined. A master/worker architecture for parallel discrete event simulation is proposed providing robust executions under a dynamic set of services with system-level support for fault tolerance, semi-automated client-directed load balancing, portability across heterogeneous machines, and the ability to run codes on idle or time-sharing clients without significant interaction by users. Research questions and challenges concerning the limitations of the work distribution paradigm, the targeted computational domain, performance metrics, and the intended class of applications are analyzed and discussed. A portable web services approach to master/worker parallel discrete event simulation is proposed and evaluated, with subsequent optimizations to increase the efficiency of large-scale simulation execution through distributed master service design and intrinsic overhead reduction. New techniques for addressing challenges associated with optimistic parallel discrete event simulation across metacomputing, such as rollbacks and message unsending, with an inherently different computation paradigm utilizing master services and time windows, are proposed and examined. Results indicate that a master/worker approach utilizing loosely coupled resources is a viable means for high throughput parallel discrete event simulation by enhancing existing computational capacity or providing alternate execution capability for less time-critical codes.
Ph.D. Committee Chair: Fujimoto, Richard; Committee Member: Bader, David; Committee Member: Perumalla, Kalyan; Committee Member: Riley, George; Committee Member: Vuduc, Richar

Scholarly Materials And Research @ Georgia Tech

A Scalable Cluster-based Infrastructure for Edge-computing Services

Author: GRIECO R, MALANDRINO Delfina, SCARANO Vittorio
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2006

In this paper we present a scalable and dynamic intermediary infrastructure, SEcS (acronym of "Scalable Edge computing Services"), for developing and deploying advanced Edge computing services, by using a cluster of heterogeneous machines. Our goal is to address the challenges of the next-generation Internet services: scalability, high availability, fault-tolerance and robustness, as well as programmability and quick prototyping. The system is written in Java and is based on IBM’s Web Based Intermediaries (WBI) [71] developed at IBM Almaden Research Center.

Archivio della Ricerca - Università di Salerno

Demand-driven, concurrent discrete event simulation

Author: Smart Colin
Publication venue: The University of Edinburgh
Publication date: 01/01/2001

Edinburgh Research Archive

Proceedings of the 2002 Winter Simulation Conference

Author: Asif Paranjpe, C.-H. Chen, E. Yücesan, J. L. Snowdon, J. M. Charnes, Jai Thomas, Jayesh Todi
Publication date: 19/11/2007

This paper describes an extension to the existing BSP Time Warp dynamic load-balancing algorithm to allow the management of interruption from external workload. Experiments carried out on a manufacturing simulation model using different partition strategies, with and without interruption from external workload, show that significant performance improvement can be achieved with external workload management.

CiteSeerX

Foundations and Methods for GPU based Image Synthesis

Author: Widmer Sven
Publication date: 01/01/2018

Effects such as global illumination, caustics, defocus and motion blur are an integral part of generating images that are perceived as realistic pictures and cannot be distinguished from photographs. In general, two different approaches exist to render images: ray tracing and rasterization. Ray tracing is a widely used technique for production-quality rendering of images, where image quality and physical correctness are more important than the time needed for rendering. Generating these effects is a very compute- and memory-intensive process and can take minutes to hours for a single camera shot. Rasterization, on the other hand, is used to render images if real-time constraints have to be met (e.g. computer games). Often specialized algorithms are used to approximate these complex effects to achieve plausible results while sacrificing image quality for performance. This thesis is split into two parts. In the first part we look at algorithms and load-balancing schemes for general purpose computing on graphics processing units (GPUs). Most of the ray tracing related algorithms (e.g. KD-tree construction or bidirectional path tracing) have unpredictable memory requirements. Dynamic memory allocation on GPUs suffers from the global synchronization required to keep the state of current allocations. We present a method to reduce this overhead on massively parallel hardware architectures. In particular, we merge small parallel allocation requests from different threads that can occur while exploiting SIMD-style parallelism. We speed up the dynamic allocation using a set of constraints that can be applied to a large class of parallel algorithms. To achieve the image quality needed for feature films, GPU clusters are often used to cope with the amount of computation needed. We present a framework that employs a dynamic load balancing approach and applies fair scheduling to minimize the average execution time of spawned computational tasks. The load balancing capabilities are shown by handling irregular workloads: a bidirectional path tracer allowing renderings of complex effects at near interactive frame rates. In the second part of the thesis we try to reduce the image quality gap between production and real-time rendering. To that end, an adaptive acceleration structure for screen-space ray tracing is presented that represents the scene geometry by planar approximations. The benefit is a fast method to skip empty space and compute exact intersection points based on the planar approximation. This technique allows simulating complex phenomena including depth-of-field rendering and ray traced reflections at real-time frame rates. To handle motion blur in combination with transparent objects we present a unified rendering approach that decouples space and time sampling. Thereby, we can achieve interactive frame rates by reusing fragments during the sampling step. The scene geometry that is potentially visible at any point in time for the duration of a frame is rendered in a rasterization step and stored in temporally varying fragments. We perform spatial sampling to determine all temporally varying fragments that intersect with a specific viewing ray at any point in time. Viewing rays can be sampled according to the lens uv-sampling to incorporate depth-of-field. In a final temporal sampling step, we evaluate the pre-determined viewing ray/fragment intersections for one or multiple points in time. This allows incorporating standard shading effects, resulting in physically plausible motion and defocus blur for transparent and opaque objects.
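
The idea of merging small allocation requests from different threads can be illustrated with a short sketch. This is not the thesis's method, whose details and constraints the abstract does not give. It is a sequential Python stand-in, assuming a simple bump allocator and hypothetical names (`BumpAllocator`, `allocate_merged`). An exclusive prefix sum over the per-thread sizes turns many requests into a single update of the shared allocator state, which is the step that would need synchronization on a GPU.

```python
from itertools import accumulate


class BumpAllocator:
    """Toy allocator: one shared 'top' pointer over a fixed-size pool."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.top = 0
        self.sync_ops = 0  # number of updates to the shared state

    def allocate(self, size):
        """Reserve `size` units and return the base offset of the block."""
        if self.top + size > self.capacity:
            raise MemoryError("pool exhausted")
        base = self.top
        self.top += size
        self.sync_ops += 1  # would be an atomic operation on real hardware
        return base


def allocate_merged(allocator, sizes):
    """Serve one request per thread with a single shared-state update."""
    if not sizes:
        return []
    # Exclusive prefix sum: offset of each thread's block inside the group.
    offsets = list(accumulate(sizes, initial=0))[:-1]
    base = allocator.allocate(sum(sizes))
    return [base + off for off in offsets]
```

Serving three requests of sizes 3, 1 and 4 this way touches the shared pointer once instead of three times; on a GPU the prefix sum would typically be computed cooperatively within a thread group rather than in a loop.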

TUbiblio

tuprints