416 research outputs found

Selective Vectorization for Short-Vector Instructions

Authors: Amarasinghe Saman; Larsen Samuel; Rabbah Rodric
Publication date: 18/12/2009

Multimedia extensions are nearly ubiquitous in today's general-purpose processors. These extensions consist primarily of a set of short-vector instructions that apply the same opcode to a vector of operands. Vector instructions introduce a data-parallel component to processors that exploit instruction-level parallelism, and present an opportunity for increased performance. In fact, ignoring a processor's vector opcodes can leave a significant portion of the available resources unused. In order for software developers to find short-vector instructions generally useful, however, the compiler must target these extensions with complete transparency and consistent performance. This paper describes selective vectorization, a technique for balancing computation across a processor's scalar and vector units. Current approaches for targeting short-vector instructions directly adopt vectorizing technology first developed for supercomputers. Traditional vectorization, however, can lead to a performance degradation since it fails to account for a processor's scalar resources. We formulate selective vectorization in the context of software pipelining. Our approach creates software pipelines with shorter initiation intervals, and therefore, higher performance. A key aspect of selective vectorization is its ability to manage transfer of operands between vector and scalar instructions. Even when operand transfer is expensive, our technique is sufficiently sophisticated to achieve significant performance gains. We evaluate selective vectorization on a set of SPEC FP benchmarks. On a realistic VLIW processor model, the approach achieves whole-program speedups of up to 1.35x over existing approaches. For individual loops, it provides speedups of up to 1.75x.

DSpace@MIT

Compilation techniques for short-vector instructions

Author: Larsen Samuel (Samuel Barton), 1975-
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2006

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006. By Samuel Larsen. Includes bibliographical references (p. 127-133).

Multimedia extensions are nearly ubiquitous in today's general-purpose processors. These extensions consist primarily of a set of short-vector instructions that apply the same opcode to a vector of operands. This design introduces a data-parallel component to processors that exploit instruction-level parallelism, and presents an opportunity for increased performance. In fact, ignoring a processor's vector opcodes can leave a significant portion of the available resources unused. In order for software developers to find short-vector instructions generally useful, the compiler must target these extensions with complete transparency and consistent performance. This thesis develops compiler techniques to target short-vector instructions automatically and efficiently. One important aspect of compilation is the effective management of memory alignment. As with scalar loads and stores, vector references are typically more efficient when accessing aligned regions. In many cases, the compiler can glean no alignment information and must emit conservative code sequences. In response, I introduce a range of compiler techniques for detecting and enforcing aligned references. In my benchmark suite, the most practical method ensures alignment for roughly 75% of dynamic memory references. This thesis also introduces selective vectorization, a technique for balancing computation across a processor's scalar and vector resources. Current approaches for targeting short-vector instructions directly adopt vectorizing technology first developed for supercomputers. Traditional vectorization, however, can lead to a performance degradation since it fails to account for a processor's scalar execution resources. I formulate selective vectorization in the context of software pipelining. My approach creates software pipelines with shorter initiation intervals, and therefore, higher performance. In contrast to conventional methods, selective vectorization operates on a low-level intermediate representation. This technique allows the algorithm to accurately measure the performance trade-offs of code selection alternatives. A key aspect of selective vectorization is its ability to manage communication of operands between vector and scalar instructions. Even when operand transfer is expensive, the technique is sufficiently sophisticated to achieve significant performance gains. I evaluate selective vectorization on a set of SPEC FP benchmarks. On a realistic VLIW processor model, the approach achieves whole-program speedups of up to 1.35x over existing approaches. For individual loops, it provides speedups of up to 1.75x.

DSpace@MIT

Investigation on the Removal of Paraffin Wax Deposition by Magnetic Field

Author: Nurhakimah binti Mohd Aman
Publication venue: Universiti Teknologi Petronas
Publication date: 01/01/2009

Paraffin wax deposition creates problems for surface facilities. One solution is to apply magnetic field technology to reduce viscosity. The effects of a magnetic field on aggregates to control paraffin wax deposition have been reviewed in the technical literature. Chai Set Lee, in Use of Magnetic Field in Paraffin Wax Deposition Control for Surface Facilities (2008), reported an experimental study on the influence of a magnetic field on wax deposition removal and viscosity reduction. Chai Set Lee reported that the configuration in which the pipe is placed along the length of each magnet, with both South poles on top of the pipe and the North poles at the bottom, displayed the best magnetic field lines: all the lines were at a right angle to the direction of force, and the resulting cumulative force is larger and acts in the same direction. This configuration is used in the wax removal experiment with different numbers of magnets, crude oil flow rates, and temperatures. However, experts in scale and paraffin control from industry leaders such as Schlumberger and Nalco Chem believe that magnetic field treatment is new and still at the R&D stage, not yet proven in real applications. Thus, the objective of this exploratory study is to find a more scientific explanation of the effects of varying the number of magnets, and of applying different crude oil temperatures and flow rates, in paraffin wax deposition control. Within the scope of this work, the use of magnetic technology to control paraffin wax deposition is discussed based on preliminary laboratory and field results. The experimental results show that applying a magnetic field reduces paraffin wax deposition by 2%. The efficiency of the magnetic field can be improved by modifying the magnet strength, the crude oil flow rate, and the temperature. It is hoped that the results of this study will have a significant impact on the industry in solving problems in crude oil production and transportation. Further study of the effect of the magnetic field is proposed in order to produce a precise scientific explanation of the whole process.

UTPedia

Design of robust asynchronous reconfigurable controllers for parallel synchronization using embedded graphs

Author: Guido James Sebastian
Publication venue: Newcastle University
Publication date: 01/01/2015

PhD Thesis. This is a revised version received 24/5/16. The definitive version is the print copy in the Research Reserve Collection of the University Library.

Synchronization is a key System-on-Chip (SoC) design issue in modern technologies. As the number of operating points under consideration increases, specifications which are capable of altering key parameters such as the time available for synchronization and Mean Time Between Failures (MTBF) in response to input from the user/system become desirable. This thesis explores how a combination of parallelism and scheduling, referred to as wagging, can be utilized to construct schedulers for synchronizer designs which are capable of pooling the gain-bandwidth products of their composite devices, in order to satisfy this requirement. In this work, we explore the ways in which the areas of graph theory and reconfigurable hardware design can be applied to generate both combinational and sequential scheduler designs, which satisfy the behavior requirement above. Further to this point, this work illustrates that such a scheduler is primarily comprised of an interrupt subsystem, and a reconfigurable token ring. This thesis explores how both of these components can be controlled in absence of a clock signal, as well as the design challenges inherent to each part. The final noteworthy issue in this study is with regard to the flow control of data in a parallel synchronizer that incorporates a First-In First-Out (FIFO) buffer to decouple the reading and writing operations from each other. Such a structure incurs penalties if the data rates on both sides are not well matched. This work presents a method by which combinations of serial and parallel reading operations are used to minimize this mismatch.

Newcastle University eTheses

38th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science: FSTTCS 2018, December 11-13, 2018, Ahmedabad, India

Author: IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science 38. 2018 Ahmedabad
Publication venue: Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
Publication date: 01/12/2018

Digitale Bibliothek Thüringen

35th Symposium on Theoretical Aspects of Computer Science: STACS 2018, February 28-March 3, 2018, Caen, France

Author: STACS
Publication venue: Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
Publication date: 01/02/2018

Digitale Bibliothek Thüringen

Symmetry in Graph Theory

Author: Rodriguez Jose M.
Publication venue: MDPI AG
Publication date: 01/01/2019

This book contains the successful invited submissions to a Special Issue of Symmetry on the subject of "Graph Theory". Although symmetry has always played an important role in Graph Theory, in recent years, this role has increased significantly in several branches of this field, including but not limited to Gromov hyperbolic graphs, the metric dimension of graphs, domination theory, and topological indices. This Special Issue includes contributions addressing new results on these topics, both from a theoretical and an applied point of view.

Universidad Carlos III de Madrid e-Archivo

Directory of Open Access Books (DOAB)

Proactive-reactive, robust scheduling and capacity planning of deconstruction projects under uncertainty

Author: Volk Rebekka
Publication venue: KIT Scientific Publishing
Publication date: 30/07/2019

A project planning and decision support model is developed and applied to identify and reduce risk and uncertainty in deconstruction project planning. It allows building inventories to be calculated from sensor information and construction standards, and it computes robust project plans for different scenarios with multiple modes, constrained renewable resources, and locations. A reactive and flexible planning element is proposed for the case of schedule infeasibility during project execution.

Directory of Open Access Books (DOAB)

Pertanika Journal of Science & Technology

Author: Universiti Putra Malaysia Press
Publication venue: Universiti Putra Malaysia Press
Publication date: 01/01/2016

Universiti Putra Malaysia Institutional Repository