
348 research outputs found

Simultaneous Multithreading Applied to Real Time (Artifact)

Author: Anderson James H.
Bakita Joshua J.
Osborne Sims Hill
Publication venue: DARTS - Dagstuhl Artifacts Series. Special Issue of the 31st Euromicro Conference on Real-Time Systems (ECRTS 2019)
Publication date: 01/01/2019

Existing models used in real-time scheduling are inadequate to take advantage of simultaneous multithreading (SMT), which has been shown to improve performance in many areas of computing, but has seen little application to real-time systems. The SMART task model, which allows for combining SMT and real time by accounting for the variable task execution costs caused by SMT, is introduced, along with methods and conditions for scheduling SMT tasks under global earliest-deadline-first scheduling. The benefits of using SMT are demonstrated through a large-scale schedulability study in which we show that task systems with utilizations 30% larger than what would be schedulable without SMT can be correctly scheduled. This artifact includes benchmark experiments used to compare execution times with and without SMT and code to duplicate the reported schedulability experiments.

Dagstuhl Research Online Publication Server

Simultaneous Multithreading and Hard Real Time: Can it be Safe? (Artifact)

Author: Anderson James H.
Bakita Joshua J.
Osborne Sims Hill
Publication venue: DARTS - Dagstuhl Artifacts Series. DARTS, Volume 6, Issue 1, Special Issue of the 32nd Euromicro Conference on Real-Time Systems (ECRTS 2020)
Publication date: 01/01/2020

Dagstuhl Research Online Publication Server

Simultaneous Multithreading Applied to Real Time

Author: Anderson James H.
Bakita Joshua J.
Osborne Sims Hill
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 31st Euromicro Conference on Real-Time Systems (ECRTS 2019)
Publication date: 01/01/2019

Dagstuhl Research Online Publication Server

Simultaneous Multithreading and Hard Real Time: Can It Be Safe?

Author: Anderson James H.
Osborne Sims Hill
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 32nd Euromicro Conference on Real-Time Systems (ECRTS 2020)
Publication date: 01/01/2020

The applicability of Simultaneous Multithreading (SMT) to real-time systems has been hampered by the difficulty of obtaining reliable execution costs in an SMT-enabled system. This problem is addressed by introducing a scheduling framework, called CERT-MT, that combines scheduling-aware timing analysis with a cyclic-executive scheduler in a way that minimizes SMT-related timing variations. The proposed scheduling-aware timing analysis is based on maximum observed execution times and accounts for the uncertainty inherent in measurement-based timing analysis. The timing analysis is found to work for tasks with and without SMT, though some adjustments are required in the former case. A large-scale schedulability study is presented that shows CERT-MT can schedule systems with total utilizations approaching 1.4 times the core count, without sacrificing safety.

Dagstuhl Research Online Publication Server

Spatio-temporal wavelet regularization for parallel MRI reconstruction: application to functional MRI

Author: Chaari Lotfi
Ciuciu Philippe
Mériaux Sébastien
Pesquet Jean-Christophe
Publication date: 03/10/2013

Parallel MRI is a fast imaging technique that enables the acquisition of highly resolved images in space and/or in time. The performance of parallel imaging strongly depends on the reconstruction algorithm, which can proceed either in the original k-space (GRAPPA, SMASH) or in the image domain (SENSE-like methods). To improve the performance of the widely used SENSE algorithm, 2D or slice-specific regularization in the wavelet domain has been investigated in depth. In this paper, we extend this approach using 3D wavelet representations in order to handle all slices together and address reconstruction artifacts that propagate across adjacent slices. The gain induced by this extension (3D-Unconstrained Wavelet Regularized SENSE: 3D-UWR-SENSE) is validated on anatomical image reconstruction where no temporal acquisition is considered. Another important extension accounts for temporal correlations that exist between successive scans in functional MRI (fMRI). In addition to the case of 2D+t acquisition schemes addressed by some other methods like kt-FOCUSS, our approach allows us to deal with 3D+t acquisition schemes, which are widely used in neuroimaging. The resulting 3D-UWR-SENSE and 4D-UWR-SENSE reconstruction schemes are fully unsupervised in the sense that all regularization parameters are estimated in the maximum likelihood sense on a reference scan. The gain induced by such extensions is illustrated on both anatomical and functional image reconstruction, and also measured in terms of statistical sensitivity for the 4D-UWR-SENSE approach during a fast event-related fMRI protocol. Our 4D-UWR-SENSE algorithm outperforms the SENSE reconstruction at the subject and group levels (15 subjects) for different contrasts of interest (e.g., motor or computation tasks) and using different parallel acceleration factors (R=2 and R=4) on 2x2x3 mm³ EPI images. Comment: arXiv admin note: substantial text overlap with arXiv:1103.353

arXiv.org e-Print Archive

Scientific Publications of the University of Toulouse II Le Mirail

INRIA a CCSD electronic archive server

Open Archive Toulouse Archive Ouverte

HAL-CEA

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

Automated Experiments for Deriving Performance-relevant Properties of Software Execution Environments

Author: Hauck Michael Alexander
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2013

The execution environment can play a crucial role when analyzing the performance of a software system. However, detecting execution environment properties and integrating such properties into performance analyses is a manual, error-prone task. In this thesis, a novel approach for detecting performance-relevant properties of the software execution environment is presented. These properties are automatically detected using predefined experiments and integrated into performance prediction tools.

KITopen

DSPSR: Digital Signal Processing Software for Pulsar Astronomy

Author: Bailes
Born
Bracewell
Demorest
Edwards
Hankins
Karuppusamy
M. Bailes
Manchester
Press
van Straten
W. van Straten
Walker
Wietfeldt
Publication venue: 'CSIRO Publishing'
Publication date: 24/08/2010

DSPSR is a high-performance, open-source, object-oriented, digital signal processing software library and application suite for use in radio pulsar astronomy. Written primarily in C++, the library implements an extensive range of modular algorithms that can optionally exploit both multiple-core processors and general-purpose graphics processing units. After over a decade of research and development, DSPSR is now stable and in widespread use in the community. This paper presents a detailed description of its functionality, justification of major design decisions, analysis of phase-coherent dispersion removal algorithms, and demonstration of performance on some contemporary microprocessor architectures. Comment: 15 pages, 10 figures, to be published in PAS

arXiv.org e-Print Archive

Crossref

Swinburne Research Bank

Design and validation of a simultaneous multi-threaded DLX processor

Author: Jacobson Hans
Publication venue: University of Utah
Publication date: 01/01/1999

Technical report. Modern-day computer systems rely on two forms of parallelism to achieve high performance: parallelism between individual instructions of a program (ILP) and parallelism between individual threads (TLP). Superscalar processors exploit ILP by issuing several instructions per clock, and multiprocessors (MP) exploit TLP by running different threads in parallel on different processors. A fundamental limitation of these approaches to exploiting parallelism is that processor resources are statically partitioned. If TLP is low, processors in an MP system will be idle, and if ILP is low, issue slots in a superscalar processor will be wasted. As a consequence, the hardware cannot adapt to changing levels of ILP and TLP, and resource utilization tends to be low. Since resource utilization is low, there is potential to achieve higher performance if useful instructions can be found to fill the wasted issue slots. This paper explores a method called simultaneous multithreading (SMT) that addresses the utilization problem by letting multiple threads compete for the resources of a single processor each clock cycle, thus increasing the potential ILP available.

The University of Utah: J. Willard Marriott Digital Library

An application of parallel computation to Collaborative Optimization

Author: Nayyer Shahab
Publication venue: LSU Digital Commons
Publication date: 01/01/2005

Multidisciplinary Design Optimization (MDO) has evolved as a discipline that provides a body of methods and techniques to assist engineers in solving large-scale design problems. There are many frameworks for formulating MDO problems. These frameworks can be broadly classified as single-level or bi-level formulations. Collaborative Optimization (CO) is one of the popular bi-level formulations for solving an MDO problem. Many design optimization problems are highly CPU-time intensive and require long simulation times. With the advent of cheaper and faster PCs, distributed parallel computer clusters have become very popular. These clusters provide large computing power and can be used to solve problems faster and more efficiently. This research is an attempt to take advantage of the computational power of parallel computers in the field of design optimization. The robust design optimization of an internal combustion engine has been formulated using CO and implemented on parallel computers. Considerable savings in wall time have been achieved. A generic strategy for solving similar problems has also been devised. A benchmarking program has also been developed to assess theoretical speedup for any problem size. This program uses the Collaborative Optimization framework and simulates a design optimization on distributed-memory clusters.

Louisiana State University