Search CORE

33,061 research outputs found

Beyond Reuse Distance Analysis: Dynamic Analysis for Characterization of Data Locality Potential

Author: Elango Venmugil
Fauzia Naznin
Pouchet Louis-Noël
Ramanujam J.
Rastello Fabrice
Ravishankar Mahesh
Rountev Atanas
Sadayappan P.
Publication venue
Publication date: 01/12/2013
Field of study

Emerging computer architectures will feature drastically decreased flops/byte (ratio of peak processing rate to memory bandwidth) as highlighted by recent studies on Exascale architectural trends. Further, flops are getting cheaper while the energy cost of data movement is increasingly dominant. The understanding and characterization of data locality properties of computations is critical in order to guide efforts to enhance data locality. Reuse distance analysis of memory address traces is a valuable tool to perform data locality characterization of programs. A single reuse distance analysis can be used to estimate the number of cache misses in a fully associative LRU cache of any size, thereby providing estimates on the minimum bandwidth requirements at different levels of the memory hierarchy to avoid being bandwidth bound. However, such an analysis only holds for the particular execution order that produced the trace. It cannot estimate potential improvement in data locality through dependence preserving transformations that change the execution schedule of the operations in the computation. In this article, we develop a novel dynamic analysis approach to characterize the inherent locality properties of a computation and thereby assess the potential for data locality enhancement via dependence preserving transformations. The execution trace of a code is analyzed to extract a computational directed acyclic graph (CDAG) of the data dependences. The CDAG is then partitioned into convex subsets, and the convex partitioning is used to reorder the operations in the execution trace to enhance data locality. The approach enables us to go beyond reuse distance analysis of a single specific order of execution of the operations of a computation in characterization of its data locality properties. It can serve a valuable role in identifying promising code regions for manual transformation, as well as assessing the effectiveness of compiler transformations for data locality enhancement. We demonstrate the effectiveness of the approach using a number of benchmarks, including case studies where the potential shown by the analysis is exploited to achieve lower data movement costs and better performance.Comment: Transaction on Architecture and Code Optimization (2014

arXiv.org e-Print Archive

HAL-ENS-LYON

INRIA a CCSD electronic archive server

Hal-Diderot

Exploratory Analysis of Multivariate Data (Unsupervised Image Segmentation and Data Driven Linear and Nonlinear Decomposition)

Author: Hilger Klaus Baggesen
Publication venue
Publication date: 01/03/2002
Field of study

Online Research Database In Technology

Autonomous Tissue Scanning under Free-Form Motion for Intraoperative Tissue Characterisation

Author: Cartucho Joao
Giannarou Stamatia
Zhan Jian
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 22/05/2020
Field of study

In Minimally Invasive Surgery (MIS), tissue scanning with imaging probes is required for subsurface visualisation to characterise the state of the tissue. However, scanning of large tissue surfaces in the presence of deformation is a challenging task for the surgeon. Recently, robot-assisted local tissue scanning has been investigated for motion stabilisation of imaging probes to facilitate the capturing of good quality images and reduce the surgeon's cognitive load. Nonetheless, these approaches require the tissue surface to be static or deform with periodic motion. To eliminate these assumptions, we propose a visual servoing framework for autonomous tissue scanning, able to deal with free-form tissue deformation. The 3D structure of the surgical scene is recovered and a feature-based method is proposed to estimate the motion of the tissue in real-time. A desired scanning trajectory is manually defined on a reference frame and continuously updated using projective geometry to follow the tissue motion and control the movement of the robotic arm. The advantage of the proposed method is that it does not require the learning of the tissue motion prior to scanning and can deal with free-form deformation. We deployed this framework on the da Vinci surgical robot using the da Vinci Research Kit (dVRK) for Ultrasound tissue scanning. Since the framework does not rely on information from the Ultrasound data, it can be easily extended to other probe-based imaging modalities.Comment: 7 pages, 5 figures, ICRA 202

arXiv.org e-Print Archive

Crossref

Recommended from our members

Automation of Determination of Optimal Intra-Compute Node Parallelism

Author: Brown James C.
Gómez-Iglesias Antonio
Publication venue
Publication date: 01/01/2016
Field of study

Maximizing the productivity of modern multicore and manycore chips requires optimizing parallelism at the compute node level. This is, however, a complex multi-step process. It is an iterative method requiring determining optimal degrees of parallel scalability and optimizing memory access behavior. Further, there are multiple cases to be considered, programs which use only MPI or OpenMP and hybrid (MPI +OpenMP) programs. This paper presents a set of three coordinated workﬂows for determining the optimal parallelism at the program level for MPI programs and at the loop level for hybrid (MPI+OpenMP) cases. The paper also details mostly automated implementations of these workﬂows using the PerfExpert infrastructure. Finally the paper presents case studies demonstrating both the applicability and the effectiveness of optimizing parallelism at the compute node level. The results shown in the paper will provide valuable information to further advance in the full automation of the workﬂows. The software implementing the parallelism scalability optimization is open source and available for download.Texas Advanced Computing Center (TACC)Computer Science

Texas ScholarWorks

Intelligent design guidance

Author: Duffy Alex H. B.
Meehan Joanne
Whitfield Robert Ian
Wu Zhichao
Publication venue
Publication date: 01/01/2003
Field of study

This paper presents results from an investigation regarding the use of the Design Structure Matrix (DSM) as a means to guide a designer through the calculation of numerical relationships within the early design system Designer. Characteristics, relationships and goals are used within Designer to enable the evaluation and approximation of the design model and are represented within the system as a digraph. Despite being a useful representation of the interactions within the design model, the digraph does not aid the designer in identifying a sequence of activities that need to be performed in order to evaluate the model. The DSM system was used to represent the characteristics and the dependencies obtained through the relationships. The sequence of characteristics within the DSM was optimised and used to produce a design process to guide the designer in model evaluation. The objective of the optimisation was to minimise the amount of iteration within the design process. The process enabled a designer who is unfamiliar with the model to evaluate it and satisfy the design goals and requirements. Both the DSM system and the Designer system are generic in nature andmay be applied to any design problem

University of Strathclyde Institutional Repository

Dynamic selection and estimation of the digital predistorter parameters for power amplifier linearization

Author: Gilabert Pinal Pere Lluís
López Bueno David
Montoro López Gabriel
Pham Thi Quynh Anh
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 03/07/2019
Field of study

© © 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.This paper presents a new technique that dynamically estimates and updates the coefficients of a digital predistorter (DPD) for power amplifier (PA) linearization. The proposed technique is dynamic in the sense of estimating, at every iteration of the coefficient's update, only the minimum necessary parameters according to a criterion based on the residual estimation error. At the first step, the original basis functions defining the DPD in the forward path are orthonormalized for DPD adaptation in the feedback path by means of a precalculated principal component analysis (PCA) transformation. The robustness and reliability of the precalculated PCA transformation (i.e., PCA transformation matrix obtained off line and only once) is tested and verified. Then, at the second step, a properly modified partial least squares (PLS) method, named dynamic partial least squares (DPLS), is applied to obtain the minimum and most relevant transformed components required for updating the coefficients of the DPD linearizer. The combination of the PCA transformation with the DPLS extraction of components is equivalent to a canonical correlation analysis (CCA) updating solution, which is optimum in the sense of generating components with maximum correlation (instead of maximum covariance as in the case of the DPLS extraction alone). The proposed dynamic extraction technique is evaluated and compared in terms of computational cost and performance with the commonly used QR decomposition approach for solving the least squares (LS) problem. Experimental results show that the proposed method (i.e., combining PCA with DPLS) drastically reduces the amount of DPD coefficients to be estimated while maintaining the same linearization performance.Peer ReviewedPostprint (author's final draft

UPCommons. Portal del coneixement obert de la UPC