47,430 research outputs found

    Loop pipelining with resource and timing constraints

    Get PDF
    Developing efficient programs for many of the current parallel computers is not easy due to the architectural complexity of those machines. The wide variety of machine organizations often makes it more difficult to port an existing program than to reprogram it completely. Therefore, powerful translators are necessary to generate effective code and free the programmer from concerns about the specific characteristics of the target machine. This work focuses on techniques to be used by an important class of translators, whose objective is to transform sequential programs into equivalent more parallel programs. The transformations are performed at instruction level in order to exploit low level parallelism and increase memory locality.Most of the current applications are programmed in languages which do not allow us to express parallelism between high-level sentences (as Pascal, C or Fortran). Furthermore, a lot of applications written ten or more years ago are still used today, and it is not feasible to rewrite such applications for many reasons (not only technical reasons, but also economic ones). Translators enable programmers to write the application in a familiar sequential programming language, without concerning their selves with the architecture of the target machine. Current compilers for parallel architectures not only translate a program written on a high-level language to the appropriate machine language, but also perform some transformations in the final code in order to execute the program in a more parallel way. The transformations improve the performance in the execution of the program by making use of the knowledge that the compiler has about the machine architecture. The semantics of the program remain intact after any transformation.Experiments show that limiting parallelization to basic blocks not included in loops limits maximum speedup. This is because loops often comprise a large portion of the parallelism available to be exploited in a program. For this reason, a lot of effort has been devoted in the recent years to parallelize loop execution. Several parallel computer architectures and compilation techniques have been proposed to exploit such a parallelism at different granularities. Multiprocessors exploit coarse grained parallelism by distributing entire loop iterations to different processors. Systems oriented to the high-level synthesis (HLS) of VLSI circuits, superscalar processors and very long instruction word (VLIW) processors exploit fine-grained parallelism at instruction level. This work addresses fine-grained parallelization of loops addressed to the HLS of VLSI circuits. Two algorithms are proposed for resource constraints and for timing constraints. An algorithm to reduce the number of registers required to execute a loop in a given architecture is also proposed.Postprint (published version

    Incremental Distance Transforms (IDT)

    Get PDF
    A new generic scheme for incremental implementations of distance transforms (DT) is presented: Incremental Distance Transforms (IDT). This scheme is applied on the cityblock, Chamfer, and three recent exact Euclidean DT (E2DT). A benchmark shows that for all five DT, the incremental implementation results in a significant speedup: 3.4×−10×. However, significant differences (i.e., up to 12.5×) among the DT remain present. The FEED transform, one of the recent E2DT, even showed to be faster than both city-block and Chamfer DT. So, through a very efficient incremental processing scheme for DT, a relief is found for E2DT’s computational burden

    Partition strategies for incremental Mini-Bucket

    Get PDF
    Los modelos en grafo probabilísticos, tales como los campos aleatorios de Markov y las redes bayesianas, ofrecen poderosos marcos de trabajo para la representación de conocimiento y el razonamiento en modelos con gran número de variables. Sin embargo, los problemas de inferencia exacta en modelos de grafos son NP-hard en general, lo que ha causado que se produzca bastante interés en métodos de inferencia aproximados. El mini-bucket incremental es un marco de trabajo para inferencia aproximada que produce como resultado límites aproximados inferior y superior de la función de partición exacta, a base de -empezando a partir de un modelo con todos los constraints relajados, es decir, con las regiones más pequeñas posibleincrementalmente añadir regiones más grandes a la aproximación. Los métodos de inferencia aproximada que existen actualmente producen límites superiores ajustados de la función de partición, pero los límites inferiores suelen ser demasiado imprecisos o incluso triviales. El objetivo de este proyecto es investigar estrategias de partición que mejoren los límites inferiores obtenidos con el algoritmo de mini-bucket, trabajando dentro del marco de trabajo de mini-bucket incremental. Empezamos a partir de la idea de que creemos que debería ser beneficioso razonar conjuntamente con las variables de un modelo que tienen una alta correlación, y desarrollamos una estrategia para la selección de regiones basada en esa idea. Posteriormente, implementamos nuestra estrategia y exploramos formas de mejorarla, y finalmente medimos los resultados obtenidos usando nuestra estrategia y los comparamos con varios métodos de referencia. Nuestros resultados indican que nuestra estrategia obtiene límites inferiores más ajustados que nuestros dos métodos de referencia. También consideramos y descartamos dos posibles hipótesis que podrían explicar esta mejora.Els models en graf probabilístics, com bé els camps aleatoris de Markov i les xarxes bayesianes, ofereixen poderosos marcs de treball per la representació del coneixement i el raonament en models amb grans quantitats de variables. Tanmateix, els problemes d’inferència exacta en models de grafs son NP-hard en general, el qual ha provocat que es produeixi bastant d’interès en mètodes d’inferència aproximats. El mini-bucket incremental es un marc de treball per a l’inferència aproximada que produeix com a resultat límits aproximats inferior i superior de la funció de partició exacta que funciona començant a partir d’un model al qual se li han relaxat tots els constraints -és a dir, un model amb les regions més petites possibles- i anar afegint a l’aproximació regions incrementalment més grans. Els mètodes d’inferència aproximada que existeixen actualment produeixen límits superiors ajustats de la funció de partició. Tanmateix, els límits inferiors acostumen a ser massa imprecisos o fins aviat trivials. El objectiu d’aquest projecte es recercar estratègies de partició que millorin els límits inferiors obtinguts amb l’algorisme de mini-bucket, treballant dins del marc de treball del mini-bucket incremental. La nostra idea de partida pel projecte es que creiem que hauria de ser beneficiós per la qualitat de l’aproximació raonar conjuntament amb les variables del model que tenen una alta correlació entre elles, i desenvolupem una estratègia per a la selecció de regions basada en aquesta idea. Posteriorment, implementem la nostra estratègia i explorem formes de millorar-la, i finalment mesurem els resultats obtinguts amb la nostra estratègia i els comparem a diversos mètodes de referència. Els nostres resultats indiquen que la nostra estratègia obté límits inferiors més ajustats que els nostres dos mètodes de referència. També considerem i descartem dues possibles hipòtesis que podrien explicar aquesta millora.Probabilistic graphical models such as Markov random fields and Bayesian networks provide powerful frameworks for knowledge representation and reasoning over models with large numbers of variables. Unfortunately, exact inference problems on graphical models are generally NP-hard, which has led to signifi- cant interest in approximate inference algorithms. Incremental mini-bucket is a framework for approximate inference that provides upper and lower bounds on the exact partition function by, starting from a model with completely relaxed constraints, i.e. with the smallest possible regions, incrementally adding larger regions to the approximation. Current approximate inference algorithms provide tight upper bounds on the exact partition function but loose or trivial lower bounds. This project focuses on researching partitioning strategies that improve the lower bounds obtained with mini-bucket elimination, working within the framework of incremental mini-bucket. We start from the idea that variables that are highly correlated should be reasoned about together, and we develop a strategy for region selection based on that idea. We implement the strategy and explore ways to improve it, and finally we measure the results obtained using the strategy and compare them to several baselines. We find that our strategy performs better than both of our baselines. We also rule out several possible explanations for the improvement

    Fast Algorithms for the Computation of Ranklets

    Get PDF
    Ranklets are orientation selective rank features with applications to tracking, face detection, texture and medical imaging. We introduce efficient algorithms that reduce their computational complexity from O(N logN) to O(!N + k), where N is the area of the filter. Timing tests show a speedup of one order of magnitude for typical usage, which should make Ranklets attractive for real-time applications

    Global Trajectory Optimisation : Can We Prune the Solution Space When Considering Deep Space Manoeuvres? [Final Report]

    Get PDF
    This document contains a report on the work done under the ESA/Ariadna study 06/4101 on the global optimization of space trajectories with multiple gravity assist (GA) and deep space manoeuvres (DSM). The study was performed by a joint team of scientists from the University of Reading and the University of Glasgow

    Metrological characterization of a vision-based system for relative pose measurements with fiducial marker mapping for spacecrafts

    Get PDF
    An improved approach for the measurement of the relative pose between a target and a chaser spacecraft is presented. The selected method is based on a single camera, which can be mounted on the chaser, and a plurality of fiducial markers, which can be mounted on the external surface of the target. The measurement procedure comprises of a closed-form solution of the Perspective from n Points (PnP) problem, a RANdom SAmple Consensus (RANSAC) procedure, a non-linear local optimization and a global Bundle Adjustment refinement of the marker map and relative poses. A metrological characterization of the measurement system is performed using an experimental set-up that can impose rotations combined with a linear translation and can measure them. The rotation and position measurement errors are calculated with reference instrumentations and their uncertainties are evaluated by the Monte Carlo method. The experimental laboratory tests highlight the significant improvements provided by the Bundle Adjustment refinement. Moreover, a set of possible influencing physical parameters are defined and their correlations with the rotation and position errors and uncertainties are analyzed. Using both numerical quantitative correlation coefficients and qualitative graphical representations, the most significant parameters for the final measurement errors and uncertainties are determined. The obtained results give clear indications and advice for the design of future measurement systems and for the selection of the marker positioning on a satellite surface
    corecore