2,763 research outputs found

    Address optimizations for embedded processors

    Embedded processors that are common in electronic devices perform a limited set of tasks compared to general-purpose processor systems. They have limited resources, which have to be used efficiently. Optimal utilization of program memory requires a reduction in code size, which can be achieved by eliminating unnecessary address computations, that is, by generating an optimal offset assignment that uses built-in addressing modes. Single offset assignment (SOA) solutions, used for processors with one address register, start with the access sequence of variables to determine the optimal assignment. This research uses the basic block to commutatively transform statements to alter the access sequence. Edges in the access graphs are classified into breakable and unbreakable edges. Unbreakable edges are preferred when selecting edges for the assignment. Breakable edges are used to commutatively transform statements such that the assignment cost is reduced. The use of a modify register (MR) in some processors allows the address to be modified by the value in MR in addition to post-increment/decrement modes. Though finding the most beneficial value of MR is a common practice, this research shows that modifying the access sequence using edge fold, node swap, and path interleave techniques for an MR value of two has significant benefit. General offset assignment requires variables in the access sequence to be partitioned among the address registers. Use of the node degree in the access graph demonstrates greater benefit than using edge weights and frequency of variables. The Static Single Assignment (SSA) form of the basic block introduces new variables to an access graph, making it sparser. Sparser access graphs usually have lower assignment costs. The SSA form allows reuse of variable space based on variable lifetimes. Offset assignment solutions may be improved by incremental assignment based on uncovered edges, providing the best cost improvement. 
This heuristic considers improvements due to all uncovered edges. Optimization techniques have primarily been edge-based. A node-based SOA technique has been tested for use with commutative transformations and shown to be better than edge-based heuristics. Heuristics developed in this research perform address optimizations for embedded processors, employing new techniques that lower address computation costs.
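
The access sequence, access graph, and assignment cost mentioned in the abstract above can be illustrated with a minimal Python sketch. It assumes a single address register whose post-increment/decrement reaches only adjacent memory words, so each transition between non-adjacent variables costs one explicit address instruction. The function names and the example sequence are hypothetical and are not taken from the thesis.

```python
from collections import Counter
from itertools import permutations


def access_graph(seq):
    """Weight of edge {a, b} = number of times a and b are adjacent in seq."""
    edges = Counter()
    for a, b in zip(seq, seq[1:]):
        if a != b:
            edges[frozenset((a, b))] += 1
    return edges


def assignment_cost(seq, layout):
    """Count transitions that auto-increment/decrement by one cannot cover."""
    offset = {var: i for i, var in enumerate(layout)}
    return sum(1 for a, b in zip(seq, seq[1:]) if abs(offset[a] - offset[b]) > 1)


def best_layout(seq):
    """Exhaustive search (assumption: only a handful of variables)."""
    variables = sorted(set(seq))
    return min(permutations(variables), key=lambda p: assignment_cost(seq, p))


# Hypothetical access sequence for one basic block.
seq = ["a", "b", "c", "d", "a", "c"]
print(access_graph(seq))                           # edge weights of the access graph
print(assignment_cost(seq, ["a", "d", "b", "c"]))  # cost of an arbitrary layout
print(assignment_cost(seq, best_layout(seq)))      # minimum cost for this sequence
```

The exhaustive search grows factorially with the number of variables, which is why practical SOA solvers rely on heuristics over the access graph instead.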

    Design and Analysis of Heterogeneous DSP/FPGA Based Architectures for 3GPP Wireless Systems

    This paper shows how iterative hardware/software partitioning in heterogeneous DSP/FPGA-based embedded systems can be used to meet the real-time deadlines of modern 3GPP wireless equalization workloads. By using a well-defined set of application partitioning criteria in tandem with SoC simulation tools, we show a greater than sixfold improvement in application performance and ultimately meet, and even exceed, real-time data processing deadlines.

    MATCHED ARCHITECTURES FOR SIGNAL PROCESSING AND CONTROL

    Fast processing environments for real-time data acquisition, data processing and control applications may be realised using very different architectures. State-of-the-art systems generally employ multiprocessors and parallel processing with a dedicated architecture, such as systolic arrays, to support computation-intensive signal processing tasks such as convolution, filtering, FFT, etc. Mostly, general purpose rather than application driven architectures are used whenever possible, and the available literature is heavily concentrated on the first configuration. At TPD-TNO, the research emphasis is on application driven architectures, and the objectives for the so-called 'matched' architecture designs are: capability for a wide range of sizes, starting from small systems (the objective here is design for scalability); design for systems to be used in harsh environments; and design for minimum connectivity, reduced communication bandwidth, incorporation of dedicated preprocessing, multibus systems, etc. The real-time behaviour of general purpose architectures is not sufficiently predictable, and they are not designed to perform acquisition tasks or data-intensive processing with high performance. Matched architectures, on the contrary, are designed for well-defined applications and optimized for each application. The key effort in matched architecture research is directed towards efficiently mapping algorithms to processing steps in hardware (and software) architectures. Essentially, the design process is iterative.

    Compilation and Scheduling Techniques for Embedded Systems

    Embedded applications are constantly increasing in size, which has resulted in increasing demand on designers of digital signal processors (DSPs) to meet tight memory, size and cost constraints. With this trend, memory requirement reduction through code compaction and variable coalescing techniques is gaining ground. Also, as the use of multiprocessor system-on-chip (MPSoC) designs in complex embedded systems grows, problems like mapping, memory management and scheduling are gaining more attention. The first part of the dissertation deals with problems related to digital signal processors. Most modern DSPs provide multiple address registers and a dedicated address generation unit (AGU), which performs address generation in parallel with instruction execution. A careful placement of variables in memory is important in decreasing the number of address arithmetic instructions, leading to compact and efficient code. Chapters 2 and 3 present effective heuristics for the simple and the general offset assignment problems with variable coalescing. A solution based on simulated annealing is also presented. Chapter 4 presents an optimal integer linear programming (ILP) solution to the offset assignment problem with variable coalescing and operand permutation. A new approach to the general offset assignment problem is introduced. Chapter 5 presents an optimal ILP formulation and a genetic algorithm solution to the address register allocation (ARA) problem with code transformation techniques. The ARA problem is used to generate compact code for array-intensive embedded applications. In the second part of the dissertation, we study problems related to MPSoCs. MPSoCs provide the flexibility to meet the performance requirements of multimedia applications while respecting tight embedded system constraints. MPSoC-based embedded systems often employ software-managed memories called scratch-pad memories (SPM). 
Scheduling the tasks of an application on the processors and partitioning the available SPM budget among those processors are two critical issues in reducing the overall computation time. Traditionally, task scheduling is applied separately from memory partitioning. Such a decoupled approach may miss better quality schedules. Chapters 6 and 7 present effective heuristics that integrate task allocation and SPM partitioning to further reduce the execution time of embedded applications for single and multi-application scenarios.

    Memory optimization techniques for embedded systems

    Embedded systems have become ubiquitous, and as a result the optimization of the design and performance of programs that run on these systems remains a significant challenge for the computer systems research community. This dissertation addresses several key problems in the optimization of programs for embedded systems that include digital signal processors as the core processor. Chapter 2 develops an efficient and effective algorithm to construct a worm partition graph by finding a longest worm at the moment and maintaining the legality of scheduling. Proper assignment of offsets to variables in embedded DSPs plays a key role in determining the execution time and amount of program memory needed. Chapter 3 proposes a new approach that introduces a weight adjustment function and shows that its experimental results are slightly better than, or at least as good as, the results of previous work. Our solutions address several problems such as handling fragmented paths resulting from graph-based solutions, dealing with modify registers, and the effective utilization of multiple address registers. In addition to offset assignment, address register allocation is important for embedded DSPs. Chapter 4 develops a lower bound and an algorithm that can eliminate the explicit use of address register instructions in loops with array references. Scheduling of computations and the associated memory requirement are closely inter-related for loop computations. In Chapter 5, we develop a general framework for studying the trade-off between scheduling and storage requirements in nested loops that access multi-dimensional arrays. Tiling has long been used to improve the memory performance of loops. Only a sufficient condition for the legality of tiling was known previously. While it was conjectured that the sufficient condition would also become necessary for large enough tiles, there had been no precise characterization of what is large enough. 
Chapter 6 develops a new framework for characterizing tiling by viewing tiles as points on a lattice. This also leads to the development of conditions under which the legality condition for tiling is both necessary and sufficient.

    Global address register allocation for array references in DSPs

    Advisor: Guido Costa Souza de Araujo. Dissertation (master's), Universidade Estadual de Campinas, Instituto de Computação. Abstract: The technological advances in computing systems have stimulated the growth of the embedded systems market, and embedded systems are becoming increasingly common in people's lives, for example in mobile phones, palmtops and automotive control systems. Because of their characteristics, these new applications demand the combination of low cost, high performance and low power consumption. One way to meet these constraints is through the design of specialized processors. However, processor specialization imposes new challenges on the development of software for these systems. In particular, compilers, which are generally responsible for code optimization, need to be adapted in order to produce efficient code for these new processors. In the digital signal processing arena, such as in cellular telephones, specialized processors known as DSPs (Digital Signal Processors) are widely used. DSPs typically have few general purpose registers and very restricted addressing modes. In addition, many DSP applications process large data streams, which are usually stored in arrays. As a result, studying array reference optimization techniques has become an important task in compiling for DSPs. This work studies this problem, known as Global Array Reference Allocation (GARA). The central GARA subproblem consists of determining, for a given set of array references to be allocated to the same address register, the minimum cost of the instructions required to keep this register with the correct address at all program points. 
In this work, this subproblem is modeled as a graph-theoretical problem and proved to be NP-hard. In addition, an efficient algorithm, based on dynamic programming, is proposed to solve this subproblem optimally under some restrictions. Based on this algorithm, two techniques to solve GARA are proposed. Experimental results, from the implementation of these techniques in the GCC compiler, compare them with previous work in the literature. The results show the effectiveness of the techniques proposed in this work. Degree: Master in Computer Science.

    May 1, 2010 (Pages 2233-2382)

    Improving the Clinical Application of Genetic Testing for Patients with Inherited Heart Disease

    The difficulties associated with the clinical application of next generation sequencing (NGS) approaches can be substantial. However, the technology holds great potential to improve outcomes and risk management for inherited heart disease patients and their families. This PhD thesis focuses on three critical challenges associated with the clinical application of NGS technologies: (i) understanding correlations between genetics and the clinical phenotype; (ii) the impact of uncertainty created by NGS-based genetic testing on the patient; and (iii) developing evidence-based approaches for improving current and future methods for returning complex genetic results. Two studies focused on understanding correlations between genetics and the clinical phenotype. The first, reported in Chapter two, aimed to analyse genetic and phenotypic findings from a consecutive cohort of hypertrophic cardiomyopathy (HCM) families who had undergone comprehensive cardiac panel testing. We showed that increasing the number of genes included on HCM panels beyond the eight established sarcomere genes does not increase the diagnostic yield. However, identification of variants of uncertain significance increased dramatically from 13% to 36%. This result has important clinical implications. Increasing the number of genes screened does not necessarily improve the chance of identifying a family’s cause of disease but is likely to increase identification of uncertain results that can increase the complexity of discussions around inheritance and risk. We also showed that identification of multiple rare variants was associated with earlier disease onset, greater likelihood of family history of sudden cardiac death (SCD) and overall worse event-free survival (5/18 events versus 29/170 events, log-rank test p=0.008). Clinical heterogeneity is a hallmark feature of HCM and here we show one factor that can contribute to worse outcomes. 
The study reported in Chapter three aimed to describe the diverse genetic and phenotypic features of an international cohort of patients with a truncating variant in the desmoplakin (DSP) gene, traditionally considered to cause arrhythmogenic right ventricular cardiomyopathy (ARVC). This is the first complete report of the diverse phenotype spectrum of DSP cardiomyopathy, suggesting that cardiomyopathy caused by truncating variants in DSP should be considered its own entity. We show that the phenotype associated with truncations in DSP is severe, with a high rate of SCD, and is characterised by left ventricular dysfunction and structural left ventricular involvement when compared with ‘classic’ ARVC. In addition, we show that the functional domain in which DSP truncating variants reside has clinical significance. Key results from Chapters two and three emphasise the importance of correlating genetic variants with phenotype information. Data from this work is a step towards precision medicine whereby patients’ risk will be assessed and managed based on their genotype. In Chapter four we designed a qualitative study to explore attitudes, preferences, recall and psychosocial consequences of uncertain genetic results returned to HCM probands. The major themes we identified were knowledge and recall of complex genetic information, individual experiences with HCM genetic testing and communication, and the value of information. In addition, those with uninformative results had a unique set of issues. We found that HCM probands undergoing genetic testing require additional support and information beyond the current practice model employed in the multidisciplinary specialist clinic setting. Importantly, many of the probands interviewed who received an uninformative or uncertain genetic result showed poor recall and understanding of genetic information. 
The final aim was to determine if a genetic counsellor-led intervention using a communication aid for the delivery of HCM genetic test results improves the ability and confidence of the proband to communicate genetic results to at-risk relatives. This work is presented in Chapter five. We developed a study protocol for a randomised controlled trial with the primary outcome being the ability and confidence of the proband to communicate genetic results to at-risk relatives. We focused on transforming findings from the previous chapters into improved clinical practice. The a priori primary outcome did not show statistically significant differences between the control and intervention groups, though the majority of probands in the intervention group achieved fair communication scores and had higher genetic knowledge scores than those in the control group. Importantly, we found that 29% of at-risk relatives were not informed of a genetic result in their family. We highlight the significant gap in our current approach to supporting family communication about genetics. The studies presented highlight that whilst genetic testing has significant potential for benefit in inherited heart disease families in terms of diagnosis, management and family screening, there are issues to address in order to improve the clinical utility and application of a comprehensive approach to testing. Overall, the work contributes to understanding the genetic architecture of inherited heart diseases and the clinical impact of NGS results for patients, and highlights that more work is needed to improve the clinical utility of an NGS approach to genetic testing.

    Study of optical techniques for the Ames unitary wind tunnel: Digital image processing, part 6

    A survey of digital image processing techniques and processing systems for aerodynamic images has been conducted. These images covered many types of flows and were generated by many types of flow diagnostics, including laser vapor screens, infrared cameras, laser holographic interferometry, Schlieren, and luminescent paints. Some general digital image processing systems, imaging networks, optical sensors, and image computing chips were briefly reviewed. Possible digital imaging network systems for the Ames Unitary Wind Tunnel were explored.