205 research outputs found

    Improving Energy Efficiency of Application-Specific Instruction-Set Processors

    Present-day consumer mobile devices seem to challenge the concept of embedded computing by bringing the equivalent of supercomputing power from two decades ago into hand-held devices. This challenge, however, is well met by pushing the boundaries of embedded computing further into areas previously monopolised by Application-Specific Integrated Circuits (ASICs). Furthermore, in areas traditionally associated with embedded computing, an increase in the complexity of algorithms and applications requires a continuous rise in availability of computing power and energy efficiency in order to fit within the same, or smaller, power budget. It is, ultimately, the amount of energy the application execution consumes that dictates the usefulness of a programmable embedded system, in comparison with implementation of an ASIC. This Thesis aimed to explore the energy efficiency overheads of Application-Specific Instruction-Set Processors (ASIPs), a class of embedded processors aiming to compete with ASICs. While an ASIC can be designed to provide precise performance and energy efficiency required by a specific application without unnecessary overheads, the cost of design and verification, as well as the inability to upgrade or modify, favour more flexible programmable solutions. The ASIP designs can match the computing performance of the ASIC for specific applications. What is left, therefore, is achieving energy efficiency of a similar order of magnitude. In the past, one area of ASIP design that has been identified as a major consumer of energy is storage of temporal values produced during computation – the Register File (RF), with the associated interconnection network to transport those values between registers and computational Function Units (FUs). In this Thesis, the energy efficiency of RF and interconnection network is studied using the Transport Triggered Architectures (TTAs) template. 
Specifically, compiler optimisations aiming at reducing the traffic of temporal values between RF and FUs are presented in this Thesis. Bypassing of the temporal value, from the output of the FU which produces it directly to the input ports of the FUs that require it to continue with the computation, saves multiple RF reads. In addition, if all the uses of such a temporal value can be bypassed, the RF write can be eliminated as well. Such optimisations result in a simplification of the RF, via a reduction in the actual number of registers present or a reduction in the number of read and write ports in the RF, and in improved energy efficiency. In cases where the limited number of simultaneous RF reads or writes causes a performance bottleneck, such optimisations result in performance improvements leading to faster execution times, therefore allowing for execution at lower clock frequencies, resulting in additional energy savings. Another area of the ASIP design consuming a significant amount of energy is the instruction memory subsystem, which is the artefact required for the programmability of the embedded processor. As this subsystem is not present in an ASIC, the energy consumed for storing an application program and reading it from the instruction memories to control processor execution is an overhead that needs to be minimised. In this Thesis, one particular tool to improve the energy efficiency of the instruction memory subsystem – the instruction buffer – is examined. While not trivially obvious, the presence of buffers for storing loop bodies, or parts of them, results in a reduced number of reads from the instruction memories. As a result, memories can be put into a lower-power state, leading to lower overall energy consumption, provided the buffer implementation is itself energy-efficient. 
Specifically, an energy-efficient implementation of the instruction buffer is presented in this Thesis, together with analysis tools to identify candidate loops and assess their suitability for storing in the instruction buffer. The studies presented in this Thesis show that the energy overheads associated with the use of embedded processors, in comparison to ad-hoc ASIC solutions, are manageable when carefully considered during the design of an embedded system for a particular application, or application domain. Finally, the methods presented in this Thesis do not restrict the reprogrammability of the embedded system.
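
The bypassing argument in the abstract above can be illustrated with a small accounting sketch. This is not the thesis's compiler pass: the `TempValue` record and the `rf_traffic` function are hypothetical names introduced here, and the model assumes only that each bypassed use saves one RF read and that the RF write disappears when every use of a value is bypassed.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class TempValue:
    """Hypothetical record for one temporary value produced by an FU."""
    uses: int        # number of FU inputs that consume the value (assumed >= 1)
    bypassable: int  # uses that can take the value straight from the producing FU


def rf_traffic(values: Iterable[TempValue], bypass: bool) -> Tuple[int, int]:
    """Return (RF reads, RF writes) for the given values, with or without bypassing."""
    reads = 0
    writes = 0
    for v in values:
        bypassed = v.bypassable if bypass else 0
        remaining = v.uses - bypassed
        reads += remaining
        # The RF write is needed only if some use still reads the value from the RF.
        if remaining > 0:
            writes += 1
    return reads, writes


values = [TempValue(uses=2, bypassable=2),
          TempValue(uses=3, bypassable=1),
          TempValue(uses=1, bypassable=0)]
baseline = rf_traffic(values, bypass=False)
optimised = rf_traffic(values, bypass=True)
```

In this toy input the first value is fully bypassed, so both of its reads and its write vanish; the second value loses one read but still needs its write.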

    Software tools for the rapid development of signal processing and communications systems on configurable platforms

    Programmers and engineers in the domains of high performance computing (HPC) and electronic system design have a shared goal: to define a structure for coordination and communication between nodes in a highly parallel network of processing tasks. Practitioners in both of these fields have recently encountered additional constraints that motivate the use of multiple types of processing device in a hybrid or heterogeneous platform, but constructing a working "program" to be executed on such an architecture is very time-consuming with current domain-specific design methodologies. In the field of HPC, research has proposed solutions involving the use of alternative computational devices such as FPGAs (field-programmable gate arrays), since these devices can exhibit much greater performance per unit of power consumption. The appeal of integrating these devices into traditional microprocessor-based systems is mitigated, however, by the greater difficulty in constructing a system for the resulting hybrid platform. In the field of electronic system design, a similar problem of integration exists. Many of the highly parallel FPGA-based systems that Xilinx and its customers produce for applications such as telecommunications and video processing require the additional use of one or more microprocessors, but coordinating the interactions between existing FPGA cores and software running on the microprocessors is difficult. The aim of my project is to improve the design flow for hybrid systems by proposing, firstly, an abstract representation of these systems and their components which captures in metadata their different models of computation and communication; secondly, novel design checking, exploration and optimisation techniques based around this metadata; and finally, a novel design methodology in which component and system metadata is used to generate software simulation models. 
The effectiveness of this approach will be evaluated through the implementation of two physical-layer telecommunications system models that meet the requirements of the 3GPP "LTE" standard, which is commercially relevant to Xilinx and many other organisations.

    Analysing and Reducing Costs of Deep Learning Compiler Auto-tuning

    Deep Learning (DL) is significantly impacting many industries, including automotive, retail and medicine, enabling autonomous driving, recommender systems and genomics modelling, amongst other applications. At the same time, demand for complex and fast DL models is continually growing. The most capable models tend to exhibit the highest operational costs, primarily due to their large computational resource footprint and inefficient utilisation of computational resources employed by DL systems. In an attempt to tackle these problems, DL compilers and auto-tuners emerged, automating the traditionally manual task of DL model performance optimisation. While auto-tuning improves model inference speed, it is a costly process, which limits its wider adoption within DL deployment pipelines. The high operational costs associated with DL auto-tuning have multiple causes. During operation, DL auto-tuners explore large search spaces consisting of billions of tensor programs, to propose potential candidates that improve DL model inference latency. Subsequently, DL auto-tuners measure candidate performance in isolation on the target-device, which constitutes the majority of auto-tuning compute-time. Suboptimal candidate proposals, combined with their serial measurement on an isolated target-device, lead to prolonged optimisation time and reduced resource availability, ultimately reducing cost-efficiency of the process. In this thesis, we investigate the reasons behind prolonged DL auto-tuning and quantify their impact on the optimisation costs, revealing directions for improved DL auto-tuner design. Based on these insights, we propose two complementary systems: Trimmer and DOPpler. Trimmer improves tensor program search efficacy by filtering out poorly performing candidates, and controls end-to-end auto-tuning using cost objectives, monitoring optimisation cost. 
Simultaneously, DOPpler breaks long-held assumptions about the serial candidate measurements by successfully parallelising them intra-device, with minimal penalty to optimisation quality. Through extensive experimental evaluation of both systems, we demonstrate that they significantly improve cost-efficiency of auto-tuning (up to 50.5%) across a plethora of tensor operators, DL models, auto-tuners and target-devices.

    KINE[SIS]TEM'17 From Nature to Architectural Matter

    Kine[SiS]tem – From Kinesis + System. Kinesis is a non-linear movement or activity of an organism in response to a stimulus. A system is a set of interacting and interdependent agents forming a complex whole, delineated by its spatial and temporal boundaries, influenced by its environment. How can architectural systems moderate the external environment to enhance comfort conditions in a simple, sustainable and smart way? This is the starting question for the Kine[SiS]tem’17 – From Nature to Architectural Matter International Conference. For decades, architectural design was developed despite (and not with) the climate, based on mechanical heating and cooling. Today, the argument for net zero energy buildings needs very effective strategies to reduce energy requirements. The challenge ahead requires design processes that are built upon consolidated knowledge, make use of advanced technologies and are inspired by nature. These design processes should lead to responsive smart systems that deliver the best performance in each specific design scenario. Controlling solar radiation is a key factor in low-energy thermal comfort. Computationally controlled, sensor-based kinetic surfaces are one of the possible answers to control solar energy in an effective way, within the scope of contradictory objectives throughout the year.

    Applications Development for the Computational Grid

    Third International Symposium on Space Mission Operations and Ground Data Systems, part 2

    Under the theme of 'Opportunities in Ground Data Systems for High Efficiency Operations of Space Missions,' the SpaceOps '94 symposium included presentations of more than 150 technical papers spanning five topic areas: Mission Management, Operations, Data Management, System Development, and Systems Engineering. The symposium papers focus on improvements in the efficiency, effectiveness, and quality of data acquisition, ground systems, and mission operations. New technology, methods, and human systems are discussed. Accomplishments are also reported in the application of information systems to improve data retrieval, reporting, and archiving; the management of human factors; the use of telescience and teleoperations; and the design and implementation of logistics support for mission operations. This volume covers expert systems, systems development tools and approaches, and systems engineering issues.

    Money and sustainability: Transitioning to an ecological monetary system

    A profound transformation of our monetary paradigm is urgently needed. To re-think, re-imagine, and re-design our monetary system is of critical priority if we want to have a chance at sustainability. The current dominant monetary-banking-financial system is inherently, and by design, a source and a force of unsustainability lying at the core of our economies and societies. It is a system actively contributing to ecological degradation, socio-political crises, and economic instability, uncertainty, and alienation. But there are alternatives and these must be given the spotlight. Not tweaks or reforms to the system, but radical shifts in how we deal with, use, relate to, and feel about money. The societal challenge we must embrace is rapidly transitioning our monetary reality into a purposeful ecological monetary ecosystem aligned with the regeneration of our planet and all life on it. This Doctoral thesis contributes to the emergence and development of a new monetary paradigm with planet and people at its core. The research is intrinsically transdisciplinary and based on mixed methods. Different methodologies were used, combining qualitative with quantitative methods and more passive research with more action-oriented transformative research, including field visits, interviews with practitioners, and direct interaction with local and regional complementary currency experiments. By combining a transdisciplinary literature review with an action-research approach, this thesis offers novel insights into the transition process to an ecological monetary ecosystem. A set of regenerative principles and priorities for monetary reform that would enable us to root money back into the real economy, coherent with the laws of physics and aligned with an ecology of life, is offered. Moreover, a model for a multi-currency ecosystem is explored and presented at the end of this thesis. 
The implications of such a fundamental revolution in the core design of our increasingly monetized economies could potentially put us back on track and re-align our socio-economic and political system with our climate agreements, our SDGs and our intentions for peace and prosperity.

    Using MapReduce Streaming for Distributed Life Simulation on the Cloud

    Distributed software simulations are indispensable in the study of large-scale life models but often require the use of technically complex lower-level distributed computing frameworks, such as MPI. We propose to overcome the complexity challenge by applying the emerging MapReduce (MR) model to distributed life simulations and by running such simulations on the cloud. Technically, we design optimized MR streaming algorithms for discrete and continuous versions of Conway’s life according to a general MR streaming pattern. We chose life because it is simple enough as a testbed for MR’s applicability to a-life simulations and general enough to make our results applicable to various lattice-based a-life models. We implement and empirically evaluate our algorithms’ performance on Amazon’s Elastic MR cloud. Our experiments demonstrate that a single MR optimization technique called strip partitioning can reduce the execution time of continuous life simulations by 64%. To the best of our knowledge, we are the first to propose and evaluate MR streaming algorithms for lattice-based simulations. Our algorithms can serve as prototypes in the development of novel MR simulation algorithms for large-scale lattice-based a-life models.
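
As a rough illustration of how one generation of discrete Life fits the map/reduce shape, here is a minimal in-process sketch. It is the naive per-cell formulation, not the paper's optimized streaming algorithm and not strip partitioning; the names `life_map`, `life_reduce` and `step`, and the unbounded grid, are assumptions made for this example. The shuffle phase that a MapReduce framework would perform is simulated with a dictionary.

```python
from collections import defaultdict
from typing import Iterable, Iterator, Set, Tuple

Cell = Tuple[int, int]


def life_map(cell: Cell) -> Iterator[Tuple[Cell, str]]:
    """Map step: a live cell announces itself and notifies its 8 neighbours."""
    x, y = cell
    yield cell, "alive"
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if (dx, dy) != (0, 0):
                yield (x + dx, y + dy), "neighbour"


def life_reduce(values: Iterable[str]) -> bool:
    """Reduce step: decide whether the cell that received `values` is alive next."""
    values = list(values)
    alive = "alive" in values
    count = values.count("neighbour")
    # Standard Conway rules: birth on 3 neighbours, survival on 2 or 3.
    return count == 3 or (alive and count == 2)


def step(live: Set[Cell]) -> Set[Cell]:
    """One generation: map every live cell, group by key, reduce each group."""
    groups = defaultdict(list)  # stands in for the framework's shuffle phase
    for cell in live:
        for key, value in life_map(cell):
            groups[key].append(value)
    return {cell for cell, values in groups.items() if life_reduce(values)}


blinker = {(0, 1), (1, 1), (2, 1)}
next_gen = step(blinker)
```

In a real streaming job the mapper and reducer would read and write text records on standard input and output; the per-key logic would stay the same.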