Search CORE

205 research outputs found

Improving Energy Efficiency of Application-Specific Instruction-Set Processors

Author: Guzma Vladimír
Publication venue: Tampere University of Technology
Publication date: 01/01/2017

Present-day consumer mobile devices seem to challenge the concept of embedded computing by bringing the equivalent of supercomputing power from two decades ago into hand-held devices. This challenge, however, is well met by pushing the boundaries of embedded computing further into areas previously monopolised by Application-Specific Integrated Circuits (ASICs). Furthermore, in areas traditionally associated with embedded computing, an increase in the complexity of algorithms and applications requires a continuous rise in availability of computing power and energy efficiency in order to fit within the same, or smaller, power budget. It is, ultimately, the amount of energy the application execution consumes that dictates the usefulness of a programmable embedded system, in comparison with implementation of an ASIC.
This Thesis aimed to explore the energy efficiency overheads of Application-Specific Instruction-Set Processors (ASIPs), a class of embedded processors aiming to compete with ASICs. While an ASIC can be designed to provide precise performance and energy efficiency required by a specific application without unnecessary overheads, the cost of design and verification, as well as the inability to upgrade or modify, favour more flexible programmable solutions. The ASIP designs can match the computing performance of the ASIC for specific applications. What is left, therefore, is achieving energy efficiency of a similar order of magnitude.
In the past, one area of ASIP design that has been identified as a major consumer of energy is storage of temporal values produced during computation – the Register File (RF), with the associated interconnection network to transport those values between registers and computational Function Units (FUs). In this Thesis, the energy efficiency of RF and interconnection network is studied using the Transport Triggered Architectures (TTAs) template. Specifically, compiler optimisations aiming at reducing the traffic of temporal values between RF and FUs are presented in this Thesis. Bypassing of the temporal value, from the output of the FU which produces it directly in the input ports of the FUs that require it to continue with the computation, saves multiple RF reads. In addition, if all the uses of such a temporal value can be bypassed, the RF write can be eliminated as well. Such optimisations result in a simplification of the RF, via a reduction in the actual number of registers present or a reduction in the number of read and write ports in the RF and improved energy efficiency. In cases where the limited number of the simultaneous RF reads or writes cause a performance bottleneck, such optimisations result in performance improvements leading to faster execution times, therefore, allowing for execution at lower clock frequencies resulting in additional energy savings.
Another area of the ASIP design consuming a significant amount of energy is the instruction memory subsystem, which is the artefact required for the programmability of the embedded processor. As this subsystem is not present in ASIC, the energy consumed for storing an application program and reading it from the instruction memories to control processor execution is an overhead that needs to be minimised. In this Thesis, one particular tool to improve the energy efficiency of the instruction memory subsystem – instruction buffer – is examined. While not trivially obvious, the presence of buffers for storing loop bodies, or parts of them, results in a reduced number of reads from the instruction memories. As a result, memories can be put to lower power state leading to lower overall energy consumption, pending energy-efficient buffer implementation. Specifically, an energy-efficient implementation of the instruction buffer is presented in this Thesis, together with analysis tools to identify candidate loops and assess their suitability for storing in the instruction buffer.
The studies presented in this Thesis show that the energy overheads associated with the use of embedded processors, in comparison to ad-hoc ASIC solutions, are manageable when carefully considered during the design of an embedded system for a particular application, or application domain. Finally, the methods presented in this Thesis do not restrict the reprogrammability of the embedded system.
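The bypassing idea in the abstract above can be illustrated with a small counting model. This is only a sketch under stated assumptions, not the thesis's compiler pass: the `Value` record, its `bypassable_uses` field and the `rf_traffic` function are hypothetical names introduced here, and the model simply assumes that each bypassed use saves one RF read and that the RF write disappears only when every use of a value is bypassed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Value:
    """A temporary value produced by one FU (hypothetical model, not from the thesis)."""
    name: str
    uses: int             # how many FU inputs consume the value
    bypassable_uses: int  # uses that can be fed directly from the producing FU


def rf_traffic(values, bypass):
    """Count register-file reads and writes with or without bypassing."""
    reads = 0
    writes = 0
    for v in values:
        bypassed = v.bypassable_uses if bypass else 0
        # Every use that is not bypassed still reads the value from the RF.
        reads += v.uses - bypassed
        # The RF write is skipped only if all uses were bypassed.
        all_bypassed = bypass and v.uses > 0 and bypassed == v.uses
        if not all_bypassed:
            writes += 1
    return reads, writes


# Illustrative data only: three temporaries with different bypass opportunities.
values = [Value("t1", 2, 2), Value("t2", 3, 1), Value("t3", 1, 0)]
print(rf_traffic(values, bypass=False))
print(rf_traffic(values, bypass=True))
```

In this toy data set, `t1` never touches the RF once bypassing is enabled, while `t2` still needs its write because two of its three uses read from the RF.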

Trepo - Institutional Repository of Tampere University

Software tools for the rapid development of signal processing and communications systems on configurable platforms

Author: Perry Thomas Paul
Publication date: 01/01/2013

Programmers and engineers in the domains of high performance computing (HPC) and electronic system design have a shared goal: to define a structure for coordination and communication between nodes in a highly parallel network of processing tasks. Practitioners in both of these fields have recently encountered additional constraints that motivate the use of multiple types of processing device in a hybrid or heterogeneous platform, but constructing a working "program" to be executed on such an architecture is very time-consuming with current domain-specific design methodologies. In the field of HPC, research has proposed solutions involving the use of alternative computational devices such as FPGAs (field-programmable gate arrays), since these devices can exhibit much greater performance per unit of power consumption. The appeal of integrating these devices into traditional microprocessor-based systems is mitigated, however, by the greater difficulty in constructing a system for the resulting hybrid platform. In the field of electronic system design, a similar problem of integration exists. Many of the highly parallel FPGA-based systems that Xilinx and its customers produce for applications such as telecommunications and video processing require the additional use of one or more microprocessors, but coordinating the interactions between existing FPGA cores and software running on the microprocessors is difficult. The aim of my project is to improve the design flow for hybrid systems by proposing, firstly, an abstract representation of these systems and their components which captures in metadata their different models of computation and communication; secondly, novel design checking, exploration and optimisation techniques based around this metadata; and finally, a novel design methodology in which component and system metadata is used to generate software simulation models. 
The effectiveness of this approach will be evaluated through the implementation of two physical-layer telecommunications system models that meet the requirements of the 3GPP "LTE" standard, which is commercially relevant to Xilinx and many other organisations.

Glasgow Theses Service

Analysing and Reducing Costs of Deep Learning Compiler Auto-tuning

Author: Borowiec Damian
Publication venue: Lancaster University
Publication date: 01/03/2023

Deep Learning (DL) is significantly impacting many industries, including automotive, retail and medicine, enabling autonomous driving, recommender systems and genomics modelling, amongst other applications. At the same time, demand for complex and fast DL models is continually growing. The most capable models tend to exhibit highest operational costs, primarily due to their large computational resource footprint and inefficient utilisation of computational resources employed by DL systems. In an attempt to tackle these problems, DL compilers and auto-tuners emerged, automating the traditionally manual task of DL model performance optimisation. While auto-tuning improves model inference speed, it is a costly process, which limits its wider adoption within DL deployment pipelines. The high operational costs associated with DL auto-tuning have multiple causes. During operation, DL auto-tuners explore large search spaces consisting of billions of tensor programs, to propose potential candidates that improve DL model inference latency. Subsequently, DL auto-tuners measure candidate performance in isolation on the target-device, which constitutes the majority of auto-tuning compute-time. Suboptimal candidate proposals, combined with their serial measurement in an isolated target-device lead to prolonged optimisation time and reduced resource availability, ultimately reducing cost-efficiency of the process. In this thesis, we investigate the reasons behind prolonged DL auto-tuning and quantify their impact on the optimisation costs, revealing directions for improved DL auto-tuner design. Based on these insights, we propose two complementary systems: Trimmer and DOPpler. Trimmer improves tensor program search efficacy by filtering out poorly performing candidates, and controls end-to-end auto-tuning using cost objectives, monitoring optimisation cost. 
Simultaneously, DOPpler breaks long-held assumptions about serial candidate measurements by successfully parallelising them intra-device, with minimal penalty to optimisation quality. Through extensive experimental evaluation of both systems, we demonstrate that they significantly improve the cost-efficiency of auto-tuning (up to 50.5%) across a plethora of tensor operators, DL models, auto-tuners and target-devices.

Lancaster E-Prints

KINE[SIS]TEM'17 From Nature to Architectural Matter

Author: Oliveira M. J.
Osório F.
Publication venue: DINÂMIA'CET - IUL
Publication date: 01/05/2017

Kine[SiS]tem – From Kinesis + System. Kinesis is a non-linear movement or activity of an organism in response to a stimulus. A system is a set of interacting and interdependent agents forming a complex whole, delineated by its spatial and temporal boundaries, influenced by its environment. How can architectural systems moderate the external environment to enhance comfort conditions in a simple, sustainable and smart way? This is the starting question for the Kine[SiS]tem’17 – From Nature to Architectural Matter International Conference. For decades, architectural design was developed despite (and not with) the climate, based on mechanical heating and cooling. Today, the argument for net zero energy buildings needs very effective strategies to reduce energy requirements. The challenge ahead requires design processes that are built upon consolidated knowledge, make use of advanced technologies and are inspired by nature. These design processes should lead to responsive smart systems that deliver the best performance in each specific design scenario. To control solar radiation is one key factor in low-energy thermal comfort. Computational-controlled sensor-based kinetic surfaces are one of the possible answers to control solar energy in an effective way, within the scope of contradictory objectives throughout the year.

Repositório Institucional do ISCTE-IUL

Design Space Exploration of Distributed Loop Buffer Architectures with Incompatible Loop-Nest Organisations in Embedded Systems

Author: Antonio Artes
F Catthoor
Francky Catthoor
Jose L. Ayala
M Jayapala
M Kandemir
M Verma
Praveen Raghavan
Robert Fasthuber
RS Bajwa
T Lin
Publication venue: Springer Science and Business Media LLC

Crossref

Applications Development for the Computational Grid

Author: Abramson David
Publication venue: Monash University, Information Technology
Publication date: 01/01/2011

University of Queensland eSpace

Third International Symposium on Space Mission Operations and Ground Data Systems, part 2

Author: Rash James L.

Under the theme of 'Opportunities in Ground Data Systems for High Efficiency Operations of Space Missions,' the SpaceOps '94 symposium included presentations of more than 150 technical papers spanning five topic areas: Mission Management, Operations, Data Management, System Development, and Systems Engineering. The symposium papers focus on improvements in the efficiency, effectiveness, and quality of data acquisition, ground systems, and mission operations. New technology, methods, and human systems are discussed. Accomplishments are also reported in the application of information systems to improve data retrieval, reporting, and archiving; the management of human factors; the use of telescience and teleoperations; and the design and implementation of logistics support for mission operations. This volume covers expert systems, systems development tools and approaches, and systems engineering issues.

NASA Technical Reports Server

A Design-by-Contract based Approach for Architectural Modelling and Analysis

Author: Ozkaya M.

Research on software architectures has been active since the early nineties, leading to a number of different architecture description languages (ADLs). Given their importance in facilitating the communication of crucial system properties to different stakeholders and their analysis early on in the development of a system, this is understandable. However, practitioners rarely use ADLs, and, instead, they insist on using the Unified Modelling Language (UML) for specifying software architectures. I attribute this to three main issues that have not been addressed altogether by the existing ADLs. Firstly, in their attempt to support formal analysis, current ADLs employ formal notations (i.e., mostly process algebras) that are rarely used among practitioners. Secondly, many ADLs focus on components in specifying software architectures, neglecting the first-class specification of complex interaction protocols as connectors. They view connectors as simple interaction links that merely identify the communicating components and their basic communication style (e.g., procedure call). So, complex interaction protocols are specified as part of components, which, however, reduces the re-usability of both. Lastly, there are also some ADLs that do support complex connectors. However, these include a centralised glue element in their connector structure that imposes a global ordering of actions on the interacting components. Such global constraints are not always realisable in a decentralised manner by the components that participate in these protocols. In this PhD thesis, I introduce a new architecture description language called XCD that supports the formal specification of software architectures without employing a complex formal notation and offers first-class connectors for maximising the re-use of components and protocols.
Furthermore, by omitting any units for specifying global constraints (i.e., glue), the architecture specifications in XCD are guaranteed to be realisable in a decentralised manner. I show in the thesis how XCD extends Design-by-Contract (DbC) for specifying (i) protocol-independent components and (ii) complex connectors, which can impose only local constraints to guarantee their realisability. Use of DbC will hopefully make it easier for practitioners to use the language, compared to languages using process algebras. I also show the precise translation of XCD into SPIN’s formal ProMeLa language for formally verifying that, in a software architecture, (i) services offered by components are always used correctly, (ii) the component behaviours are always complete, (iii) there are no race-conditions, (iv) there is no deadlock, and (v) for components having event communications, there is no overflow of event buffers. Finally, I evaluate XCD via five well-known case studies and illustrate XCD’s enhanced modularity, expressive DbC-based notation, and guaranteed realisability for architecture specifications.

City Research Online

Money and sustainability: Transitioning to an ecological monetary system

Author: Alves Filipe Miguel Moreira
Publication date: 01/01/2022

A profound transformation of our monetary paradigm is urgently needed. To re-think, re-imagine, and re-design our monetary system is of critical priority if we want to have a chance at sustainability. The current dominant monetary-banking-financial system is inherently, and by design, a source and a force of unsustainability lying at the core of our economies and societies. It's a system actively contributing to ecological degradation, socio-political crises, and economic instability, uncertainty, and alienation. But there are alternatives and these must be given the spotlight. Not tweaks or reforms to the system, but radical shifts in how we deal with, use, relate to, and feel regarding money. The societal challenge we must embrace is rapidly transitioning our monetary reality into a purposeful ecological monetary ecosystem aligned with the regeneration of our planet and all life in it. This Doctoral thesis contributes to the emergence and development of a new monetary paradigm with planet and people at its core. The research is intrinsically transdisciplinary and based on mixed-methods. Different methodologies were used, combining qualitative with quantitative methods and more passive research with more action-oriented transformative research, including field visits, interviews with practitioners, and direct interaction with local and regional complementary currency experiments. By combining a transdisciplinary literature review with an action-research approach this thesis offers novel insights into the transition process to an ecological monetary ecosystem. A set of regenerative principles and priorities for monetary reform that would enable us to root money back into the real economy, coherent with the laws of physics and aligned with an ecology of life is offered. Moreover, a model for a multi-currency ecosystem is explored and presented at the end of this thesis. 
The implications of such a fundamental revolution in the core design of our increasingly monetized economies could potentially put us back on track and re-align our socio-economic and political system with our climate agreements, our SDG and our intentions for peace and prosperity.

Repositório da Universidade Nova de Lisboa

Using MapReduce Streaming for Distributed Life Simulation on the Cloud

Author: Radenski Atanas
Publication venue: Chapman University Digital Commons
Publication date: 01/01/2013

Distributed software simulations are indispensable in the study of large-scale life models but often require the use of technically complex lower-level distributed computing frameworks, such as MPI. We propose to overcome the complexity challenge by applying the emerging MapReduce (MR) model to distributed life simulations and by running such simulations on the cloud. Technically, we design optimized MR streaming algorithms for discrete and continuous versions of Conway’s life according to a general MR streaming pattern. We chose life because it is simple enough as a testbed for MR’s applicability to a-life simulations and general enough to make our results applicable to various lattice-based a-life models. We implement and empirically evaluate our algorithms’ performance on Amazon’s Elastic MR cloud. Our experiments demonstrate that a single MR optimization technique called strip partitioning can reduce the execution time of continuous life simulations by 64%. To the best of our knowledge, we are the first to propose and evaluate MR streaming algorithms for lattice-based simulations. Our algorithms can serve as prototypes in the development of novel MR simulation algorithms for large-scale lattice-based a-life models.
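For readers unfamiliar with how a lattice simulation maps onto MapReduce, the following is a minimal in-process sketch of one generation of discrete Conway's life expressed as a map step and a reduce step. It is an illustration under stated assumptions, not the authors' optimized streaming algorithms: it does not implement strip partitioning, it runs the shuffle phase in a local dictionary rather than on Hadoop or Elastic MapReduce, and the names `map_cell`, `reduce_cell` and `life_step` are hypothetical.

```python
from collections import defaultdict


def map_cell(cell):
    """Mapper: emit a 0 marker for the live cell itself and a 1 for each of its 8 neighbours."""
    x, y = cell
    yield cell, 0  # marks the key as currently alive without changing the neighbour sum
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if (dx, dy) != (0, 0):
                yield (x + dx, y + dy), 1


def reduce_cell(values):
    """Reducer: decide from the grouped values whether the cell is alive next generation."""
    alive = 0 in values
    neighbours = sum(values)
    return neighbours == 3 or (alive and neighbours == 2)


def life_step(live_cells):
    """One generation: map every live cell, group by key (the shuffle), then reduce."""
    groups = defaultdict(list)
    for cell in live_cells:
        for key, value in map_cell(cell):
            groups[key].append(value)
    return {cell for cell, values in groups.items() if reduce_cell(values)}


# A horizontal "blinker" becomes a vertical one after a single generation.
blinker = {(0, 1), (1, 1), (2, 1)}
print(sorted(life_step(blinker)))
```

Only live cells are fed to the mapper, so the work per generation scales with the live population rather than with the lattice size; a real streaming job would read and write these key-value pairs as text lines.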

Chapman University Digital Commons