
    Compression of Variable-Length Instructions for the Transport Triggered Architecture

    The Static Random-Access Memory (SRAM) modules used in embedded microprocessor devices consume a large portion of the whole system’s power. The memory module consumes static power while staying awake and dynamic power on memory accesses. The power dissipation of the instruction memory can be limited by using code compression methods, which reduce the memory size. The compression may require the use of variable length instruction formats in the processor. The power-efficient design of variable length instruction fetch and decode units is challenging for static multiple-issue processors, because such architectures have simple hardware to begin with, as they aim for very low power consumption on embedded platforms. The power saved by these compression approaches, which necessitate more complex logic, is easily lost on an inefficient processor design. This thesis proposes an implementation for instruction template-based compression, its decompression, and two instruction fetch design alternatives for variable length instruction encoding on the Transport Triggered Architecture (TTA), a static multiple-issue exposed datapath architecture. Both of the new fetch and decode units are integrated into the TTA-based Co-design Environment (TCE), a toolset for rapid design and prototyping of processors based on TTA. The hardware description of the fetch units is verified at the register transfer level and benchmarked using the CHStone test suite. Furthermore, the fetch units are synthesized on a 40 nm standard cell Application Specific Integrated Circuit (ASIC) technology library for area, performance and power consumption measurements. The power cost of the variable length instruction support is compared to the power savings from memory reduction, which is evaluated using HP Labs’ CACTI tool. The compression approach reaches an average program size reduction of 44% at best with a set of test programs, and the total power consumption of the system is reduced. 
The thesis shows that the proposed variable length fetch designs are sufficiently low-power for TTA processors to benefit from the code compression.
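The size reduction the abstract reports can be illustrated with a small sketch. This is a dictionary-style simplification of instruction compression, not the thesis's exact template scheme: frequent instruction words are replaced by short dictionary indices, with a per-instruction escape bit distinguishing indices from literal words. All names and the toy 32-bit instruction stream are illustrative.

```python
# Minimal sketch of dictionary-style instruction compression, loosely
# modeled on the template-based approach described in the abstract.
from collections import Counter

def compress(program, dict_size=4, full_bits=32, index_bits=8):
    """Replace the most frequent instruction words with short indices.

    Returns (dictionary, compressed size in bits, original size in bits).
    One escape bit per instruction marks dictionary vs. literal encoding.
    """
    counts = Counter(program)
    dictionary = [w for w, _ in counts.most_common(dict_size)]
    compressed_bits = 0
    for word in program:
        if word in dictionary:
            compressed_bits += 1 + index_bits   # escape bit + index
        else:
            compressed_bits += 1 + full_bits    # escape bit + literal word
    return dictionary, compressed_bits, len(program) * full_bits

# Toy program: many repeated instruction words, a few unique ones.
program = [0xDEAD0000] * 6 + [0x12345678, 0x0BADF00D]
d, new_bits, old_bits = compress(program)
print(old_bits, new_bits)  # 256 72
```

The win depends entirely on how skewed the instruction-word distribution is, which is why real schemes compress templates (opcode patterns with operand fields factored out) rather than whole words.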

    Flexible Sensor Network Reprogramming for Logistics

    Besides the currently realized applications, Wireless Sensor Networks can be put to use in logistics processes. However, doing so requires a level of flexibility and safety not provided by current WSN software platforms. This paper discusses a logistics scenario and presents SensorScheme, a runtime environment used to realize this scenario, based on the semantics of the Scheme programming language. SensorScheme is a general purpose WSN platform, providing dynamic reprogramming, memory safety (sandboxing), blocking I/O, marshalled communication, and compact code transport. It improves on the state of the art by making better use of the little available memory, thereby providing greater capability in terms of program size and complexity. We illustrate the use of our platform with some application examples, and provide experimental results to show its compactness, speed of operation and energy efficiency.
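The safety property the abstract mentions comes from Scheme's homoiconicity: programs arrive as plain data (nested lists), so the runtime interprets them without ever executing host code directly. A minimal s-expression evaluator in Python sketches the idea; the operator set and the example program are illustrative, not SensorScheme's actual API.

```python
# Minimal s-expression evaluator sketch: programs are nested lists, so the
# interpreter fully controls what can run (the essence of sandboxing).
import operator

ENV = {"+": operator.add, "-": operator.sub, "*": operator.mul,
       "<": operator.lt}

def evaluate(expr, env=ENV):
    if isinstance(expr, (int, float)):
        return expr
    if isinstance(expr, str):
        return env[expr]                      # symbol lookup, nothing else
    head, *args = expr
    if head == "if":                          # special form: lazy branches
        cond, then, alt = args
        return evaluate(then if evaluate(cond, env) else alt, env)
    fn = evaluate(head, env)
    return fn(*(evaluate(a, env) for a in args))

# (if (< 20 25) (* 2 3) 0) — e.g. act only while a reading is below a threshold
result = evaluate(["if", ["<", 20, 25], ["*", 2, 3], 0])
print(result)  # 6
```

Because code is data, the same representation doubles as a compact over-the-air transport format, which is how a Scheme-based runtime can support dynamic reprogramming on memory-constrained nodes.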

    Fast Fourier transforms on energy-efficient application-specific processors

    Many of the current applications used in battery powered devices are from the digital signal processing, telecommunication, and multimedia domains. Traditionally, application-specific fixed-function circuits have been used in these designs in the form of application-specific integrated circuits (ASIC) to reach the required performance and energy-efficiency. The complexity of these applications has increased over the years, and the design complexity has increased even faster, which implies increased design time. At the same time, there are more and more standards to be supported, so using optimised fixed-function implementations for all the functions in all the standards is impractical. The non-recurring engineering costs for integrated circuits have also increased significantly, so manufacturers can afford only a few chip iterations. Although tailoring the circuit for a specific application provides the best performance and/or energy-efficiency, such an approach lacks flexibility. For example, if an error is found after manufacturing, an expensive chip iteration is required. In addition, new functionalities cannot be added afterwards to support the evolution of standards. Flexibility can be obtained with software based implementation technologies. Unfortunately, general-purpose processors do not provide the energy-efficiency of fixed-function circuit designs. A useful trade-off between flexibility and performance is an implementation based on application-specific processors (ASP), where programmability provides the flexibility and computational resources customised for the given application provide the performance. In this thesis, application-specific processors are considered by using the fast Fourier transform as the representative algorithm. The architectural template used here is the transport triggered architecture (TTA), which resembles very long instruction word machines, but the operand execution resembles data flow machines rather than traditional operand triggering. 
The developed TTA processors exploit the inherent parallelism of the application. In addition, several characteristics of the application have been identified and exploited by developing customised functional units that speed up the execution. Several customisations are proposed for the data path of the processor, but it is also important to match the memory bandwidth to the computation speed. This calls for a memory organisation supporting parallel memory accesses. The proposed optimisations have been used to improve the energy-efficiency of the processor, and experiments show that a programmable solution can have energy-efficiency comparable to fixed-function ASIC designs.
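The algorithm family behind these processors is the Cooley–Tukey FFT, whose divide-and-conquer structure is the source of the inherent parallelism the abstract mentions. A minimal recursive radix-2 sketch in Python shows the butterfly structure; a real TTA implementation would use fixed-point arithmetic and custom functional units rather than Python floats.

```python
# Minimal recursive radix-2 Cooley–Tukey FFT sketch.
import cmath

def fft(x):
    """Compute the DFT of x; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return x[:]
    even = fft(x[0::2])                 # DFT of even-indexed samples
    odd = fft(x[1::2])                  # DFT of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(-2j * cmath.pi * k / n) * odd[k]   # twiddle factor
        out[k] = even[k] + tw           # butterfly: top half
        out[k + n // 2] = even[k] - tw  # butterfly: bottom half
    return out

# Impulse input: its DFT is flat (all ones).
spectrum = fft([1, 0, 0, 0])
print(spectrum)
```

Each recursion level contains n/2 independent butterflies, which is exactly the structure that custom functional units and parallel memory banks can exploit.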

    Network-driven handover in 5G

    Currently, users’ expectations regarding technological performance are constantly increasing. An example of this is the growing consumption of multimedia content via the Internet. Multimedia applications with a variable number of users/requests have variable demand over time, which may expose the limitations of the network channels. This may cause a problem of demand mobility generated by the service/application. Each generation of mobile networks has specific handover processes, which in the case of 4G can be controlled according to the applications’ requirements, with the possibility of multiconnectivity. This process became widespread in 5G. The main contribution of this dissertation is the development and analysis of decision models for controlling the video streaming and the association of users to a BS in the network architecture. The scenario considered is a football stadium with multiple points of view – video streams – that each spectator can request to view on their cell phone or tablet. The developed simulator models the stadium scenario using a combination of services, which occur on the 5G network. Network-initiated vertical handover is used, aided by network slicing. Network slicing handles the division of bandwidth between the different antennas and allows the throughput of the different broadcast (FeMBMS) channels to be controlled by the service; the radio network capacity limits the throughput. The results obtained for a case of 80,000 spectators who select different streams over time, considering 8 base stations (BS), show that the quality of experience is high only when the handover and the control of beam broadcasting by the BS are managed according to the application requirements. The network recovers from huge peaks by handling as many requests at once as possible. 
Instead of the user either getting the stream in good quality or not getting it at all, the network performs a best-effort downgrade of the multicast quality in order to expend fewer resources on the same quantity of requests. The network state is always taken into consideration: although there are load peaks on the network, it is never congested.
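The quality-downgrade policy this abstract describes can be sketched as a capacity check: when a base station's radio capacity cannot serve every requested stream at full quality, the network serves all requests at a reduced bitrate instead of refusing some users. The capacities and bitrates below are illustrative, not taken from the dissertation.

```python
# Sketch of a best-effort quality-downgrade decision for multicast streams.
QUALITY_BITRATES = [8.0, 4.0, 2.0]   # Mbit/s per stream: high, medium, low

def choose_quality(num_streams, capacity_mbps):
    """Pick the highest quality level whose total load fits the capacity.

    Returns the chosen bitrate, or None if even the lowest does not fit.
    """
    for rate in QUALITY_BITRATES:
        if num_streams * rate <= capacity_mbps:
            return rate
    return None

print(choose_quality(10, 100.0))  # 8.0 — full quality fits
print(choose_quality(10, 50.0))   # 4.0 — downgraded, but all users served
```

This captures the abstract's claim that the network is never congested: load is bounded by construction, because the admitted aggregate bitrate never exceeds the slice's capacity.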

    Hardware Optimizations for Low-Power Processors

    In the design of modern processors, taking power consumption into account is important. Small processor systems, such as mobile devices, benefit from power optimization in the form of longer battery life. Optimization also helps meet the often strict thermal design constraints of mobile devices. Power optimizations can be made at all abstraction levels of the design. At the architectural level, choosing the optimal architecture is often not straightforward. Processors based on the transport triggered architecture exploit instruction-level parallelism efficiently and are a good choice for low-power applications. Using them, a processor designer can implement various power and performance optimizations, some of which are unique to the transport triggered architecture. In this Master's thesis, a literature review of the most commonly used power and performance optimizations was first conducted. Four of these were implemented in the TTA-based Co-design Environment (TCE), developed at Tampere University of Technology, which enables the design and programming of TTA processors. To analyze their effects, the optimizations were applied to three processor cores designed for different purposes, which were synthesized with Synopsys Design Compiler. All cores benefited from the optimizations, achieving in the best case a 26% reduction in power consumption with a 3% increase in area.

    Dagstuhl News January - December 2008

    "Dagstuhl News" is a publication edited especially for the members of the Foundation "Informatikzentrum Schloss Dagstuhl" to thank them for their support. The News gives a summary of the scientific work being done in Dagstuhl. Each Dagstuhl Seminar is presented by a small abstract describing the contents and scientific highlights of the seminar as well as the perspectives or challenges of the research topic.

    NSMS probe recorder design and development

    The real-time Non-Contact Stress Measurement System (NSMS) currently used at AEDC calculates the vibration of rotating blades by capturing the time of arrival for each blade. The time of arrival is determined by a triggering circuit that is activated when the signal from the engine probe crosses a predetermined threshold. In its current configuration, the NSMS system only saves post-processed data. A system that records the raw signals from the probes was developed to allow reprocessing the data whenever necessary. The probe recorder system consists of analog-to-digital conversion hardware to capture the signals, data storage for the files, and digital-to-analog hardware to replay the signals. The system accommodates a maximum of 32 channels, a maximum sampling rate of 20 MHz, and a total bandwidth of up to 160 megabytes per second. Sixteen-bit resolution is used in digitizing the analog waveforms to minimize quantization errors. The incoming data is transferred using FPDP, capable of 160 MB/sec, and PCI-X, capable of 528 MB/sec. Large amounts of high speed (3200 MB/sec) random access memory coupled with two dual-core processors were included for data transfer buffering and program execution. As the final destination, a RAID array connected to a PCI Express interface was implemented for 240 MB/sec data storage. Laboratory tests were conducted on the system to verify performance. The RAID array exceeded expectations for disk writing but reduced bandwidth was observed for read operations. The relationship between the input analog signals and the reproduced waveforms was checked and, except for one case, performed identically to the simulated system transfer function. Long duration tests were performed to verify data transfers at the maximum settings and proved that the system could operate continuously without data loss. Due to the large amounts of data, a brief study of offline compression techniques was conducted. 
Lossy compression was investigated but was not implemented at this time due to unwanted distortion and loss of critical data. Lossless compression using WinZip was implemented as a compromise between ideal compression ratios and data retention expectations.
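The recorder's bandwidth budget follows from simple arithmetic: each sample is 16 bits (2 bytes), so aggregate throughput is channels × sampling rate × 2 bytes. The stated 160 MB/s ceiling means channel count and sampling rate trade off against each other; the specific configurations below are illustrative, not from the thesis.

```python
# Quick arithmetic sketch of the probe recorder's bandwidth budget.
BYTES_PER_SAMPLE = 2          # 16-bit resolution
FPDP_LIMIT = 160e6            # bytes/s, the stated total bandwidth

def throughput(channels, rate_hz):
    """Aggregate recorder throughput in bytes per second."""
    return channels * rate_hz * BYTES_PER_SAMPLE

print(throughput(4, 20e6) <= FPDP_LIMIT)    # True: 4 ch at 20 MHz = 160 MB/s
print(throughput(32, 2.5e6) <= FPDP_LIMIT)  # True: 32 ch at 2.5 MHz = 160 MB/s
print(throughput(32, 20e6) <= FPDP_LIMIT)   # False: would need 1280 MB/s
```

This also shows why all 32 channels cannot run at the maximum 20 MHz rate simultaneously: the two maxima are independent limits, bounded jointly by the 160 MB/s transport.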

    Cloud Rule-based System for Analysis of IoT Data in a Big Data Context

    Nowadays, enormous amounts of information are produced on a daily basis by sensors. After being analysed, this information is transformed from simple data into knowledge which, in itself, can be an asset to those able to take advantage of it. An example of this situation is the data generated by sensors installed on trains, which can be analysed for different ends, one of which is the condition-based maintenance of trains. Condition-based maintenance takes advantage of data to understand the current state of mechanical equipment, avoiding unnecessary replacements or preventing accidents resulting from late maintenance. This dissertation presents an architecture that integrates a rule-based system running on cloud applications, which analyses all the data being acquired by the trains’ sensors so that, whenever a specific set of conditions is met, alerts are activated and the train operators, the mechanics in charge, and all their staff know how to proceed. This architecture is to be created in a cloud environment since, with this vast amount of data being generated, such highly scalable environments ensure that data processing performance is not compromised and that all the data is analysed in a timely manner, taking advantage of the available computational resources. The process of creating this architecture is demonstrated step by step, and the test results are presented and analysed.
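The core of a rule-based system like the one this abstract describes is a set of condition/alert pairs evaluated against incoming sensor readings. A minimal sketch follows; the rule names, field names, and thresholds are illustrative, not taken from the dissertation.

```python
# Minimal rule-engine sketch: each rule pairs a condition over a sensor
# reading with an alert name; a reading fires every rule it satisfies.
def evaluate_rules(reading, rules):
    """Return the names of all rules whose condition the reading satisfies."""
    return [name for name, condition in rules if condition(reading)]

RULES = [
    ("axle_overheat", lambda r: r["axle_temp_c"] > 90),
    ("excess_vibration", lambda r: r["vibration_g"] > 3.5),
]

alerts = evaluate_rules({"axle_temp_c": 95, "vibration_g": 1.2}, RULES)
print(alerts)  # ['axle_overheat']
```

In the cloud deployment the abstract outlines, this evaluation step is what gets scaled horizontally: readings are partitioned across workers, each applying the same rule set, so throughput grows with the number of nodes.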

    Network-on-Chip -based Multi-Processor System-on-Chip: Towards Mixed-Criticality System Certification

    The abstract is in the attachment.