1,235 research outputs found

    Fault tolerant architectures for integrated aircraft electronics systems

    Work into possible architectures for future flight control computer systems is described. Ada for Fault-Tolerant Systems, the NETS Network Error-Tolerant System architecture, and voting in asynchronous systems are covered

    Static worst-case execution time optimization using DPSO for ASIP architecture

    Introduction: Application-specific instructions significantly improve the energy consumption, performance, and code size of configurable processors. These instructions are designed by converting patterns related to application-specific operations into effective complex instructions. This research was presented at the ICITKM Conference, University of Delhi, India, in 2017.
    Methods: Static analysis was a prominent research method during the late 1980s, whereas end-to-end measurement is the standard approach in industrial settings. Static analysis tools work at a high level to determine the program structure, operating either on the source code or on a disassembled executable binary. Working at a low level is possible if the real hardware timing information for the executable task has the desired features.
    Results: We experimented, tested, and evaluated using an H.264 encoder application that uses nine CIs, covering most of the computation-intensive kernels. Multimedia applications are frequently subject to hard real-time constraints in the field of computer vision. The H.264 encoder has a complicated control flow with a large number of decisions and nested loops. The parameters evaluated were different numbers of A partitions (300 slices each on a Xilinx Virtex 7), reconfiguration bandwidths, and ratios of CPU frequency to fabric frequency, f_CPU/f_fabric. f_fabric remains constant at 100 MHz, and we selected multiples of it for f_CPU that resemble realistic values. Note that while we expect the WCET in seconds (WCET cycles / f_CPU) to be lower (better) with higher f_CPU, the WCET in cycles increases (at constant f_fabric) because hardware CIs perform fewer computations on the reconfigurable fabric within one CPU cycle.
    Conclusions: The method is similar to a hybridization of tree-based and path-based methods, which are less precise, and the global IPET method, which is more precise. The optimization is evaluated with the discrete particle swarm optimization (DPSO) algorithm for WCET. For several real-world applications involving embedded processors, the proposed technique develops improved instruction sets compared with the native instruction sets.
    Originality: WCET estimation must consider the flow analysis, low-level analysis, and calculation phases of the program. The flow analysis (or high-level analysis) phase helps extract the dynamic behavior of the program, providing information about the functions called, the number of loop iterations, the dependencies between if statements, and so on. This is because the analysis does not know the execution path corresponding to the longest execution time.
    Limitations: This path is executed within one iteration of the kernel and depends on the nature of the macroblock (MB), either I-MB or P-MB, as determined by the motion estimation kernel. That is, its input depends on the I-MB and P-MB paths, which also contain separate CIs, and this leads to instability of the worst-case path. In other words, adding more partitions to the current worst-case path may cause the other path to become the worst case. The pipeline stalls during the reconfiguration delay and resumes on entering the kernel as soon as the reconfiguration process finishes
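The relation between WCET in cycles and WCET in seconds stated in the Results can be illustrated with a small sketch. The 100 MHz fabric clock comes from the abstract; the cycle counts, the function names, and the simple additive latency model are hypothetical and are not taken from the paper.

```python
F_FABRIC_HZ = 100_000_000  # fabric clock, constant at 100 MHz (from the abstract)


def ci_latency_cpu_cycles(fabric_cycles, f_cpu_hz, f_fabric_hz=F_FABRIC_HZ):
    """CPU cycles that elapse while a hardware CI runs for `fabric_cycles`
    cycles of the fabric clock (integer ceiling division)."""
    return -(-fabric_cycles * f_cpu_hz // f_fabric_hz)


def wcet(software_cycles, ci_fabric_cycles, f_cpu_hz):
    """Return (WCET in CPU cycles, WCET in seconds) for a toy task made of
    a software part plus one hardware CI. Assumed model, for illustration."""
    cycles = software_cycles + ci_latency_cpu_cycles(ci_fabric_cycles, f_cpu_hz)
    return cycles, cycles / f_cpu_hz


# Hypothetical workload: 10,000 software cycles plus a CI needing 2,000 fabric cycles.
slow_cycles, slow_seconds = wcet(10_000, 2_000, 100_000_000)
fast_cycles, fast_seconds = wcet(10_000, 2_000, 400_000_000)
print(slow_cycles, slow_seconds)
print(fast_cycles, fast_seconds)
```

With the faster CPU clock the cycle count grows, because the same fabric work spans more CPU cycles, while the time in seconds still drops.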

    Airborne Directional Networking: Topology Control Protocol Design

    This research identifies and evaluates the impact of several architectural design choices in relation to airborne networking in contested environments related to autonomous topology control. Using simulation, we evaluate topology reconfiguration effectiveness using classical performance metrics for different point-to-point communication architectures. Our attention is focused on the design choices which have the greatest impact on reliability, scalability, and performance. In this work, we discuss the impact of several practical considerations of airborne networking in contested environments related to autonomous topology control modeling. Using simulation, we derive multiple classical performance metrics to evaluate topology reconfiguration effectiveness for different point-to-point communication architecture attributes for the purpose of qualifying protocol design elements

    Timestamp-Based Approach for the Detection and Resolution of Mutual Conflicts in Distributed Systems

    We present a timestamp-based algorithm for detecting both write-write and read-write conflicts on a single file in distributed systems during network partitions. The algorithm allows operations to occur in different network partitions simultaneously. When sites from different partitions merge, the algorithm detects and resolves both read-write and write-write conflicts without taking the semantics of the transactions into account. Reconciliation steps are also proposed for resolving the conflicts once they have been detected. The algorithm will be useful in real-time systems where timeliness of operations is more important than response time (delayed commit
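As a rough illustration of conflict detection at merge time, the sketch below compares the operation logs of two partitions for a single file. It is not the authors' algorithm: the `Op` record, the conflict rules (any pair of writes conflicts; a read conflicts with an earlier write it could not have seen), and the last-writer-wins resolution are all assumptions chosen for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    partition: str   # hypothetical partition label
    kind: str        # "r" for read, "w" for write
    timestamp: int   # assumed globally comparable timestamp


def detect_conflicts(log_a, log_b):
    """Pairwise comparison of operations logged in two partitions."""
    conflicts = []
    for a in log_a:
        for b in log_b:
            if a.kind == "w" and b.kind == "w":
                conflicts.append(("write-write", a, b))
            elif a.kind == "w" and b.kind == "r" and b.timestamp > a.timestamp:
                # b read the file after a's write but could not see it
                conflicts.append(("read-write", a, b))
            elif a.kind == "r" and b.kind == "w" and a.timestamp > b.timestamp:
                conflicts.append(("read-write", a, b))
    return conflicts


def resolve_last_writer_wins(log_a, log_b):
    """Assumed reconciliation rule: keep the write with the latest timestamp."""
    writes = [op for op in log_a + log_b if op.kind == "w"]
    return max(writes, key=lambda op: op.timestamp, default=None)


log_a = [Op("A", "w", 5)]
log_b = [Op("B", "r", 7), Op("B", "w", 9)]
for kind, a, b in detect_conflicts(log_a, log_b):
    print(kind, a, b)
print(resolve_last_writer_wins(log_a, log_b))
```

Note that the comparison uses only timestamps and operation kinds, in line with the abstract's point that transaction semantics are not consulted.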

    Drowsy cache partitioning for reduced static and dynamic energy in the cache hierarchy

    Power consumption in computing has led the industry towards energy-efficient computing. As transistor technology shrinks, new techniques have to be developed to keep leakage current, the dominant portion of overall power consumption, to a minimum. Because of the large number of transistors devoted to the cache hierarchy, the cache provides an excellent avenue for dramatically reducing power usage. The inherent danger is that power-saving techniques can negatively affect the primary reason for including the cache: performance. This thesis proposes a modification to the cache hierarchy that dramatically saves power with only a slight reduction in performance. The design takes advantage of the overwhelming preference of memory accesses for the most recently used blocks by placing these blocks into a small, fast-access A partition. The rest of the cache is put into a drowsy mode, a state-preserving technique that reduces leakage power within the remaining portion of the cache. This design was implemented within a private second-level cache and achieved an average of almost 20% dynamic energy savings and nearly 45% leakage energy savings, while incurring an average performance penalty of only 2%
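The idea of keeping the most recently used blocks awake while the rest of the cache stays drowsy can be sketched with a toy model. This is not the thesis design: the class name, the fully associative LRU organization, and the three outcome labels are assumptions made to keep the example short.

```python
from collections import OrderedDict


class DrowsyPartitionedCache:
    """Toy fully associative cache: the `awake_blocks` most recently used
    blocks form the fast partition; all other resident blocks are drowsy."""

    def __init__(self, total_blocks, awake_blocks):
        self.total_blocks = total_blocks
        self.awake_blocks = awake_blocks
        self.blocks = OrderedDict()  # ordered from least to most recently used
        self.stats = {"awake_hit": 0, "drowsy_hit": 0, "miss": 0}

    def access(self, addr):
        if addr in self.blocks:
            keys = list(self.blocks)
            rank = len(keys) - 1 - keys.index(addr)  # 0 = most recently used
            outcome = "awake_hit" if rank < self.awake_blocks else "drowsy_hit"
            self.blocks.move_to_end(addr)  # block becomes most recently used
        else:
            outcome = "miss"
            if len(self.blocks) >= self.total_blocks:
                self.blocks.popitem(last=False)  # evict least recently used
            self.blocks[addr] = True
        self.stats[outcome] += 1
        return outcome


cache = DrowsyPartitionedCache(total_blocks=4, awake_blocks=1)
for addr in [1, 2, 2, 1, 1]:
    print(addr, cache.access(addr))
print(cache.stats)
```

In this model a drowsy hit stands for an access that would pay a wake-up penalty, which is where the small performance cost reported in the abstract would come from.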

    Atomic Broadcast in Heterogeneous Distributed Systems

    Communication services have long been recognized as having a dominant effect on both the performance and the robustness of distributed systems. Distributed applications rely on a multitude of protocols to support these services. Of crucial importance are multicast protocols: reliable multicast protocols enhance the efficiency and robustness of distributed systems. Numerous reliable multicast protocols have been proposed, each differing in the set of assumptions adopted, especially about the communication network. These assumptions make each protocol suitable for a specific environment. The presence of different distributed applications that run on different LANs, and of single distributed applications that span different LANs, mandates interaction between the protocols on these LANs. This interaction is driven by the need for cooperation between individual applications. The state of the art in reliable multicast protocols is inadequate for multicasting in interconnected LANs. Progress in development methodology for efficient and robust LAN software has not been matched by similar advances for WANs. High latency, lower bandwidth, a higher probability of partitions, and frequent loss of messages are the main barriers. In our work, we propose a global standard protocol that orchestrates cooperation between the different reliable broadcast protocols that run on different LANs. Our objective is to support a reliable ordered delivery service for inter-LAN messages and to make the fullest use of the underlying local communication services. Our protocol suite accommodates the existence of LANs managed by autonomous authorities. To uphold this autonomy (as a de facto condition), LANs under different authorities must be able to adopt different ordering criteria for group multicasting. The developed suite assumes an environment in which multicasting groups can have members that belong to different LANs; each group can adopt either total or causal order for message delivery to its members. We also recognize the need for interaction between different reliable multicasting protocols. This interaction is a necessity in an autonomous environment in which each local authority selects a protocol that suits its individual needs. Our protocols are capable of interacting with any reliable protocol that achieves a causal order as well as with all timestamp-based total-order protocols. They can also be used as a medium for interaction between existing reliable multicasting protocols, which opens new avenues for interoperability between such protocols. Finally, our protocol suite has a communication structure that can be aligned with the actual routing topology, which greatly reduces the number of protocol messages required
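For readers unfamiliar with causal order delivery, the sketch below shows the standard vector-clock delivery test. It is a generic textbook technique, not the protocol suite described in this abstract; the function names and the message tuple layout `(sender, clock, payload)` are assumptions.

```python
def can_deliver(msg_clock, sender, local_clock):
    """A message is deliverable when it is the next one expected from its
    sender and the receiver has already seen everything the sender had seen."""
    for k, (m, l) in enumerate(zip(msg_clock, local_clock)):
        if k == sender:
            if m != l + 1:
                return False
        elif m > l:
            return False
    return True


def deliver_in_causal_order(local_clock, pending):
    """Repeatedly deliver any pending message that passes the test.
    Returns the delivered payloads in order and the messages still held back."""
    local = list(local_clock)
    delivered = []
    remaining = list(pending)
    progress = True
    while progress:
        progress = False
        for msg in list(remaining):
            sender, clock, payload = msg
            if can_deliver(clock, sender, local):
                local[sender] = clock[sender]
                delivered.append(payload)
                remaining.remove(msg)
                progress = True
    return delivered, remaining


# The reply arrives first but causally depends on the question.
pending = [(1, (1, 1, 0), "reply"), (0, (1, 0, 0), "question")]
print(deliver_in_causal_order([0, 0, 0], pending))
```

A total-order protocol would additionally force every member to deliver all messages in the same sequence, which this test alone does not guarantee.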

    Embedded electronic systems driven by run-time reconfigurable hardware

    This doctoral thesis addresses the design of embedded electronic systems based on run-time reconfigurable hardware technology (available through SRAM-based FPGA/SoC devices), with the aim of improving people's quality of life. The work investigates the system architecture and the reconfiguration engine that gives the FPGA the capability of dynamic partial reconfiguration of its programmable resources. The goal is to synthesize, by means of hardware/software co-design, a given application partitioned into processing tasks that are multiplexed in time and space, thereby optimizing its physical implementation (silicon area, processing time, complexity, flexibility, functional density, cost, and power consumption) in comparison with alternatives based on static hardware (MCU, DSP, GPU, ASSP, ASIC, etc.). The design flow of this technology is evaluated by prototyping several engineering applications (control systems, mathematical coprocessors, complex image processors, etc.), showing a level of maturity high enough for its exploitation in industry

    Topology Agnostic Methods for Routing, Reconfiguration and Virtualization of Interconnection Networks

    Modern computing systems, such as supercomputers, data centers and multicore chips, generally require efficient communication between their different system units; tolerance towards component faults; flexibility to expand or merge; and a high utilization of their resources. Interconnection networks are used in a variety of such computing systems to enable communication between their diverse system units. The main objectives of this thesis are to investigate and propose new or improved solutions for topology agnostic routing and reconfiguration of interconnection networks. In addition, topology agnostic routing and reconfiguration algorithms are utilized in the development of new and flexible approaches to processor allocation. The thesis aims to present versatile solutions that can be used for the interconnection networks of a number of different computing systems. No particular routing algorithm was specified for an interconnection network technology which is now incorporated in Dolphin Express. The thesis states a set of criteria for a suitable routing algorithm, evaluates a number of existing routing algorithms, and recommends that one of the algorithms, which fulfils all of the criteria, be used. Further investigations demonstrate how this routing algorithm inherently supports fault tolerance, and how it can be optimized for some network topologies. These considerations are also relevant for the InfiniBand interconnection network technology. Reconfiguration of interconnection networks (change of routing function) is a deadlock-prone process. Some existing reconfiguration strategies include deadlock avoidance mechanisms that significantly reduce the network service offered to running applications. The thesis expands the area of application for one of the most versatile and efficient reconfiguration algorithms available in the literature, and proposes an optimization of this algorithm that improves the network service offered to running applications. 
Moreover, a new reconfiguration algorithm is presented that supports a replacement of the routing function without causing performance penalties. Processor allocation strategies that guarantee traffic-containment commonly pose strict requirements on the shape of partitions, and thus achieve only a limited utilization of a system’s computing resources. The thesis introduces two new approaches that are more flexible. Both approaches utilize the properties of a topology agnostic routing algorithm in order to enforce traffic-containment within arbitrarily shaped partitions. Consequently, a high resource utilization as well as isolation of traffic between different partitions is achieved