    SAMIE-LSQ: set-associative multiple-instruction entry load/store queue

    The load/store queue (LSQ) is one of the most complex parts of contemporary processors. Its latency is critical for processor performance, and it is usually one of the processor's hotspots. This paper presents a highly banked, set-associative, multiple-instruction entry LSQ (SAMIE-LSQ) that achieves high performance with small energy requirements. The SAMIE-LSQ classifies memory instructions (loads and stores) by the address to be accessed and groups instructions that access the same cache line into the same entry. The approach relies on the fact that many in-flight memory instructions access the same cache lines. Each SAMIE-LSQ entry has space for several memory instructions accessing the same cache line. This arrangement has a number of advantages. First, it significantly reduces the address comparison activity needed for memory disambiguation, since there are fewer addresses to compare. It also reduces activity in the data TLB, the cache tag array and the cache data array. This is achieved by caching the cache line location and the address translation in the corresponding SAMIE-LSQ entry once one of the instructions in that entry has performed its access, so instructions that share an entry can reuse the translation, skip the tag check and read the data directly from the known cache way without checking the others. In addition, the delay of the proposed scheme is lower than that of a conventional LSQ. We show that the SAMIE-LSQ saves 82% of the dynamic energy of the load/store queue, 42% for the L1 data cache and 73% for the data TLB, with a negligible impact on performance (0.6%). Peer reviewed. Postprint (published version).
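
    The grouping idea in this abstract can be illustrated with a minimal software sketch. It is not the paper's hardware design: the 64-byte line size, the function name and the sample addresses are all assumptions made for illustration. It only shows how in-flight accesses that fall in the same cache line collapse into one entry, so a new access is compared against entries rather than against every instruction.

```python
from collections import defaultdict

LINE_BYTES = 64  # assumed cache line size; the abstract does not state one


def group_by_cache_line(addresses, line_bytes=LINE_BYTES):
    """Group in-flight memory accesses by the cache line they touch.

    Returns a dict mapping line number -> list of instruction sequence
    numbers (positions in `addresses`). Hypothetical helper, for illustration.
    """
    entries = defaultdict(list)
    for seq, addr in enumerate(addresses):
        entries[addr // line_bytes].append(seq)
    return dict(entries)


# Six hypothetical in-flight accesses that touch only two cache lines.
in_flight = [0x1000, 0x1008, 0x1010, 0x2000, 0x2004, 0x1038]
entries = group_by_cache_line(in_flight)

# A conventional queue would compare a new address with all six instructions;
# with grouping, one comparison per entry (line) is enough.
print(len(in_flight), "instructions in", len(entries), "entries")
```

    In this toy run six instructions occupy two entries, which is the effect the abstract credits for the reduced comparison activity.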

    Designing Fault-Injection Experiments for the Reliability of Embedded Systems

    This paper considers the long-standing problem of conducting fault-injection experiments to establish the ultra-reliability of embedded systems. There have been extensive efforts in fault injection, and this paper offers a partial summary of them, but these previous efforts have focused on realism and efficiency. Fault injection has been used to examine diagnostics and to test algorithms, but the literature does not contain any framework that says how to conduct fault-injection experiments to establish ultra-reliability. A solution to this problem integrates field data, arguments from design, and fault injection into a seamless whole. The solution in this paper is to derive a model reduction theorem for a class of semi-Markov models suitable for describing ultra-reliable embedded systems. The derivation shows that a tight upper bound on the probability of system failure can be obtained using only the means of the system-recovery times, thus reducing the experimental effort to estimating a reasonable number of easily observed parameters. The paper includes an example of a system subject to both permanent and transient faults. There is a discussion of integrating fault injection with field data and arguments from design.

    Management of an intelligent argumentation network for a web-based collaborative engineering design environment

    Conflict resolution is one of the most challenging tasks in collaborative engineering design. In previous research, a web-based intelligent collaborative system was developed to address this challenge using intelligent computational argumentation. However, two important issues were not resolved in that system: the priority of participants and self-conflicting arguments. In this thesis, two methods are developed for incorporating the priorities of participants into the computational argumentation network: 1) weighted summation and 2) re-assessment of the strengths of arguments based on the priority of each argument's owner, using fuzzy logic inference. In addition, a method for detecting self-conflicting arguments was developed. (Abstract, page iii)

    Biologically inspired evolutionary temporal neural circuits

    Biological neural networks have always motivated the creation of new artificial neural networks, and in this case a new autonomous temporal neural network system. Among the more challenging problems of temporal neural networks are the design and incorporation of short- and long-term memories as well as the choice of network topology and training mechanism. In general, delayed copies of network signals can form short-term memory (STM), providing a limited temporal history of events similar to FIR filters, whereas the synaptic connection strengths as well as delayed feedback loops (IIR circuits) can constitute longer-term memories (LTM). This dissertation introduces a new general evolutionary temporal neural network framework (GETnet) through automatic design of arbitrary neural networks with STM and LTM. GETnet is a step towards the realization of general intelligent systems that need minimal or no human intervention and can be applied to a broad range of problems. GETnet utilizes nonlinear moving average/autoregressive nodes and sub-circuits that are trained by enhanced gradient descent and evolutionary search over the architecture, synaptic delay, and synaptic weight spaces. The mixture of Lamarckian and Darwinian evolutionary mechanisms facilitates the Baldwin effect and speeds up the hybrid training. The ability to evolve arbitrary adaptive time-delay connections enables GETnet to find novel answers to many classification and system identification tasks expressed in the general form of desired multidimensional input and output signals. Simulations using the Mackey-Glass chaotic time series and fingerprint perspiration-induced temporal variations are given to demonstrate the above-stated capabilities of GETnet.
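
    The distinction this abstract draws between delayed copies of a signal (a limited history, like an FIR filter) and delayed feedback loops (a longer-lasting memory) can be seen in a small sketch. This is not GETnet itself; the function names, tap weights and feedback gain are assumptions chosen only to make the contrast visible on a unit impulse.

```python
from collections import deque


def fir_response(signal, weights):
    """Weighted sum over a tapped delay line: memory lasts len(weights) steps."""
    taps = deque([0.0] * len(weights), maxlen=len(weights))
    out = []
    for x in signal:
        taps.appendleft(x)  # newest sample first; the oldest one falls off
        out.append(sum(w * t for w, t in zip(weights, taps)))
    return out


def feedback_response(signal, gain):
    """Single delayed feedback loop: y[n] = x[n] + gain * y[n-1]."""
    y = 0.0
    out = []
    for x in signal:
        y = x + gain * y
        out.append(y)
    return out


impulse = [1.0, 0.0, 0.0, 0.0, 0.0]
print(fir_response(impulse, [0.5, 0.5]))  # trace of the impulse ends after two steps
print(feedback_response(impulse, 0.5))    # trace decays but never reaches zero
```

    The delay line forgets the impulse as soon as it leaves the last tap, while the feedback loop keeps a decaying trace of it, which is the sense in which feedback can serve as a longer-term memory.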

    Applications and implementation of neuro-connectionist architectures.

    by H.S. Ng. Thesis (M.Phil.), Chinese University of Hong Kong, 1996. Includes bibliographical references (leaves 91-97).
    Contents:
    Chapter 1: Introduction
        1.1 Introduction
        1.2 Neuro-connectionist Network
    Chapter 2: Related Works
        2.1 Introduction
            2.1.1 Kruskal's algorithm
            2.1.2 Prim's algorithm
            2.1.3 Sollin's algorithm
            2.1.4 Bellman-Ford algorithm
            2.1.5 Floyd-Warshall algorithm
    Chapter 3: Binary Relation Inference Network and Path Problems
        3.1 Introduction
        3.2 Topology
        3.3 Network structure
            3.3.1 Single-destination BRIN architecture
            3.3.2 Comparison between all-pair BRIN and single-destination BRIN
        3.4 Path Problems and BRIN Solution
            3.4.1 Minimax path problems
            3.4.2 BRIN solution
    Chapter 4: Analog and Voltage-mode Approach
        4.1 Introduction
        4.2 Analog implementation
        4.3 Voltage-mode approach
            4.3.1 The site function
            4.3.2 The unit function
            4.3.3 The computational unit
        4.4 Conclusion
    Chapter 5: Current-mode Approach
        5.1 Introduction
        5.2 Current-mode approach for analog VLSI Implementation
            5.2.1 Site and Unit output function
            5.2.2 Computational unit
            5.2.3 A complete network
        5.3 Conclusion
    Chapter 6: Neural Network Compensation for Optimization Circuit
        6.1 Introduction
        6.2 A Neuro-connectionist Architecture for error correction
            6.2.1 Linear Relationship
            6.2.2 Output Deviation of Computational Unit
        6.3 Experimental Results
            6.3.1 Training Phase
            6.3.2 Generalization Phase
        6.4 Conclusion
    Chapter 7: Precision-limited Analog Neural Network Compensation
        7.1 Introduction
        7.2 Analog Neural Network hardware
        7.3 Integration of analog neural network compensation of connectionist network for general path problems
        7.4 Experimental Results
            7.4.1 Convergence time
            7.4.2 The accuracy of the system
        7.5 Conclusion
    Chapter 8: Transitive Closure Problems
        8.1 Introduction
        8.2 Different ways of implementation of BRIN for transitive closure
            8.2.1 Digital Implementation
            8.2.2 Analog Implementation
        8.3 Transitive Closure Problem
            8.3.1 A special case of maximum spanning tree problem
            8.3.2 Analog approach solution for transitive closure problem
            8.3.3 Current-mode approach solution for transitive closure problem
        8.4 Comparisons between the different forms of implementation of BRIN for transitive closure
            8.4.1 Convergence Time
            8.4.2 Circuit complexity
        8.5 Discussion
    Chapter 9: Critical path problems
        9.1 Introduction
        9.2 Problem statement and single-destination BRIN solution
        9.3 Analog implementation
            9.3.1 Separated building block
            9.3.2 Combined building block
        9.4 Current-mode approach
            9.4.1 Site function, unit output function and a completed network
        9.5 Conclusion
    Chapter 10: Conclusions
        10.1 Summary of Achievements
        10.2 Future development
            10.2.1 Application for financial problems
            10.2.2 Fabrication of VLSI Implementation
            10.2.3 Actual prototyping of Analog Integrated Circuits for critical path and transitive closure problems
            10.2.4 Other implementation platform
            10.2.5 On-line update of routing table inside the router for network communication using BRIN
            10.2.6 Other BRIN's applications
    Bibliography

    Test assignments for a system-level diagnosis algorithm for wireless sensor networks (Assinalamentos de testes para um algoritmo de diagnóstico em nível de sistema para redes de sensores sem fio)

    Abstract: This work compares three approaches to building test assignments for a system-level diagnosis algorithm. The approaches address the problem of detecting false alarms (false positives) in a wireless sensor network in which the sensors monitor the environment in order to raise alarms when certain events occur. Consider a sensor network in which a set of t geographically close sensors send alarm signals to a central unit of the network with greater processing capacity, called the sink, reporting the detection of a given phenomenon. To ensure that the alarms are not false, the sink requests mutual tests among the sensors in the region containing the nodes that reported the alarms. The test results are sent to the sink, which then uses a system-level diagnosis algorithm to identify the faulty sensors. The diagnosis algorithm succeeds at this task if the tests executed by the sensors are sufficient to reach a given diagnosability of the system, which depends on topological properties of the sensor network and on certain conditions from the literature for forming t-diagnosable test assignments. This work presents three testing strategies that ensure that the desired diagnosability of the system is reached with minimized energy consumption. Experimental results evaluate the behavior of the strategies and compare their energy consumption in networks with different topologies and densities, with different values of t, and with varying distances between the sensors that raise alarms.
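
    As a rough illustration of what a condition on a t-diagnosable test assignment can look like, the sketch below checks the two classical necessary conditions from the system-level diagnosis literature (the PMC model): at least 2t + 1 units, and every unit tested by at least t others. This is an assumption about which conditions are meant; the abstract does not name them, and the function name and the example ring-like assignment are hypothetical.

```python
def meets_pmc_necessary_conditions(n_units, tests, t):
    """Check the classical necessary conditions for t-diagnosability.

    n_units: number of units, labelled 0 .. n_units - 1
    tests:   iterable of (tester, tested) pairs, i.e. the test assignment
    t:       maximum number of faulty units to be diagnosed
    """
    testers = {u: set() for u in range(n_units)}
    for tester, tested in tests:
        if tester != tested:  # a unit testing itself does not count
            testers[tested].add(tester)
    enough_units = n_units >= 2 * t + 1
    enough_testers = all(len(s) >= t for s in testers.values())
    return enough_units and enough_testers


# Hypothetical assignment: each of 5 sensors tests its next two neighbours.
n = 5
assignment = [(i, (i + k) % n) for i in range(n) for k in (1, 2)]
print(meets_pmc_necessary_conditions(n, assignment, t=2))
print(meets_pmc_necessary_conditions(n, assignment, t=3))
```

    Fewer tests mean fewer radio exchanges, so an assignment that just meets such conditions is one plausible route to the energy savings the abstract describes.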

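    Design of a readout circuit for a single-photon avalanche diode array for particle physics detectors in liquefied noble gases (Conception d’un circuit de lecture d’une matrice de photodiodes à avalanche monophotonique pour les détecteurs de physique des particules dans les gaz nobles liquéfiés)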

    Liquefied noble gas detectors are playing a growing role in particle physics experiments. The silicon photomultiplier (SiPM) is becoming the photodetector of choice for detecting scintillation light in cryogenic liquids. To address the questions of modern physics, experiments such as the next Enriched Xenon Observatory (nEXO) study neutrinos by attempting to observe neutrinoless double beta decay. Other collaborations focus on dark matter by examining various signatures in argon. Building these detectors presents several design challenges. For example, the radioactivity of the materials used must be controlled to limit spurious scintillation. In addition, their large surface area requires in situ instrumentation electronics. However, operating SiPMs and their readout circuit in noble liquids limits the allowable power, so as to avoid boiling the liquid. Despite their strengths, these SiPMs need a readout chain in order to operate, consisting of a preamplifier followed by filtering and an analog-to-digital converter. These circuits can be power-hungry, and several trade-offs degrade, for example, timing performance or the signal-to-noise ratio. Taking advantage of the binary nature of the single-photon avalanche diodes (SPADs) that make up SiPMs, this work presents a new low-power digital readout circuit for a SPAD array. It is dedicated to instrumenting large-area particle physics experiments in liquefied noble gases. A new SPAD process, currently under development, will be bonded onto this electronics through three-dimensional (3D) vertical assembly. The chip interfaces 4096 SPADs distributed over an area of 25 mm². The total chip area is 31 mm², which results in a fill factor of more than 80%. SPADs integrated in two dimensions directly on the integrated circuit allow it to be tested without waiting for the development of the custom SPADs and of the three-dimensional assembly. Three outputs provide complementary information. First, an interrupt (flag) output with a timing resolution better than 90 ps RMS indicates the presence of photons. Next, a digital sum gives the quantity detected; it can operate at up to 100 MHz. Finally, an analog current sum validates the first two outputs. This asynchronous chip can operate with an intermittent clock. In the context of the nEXO experiment, taking the event rate into account, its average power consumption reaches 140 μW. Following the characterization steps, the first revision of this novel photodetector meets the various requirements. Minor imperfections remain, but a next revision will make it possible to correct them easily. This photon-to-digital converter will therefore offer a promising alternative to analog SiPMs.
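
    The fill factor quoted in this abstract follows directly from the two areas it gives. As a quick check, with the symbols A_SPAD (area covered by the 4096 SPADs) and A_chip (total chip area) introduced here only for illustration:

```latex
\[
\text{fill factor} = \frac{A_{\text{SPAD}}}{A_{\text{chip}}}
                   = \frac{25\ \text{mm}^2}{31\ \text{mm}^2}
                   \approx 0.806 ,
\]
```

    that is, roughly 80.6%, consistent with the stated figure of more than 80%.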