746 research outputs found

    Adaptive Intelligent Systems for Extreme Environments

    As embedded processors become more powerful, a growing number of embedded systems equipped with artificial intelligence (AI) algorithms are being used in radiation environments to perform routine tasks and reduce the radiation risk to human workers. Because of their low price, commercial off-the-shelf devices and components are becoming increasingly popular as a way to make such tasks more affordable. At the same time, this presents new challenges: improving radiation tolerance, supporting multiple AI tasks, and delivering power efficiency in embedded systems operating in harsh environments. Three pieces of research work have been completed in this thesis: 1) a fast simulation method for the analysis of single event effects (SEE) in integrated circuits, 2) a self-refresh scheme to detect and correct bit-flips in random access memory (RAM), and 3) a hardware AI system with dynamic hardware accelerators and AI models for increased flexibility and efficiency. Variations in the physical parameters of a practical implementation, such as the nature of the particle, linear energy transfer and circuit characteristics, may have a large impact on the final simulation accuracy, which significantly increases the complexity and cost of the transistor-level simulation workflow for large-scale circuits. This makes it difficult to conduct SEE simulations for large-scale circuits. Therefore, in the first research work, a new SEE simulation scheme is proposed to offer a fast and cost-efficient method to evaluate and compare the performance of large-scale circuits that are subject to the effects of radiation particles. The advantages of transistor and hardware description language (HDL) simulations are combined here to produce accurate SEE digital error models for rapid error analysis in large-scale circuits. Under the proposed scheme, time-consuming back-end steps are skipped. The SEE analysis for large-scale circuits can be completed in just a few hours.
In high-radiation environments, bit-flips in RAMs not only occur but may also accumulate. However, typical error mitigation methods cannot handle high error rates at low hardware cost. In the second work, an adaptive scheme combining error-correcting codes with refreshing techniques is proposed to correct errors and mitigate error accumulation in extreme radiation environments. The scheme continuously refreshes the data in RAMs so that errors cannot accumulate. Furthermore, because the proposed design can share the same ports with the user module without changing the timing sequence, it can easily be applied to systems whose hardware modules are designed with fixed read and write latency. It is a challenge to implement intelligent systems with constrained hardware resources. In the third work, an adaptive hardware resource management system for multiple AI tasks in harsh environments was designed. Inspired by the “refreshing” concept in the second work, we utilise a key feature of FPGAs, partial reconfiguration, to improve the reliability and efficiency of the AI system. More importantly, this feature provides the capability to manage the hardware resources for deep learning acceleration. In the proposed design, the on-chip hardware resources are dynamically managed to improve the flexibility, performance and power efficiency of deep learning inference systems. The deep learning units provided by Xilinx are used to perform multiple AI tasks simultaneously, and the experiments show significant improvements in power efficiency for a wide range of scenarios with different workloads. To further improve the performance of the system, the concept of reconfiguration was extended, and as a result an adaptive DL software framework was designed. This framework can provide a significant level of adaptability support for various deep learning algorithms on an FPGA-based edge computing platform.
To meet the specific accuracy and latency requirements derived from the running applications and operating environments, the platform may dynamically update hardware and software (e.g., processing pipelines) to achieve better cost, power, and processing efficiency compared to the static system.
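The idea behind the second work, refreshing memory often enough that correctable errors never pile up, can be illustrated with a small software model. The sketch below is not the thesis's design: the choice of a Hamming(7,4) single-error-correcting code, the list standing in for RAM, and the names `encode`, `correct`, `decode` and `scrub` are all assumptions made for illustration.

```python
# Software model of memory scrubbing with a single-error-correcting code.
# Assumption: Hamming(7,4) and all names here are illustrative only and are
# not taken from the thesis.

def encode(nibble):
    """Encode 4 data bits into a 7-bit Hamming codeword (positions 1..7)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    p1 = d[0] ^ d[1] ^ d[3]  # parity over positions 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]  # parity over positions 3, 6, 7
    p4 = d[1] ^ d[2] ^ d[3]  # parity over positions 5, 6, 7
    bits = [p1, p2, d[0], p4, d[1], d[2], d[3]]  # positions 1..7
    return sum(b << i for i, b in enumerate(bits))

def correct(word):
    """Return the codeword with a single flipped bit (if any) repaired."""
    syndrome = 0
    for pos in range(1, 8):
        if (word >> (pos - 1)) & 1:
            syndrome ^= pos  # XOR of set positions is 0 for a valid codeword
    if syndrome:
        word ^= 1 << (syndrome - 1)  # syndrome equals the flipped position
    return word

def decode(word):
    """Extract the 4 data bits from a (corrected) codeword."""
    bits = [(word >> i) & 1 for i in range(7)]
    return bits[2] | (bits[4] << 1) | (bits[5] << 2) | (bits[6] << 3)

def scrub(ram):
    """Read every word, write back the corrected value, count repairs."""
    repaired = 0
    for addr, word in enumerate(ram):
        fixed = correct(word)
        if fixed != word:
            ram[addr] = fixed
            repaired += 1
    return repaired
```

With this code, two bit-flips landing in the same word between scrub passes would be miscorrected, whereas the same two flips separated by a scrub pass are both repaired. That is the sense in which a refresh rate matched to the error rate keeps errors from accumulating.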

    Multiple Objective Co-Optimization of Switched Reluctance Machine Design and Control

    This dissertation includes a review of various motor types, a motivation for selecting the switched reluctance motor (SRM) as a focus of this work, a review of SRM design and control optimization methods in the literature, a proposed co-optimization approach, and empirical evaluations to validate the models and proposed co-optimization methods. The SRM was chosen as a focus of research based on its low cost, easy manufacturability, moderate performance and efficiency, and its potential for improvement through advanced design and control optimization. After a review of SRM design and control optimization methods in the literature, it was found that co-optimization of both SRM design and controls is not common, and key areas for improvement in methods for optimizing SRM design and control were identified. Among other things, this includes the need for computationally efficient transient models with the accuracy of FEA simulations and the need for co-optimization of both machine geometry and control methods throughout the entire operation range with multiple objectives such as torque ripple and efficiency. A modeling and optimization framework with multiple stages is proposed that includes robust transient simulators that use mappings from FEA in order to optimize SRM geometry, windings, and control conditions throughout the entire operation region with multiple objectives. These unique methods include the use of particle swarm optimization to determine current profiles for low to moderate speeds and other optimization methods to determine optimal control conditions throughout the entire operation range with consideration of various characteristics and boundary conditions such as voltage and current constraints. This multi-stage optimization process includes down-selections in two previous stages based on performance and operational characteristics at zero and maximum speed.
Co-optimization of SRM design and control conditions is demonstrated as a final design is selected based on a fitness function evaluating various operational characteristics including torque ripple and efficiency throughout the torque-speed operation range. The final design was scaled, fabricated, and tested to demonstrate the viability of the proposed framework and co-optimization method. Accuracy of the models was confirmed by comparing simulated and empirical results. Test results from operation at various torques and speeds demonstrate the effectiveness of the optimization approach throughout the entire operating range. Furthermore, test results confirm the feasibility of the proposed torque ripple minimization and efficiency maximization control schemes. A key benefit of the overall proposed approach is that a wide range of machine design parameters and control conditions can be swept, and based on the needs of an application, the designer can select the appropriate geometry, winding, and control approach based on various performance functions that consider torque ripple, efficiency, and other metrics.

    Applying Hypervisor-Based Fault Tolerance Techniques to Safety-Critical Embedded Systems

    This document details the work conducted through the development of this thesis, and it is structured as follows:
• Chapter 1, Introduction, briefly presents the motivation, objectives, and contributions of this thesis.
• Chapter 2, Fundamentals, introduces a series of concepts that are necessary to correctly understand the information presented in the rest of the thesis, such as the concepts of virtualization, hypervisors, and software-based fault tolerance. In addition, this chapter includes an exhaustive review and comparison of the different hypervisors used in scientific studies dealing with safety-critical systems, and a brief review of some works that try to improve fault tolerance in the hypervisor itself, an area of research that is outside the scope of this work but that complements the mechanism presented and could be established as a line of future work.
• Chapter 3, Problem Statement and Related Work, explains the main reasons why the concept of Hypervisor-Based Fault Tolerance was born and reviews the main articles and research papers on the subject. This review includes both papers related to safety-critical embedded systems (such as the research carried out in this thesis) and papers related to cloud servers and cluster computing that, although not directly applicable to embedded systems, may raise useful concepts that make our solution more complete or allow us to establish future lines of work.
• Chapter 4, Proposed Solution, begins with a brief comparison of the work presented in Chapter 3 to establish the requirements that our solution must meet in order to be as complete and innovative as possible. It then sets out the architecture of the proposed solution and explains in detail the two main elements of the solution: the Voter and the Health Monitoring partition.
• Chapter 5, Prototype, explains in detail the prototyping of the proposed solution, including the choice of the hypervisor, the processing board, and the critical functionality to be made redundant. With respect to the voter, it includes prototypes for both the software version (the voter is implemented in a virtual machine) and the hardware version (the voter is implemented as IP cores on the FPGA).
• Chapter 6, Evaluation, includes the evaluation of the prototype developed in Chapter 5. As a preliminary step, and given that there is no evidence in this regard, an exercise is carried out to measure the overhead involved in using the XtratuM hypervisor versus not using it. Subsequently, qualitative tests are carried out to check that Health Monitoring is working as expected, and a fault injection campaign is carried out to check the error detection and correction rate of our solution. Finally, a comparison is made between the performance of the hardware and software versions of the Voter.
• Chapter 7, Conclusions and Future Work, collects the conclusions obtained and the contributions made during the research (in the form of articles in journals, conferences and contributions to projects and proposals in the industry). In addition, it establishes some lines of future work that could complete and extend the research carried out during this doctoral thesis.
Doctoral Programme in Computer Science and Technology, Universidad Carlos III de Madrid. President: Katzalin Olcoz Herrero. Secretary: Félix García Carballeira. Member: Santiago Rodríguez de la Fuent
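A voter of the kind named in Chapter 4 can be pictured as a majority vote over the outputs of redundant replicas of the critical function. The following is only a minimal sketch of that general technique, not the thesis's Voter: the function name `vote`, the use of plain Python values as replica outputs, and the convention of returning the indices of disagreeing replicas (which a health monitor could act on) are assumptions.

```python
# Majority voting over redundant replica outputs.
# Assumption: names and the return convention are hypothetical, chosen for
# illustration; the thesis's Voter may work differently.
from collections import Counter

def vote(outputs):
    """Return (agreed_value, faulty_indices) for a list of replica outputs.

    If no value is produced by a strict majority of replicas, the agreed
    value is None and every replica index is reported as suspect.
    """
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        return None, list(range(len(outputs)))
    faulty = [i for i, out in enumerate(outputs) if out != value]
    return value, faulty
```

For three replicas, `vote([7, 7, 9])` masks the single faulty output and flags replica 2, while three mutually different outputs are detected but cannot be corrected.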

    Modelling and Simulation for Power Distribution Grids of 3D Tiled Computing Arrays

    This thesis presents modelling and simulation developments for power distribution grids of 3D tiled computing arrays (TCAs), a novel paradigm for HPC systems, and tests the feasibility of such systems in HPC domains. The exploration of a complex power grid such as those found in the TCA concept requires detailed simulations of systems with hundreds and possibly thousands of modular nodes, each contributing to the collective behaviour of the system. In particular, power, voltage, and current behaviours are critically important observations. To facilitate this investigation, and to test the hypothesis that such systems can feasibly scale, a bespoke simulation platform has been developed and (importantly) validated against hardware prototypes of small systems. A number of systems are simulated, including systems consisting of arrays of ‘balls’. Balls are collections of modular tiles that form a ball-like modular unit and can then themselves be tiled into large-scale systems. Evaluations typically involved simulation of cubic arrays of sizes ranging from 2x2x2 balls up to 10x10x10. Larger systems require extended simulation times. Therefore, models are developed to extrapolate system behaviours for higher orders of systems and to gauge the ultimate scalability of such TCA systems. It is found that systems of 40x40x40 are quite feasible with appropriate configurations. Data connectivity is explored to a lesser degree, but comparisons were made between TCA systems and well-known comparable HPC systems, and it is concluded that TCA systems can be built with comparable data-flow and scalability, and that the electrical and engineering challenges associated with the novelty of 3D tiled systems can be met with practical solutions.

    FPGA-Based Hardware Accelerators for Deep Learning in Mobile Robotics

    The increasing demand for real-time low-power hardware processing systems, endowed with the capacity to perform compute-intensive applications, has accentuated the inadequacy of the conventional architecture of multicore general-purpose processors. In an effort to meet this demand, edge computing hardware accelerators have come to the forefront, notably with regard to deep learning and robotic systems. This thesis explores preeminent hardware accelerators and examines the performance, accuracy, and power consumption of a GPU-based and an FPGA-based platform, both specifically designed for edge computing applications. The experiments were conducted using three deep neural network models, namely AlexNet, GoogLeNet, and ResNet-18, trained to perform binary image classification in a known environment. Our results demonstrate that the FPGA-based platform, a Kria KV260 Vision AI starter kit, exhibited an inference speed up to nine and a half times faster than that of the GPU-based Jetson Nano developer kit. Additionally, the empirical findings of this work showed as much as five times the efficiency of the Jetson Nano in terms of inference speed per watt, with a mere 5.4% drop in accuracy caused by the quantization process required by the FPGA. However, the Jetson Nano showed a 1.6 times faster inference rate than the KV260 with the AlexNet model, and its deployment process proved to be less challenging.

    Vitruvius+: An area-efficient RISC-V decoupled vector coprocessor for high performance computing applications

    The maturity level of RISC-V and the availability of domain-specific instruction set extensions, like vector processing, make RISC-V a good candidate for supporting the integration of specialized hardware in processor cores for the High Performance Computing (HPC) application domain. In this article, we present Vitruvius+, the vector processing acceleration engine that represents the core of vector instruction execution in the HPC challenge that comes within the EuroHPC initiative. It implements the RISC-V vector extension (RVV) 0.7.1 and can be easily connected to a scalar core using the Open Vector Interface standard. Vitruvius+ natively supports long vectors: 256 double precision floating-point elements in a single vector register. It is composed of a set of identical vector pipelines (lanes), each containing a slice of the Vector Register File and functional units (one integer, one floating point). The vector instruction execution scheme is hybrid in-order/out-of-order and is supported by register renaming and arithmetic/memory instruction decoupling. On a stand-alone synthesis, Vitruvius+ reaches a maximum frequency of 1.4 GHz in typical conditions (TT/0.80V/25°C) using GlobalFoundries 22FDX FD-SOI. The silicon implementation has a total area of 1.3 mm² and maximum estimated power of ~920 mW for one instance of Vitruvius+ equipped with eight vector lanes.
This research has received funding from the European High Performance Computing Joint Undertaking (JU) under Framework Partnership Agreement No 800928 (European Processor Initiative) and Specific Grant Agreement No 101036168 (EPI SGA2). The JU receives support from the European Union’s Horizon 2020 research and innovation programme and from Croatia, France, Germany, Greece, Italy, Netherlands, Portugal, Spain, Sweden, and Switzerland. The EPI-SGA2 project, PCI2022-132935, is also co-funded by MCIN/AEI/10.13039/501100011033 and by the European Union NextGenerationEU/PRTR. This work has also been partially supported by the Spanish Ministry of Science and Innovation (PID2019-107255GB-C21/AEI/10.13039/501100011033).
Peer Reviewed. Postprint (author's final draft)

    Fine-grained Haptics: Sensing and Actuating Haptic Primary Colours (force, vibration, and temperature)

    This thesis discusses the development of a multimodal, fine-grained visual-haptic system for teleoperation and robotic applications. This system is primarily composed of two complementary components: an input device known as the HaptiTemp sensor (combines “Haptics” and “Temperature”), which is a novel thermosensitive GelSight-like sensor, and an output device, an untethered multimodal fine-grained haptic glove. The HaptiTemp sensor is a visuotactile sensor that can sense haptic primary colours known as force, vibration, and temperature. It has novel switchable UV markers that can be made visible using UV LEDs. The switchable markers feature is a real novelty of the HaptiTemp because it can be used in the analysis of tactile information from gel deformation without impairing the ability to classify or recognise images. The use of switchable markers in the HaptiTemp sensor is the solution to the trade-off between marker density and capturing high-resolution images using one sensor. The HaptiTemp sensor can measure vibrations by counting the number of blobs or pulses detected per unit time using a blob detection algorithm. For the first time, temperature detection was incorporated into a GelSight-like sensor, making the HaptiTemp sensor a haptic primary colours sensor. The HaptiTemp sensor can also do rapid temperature sensing with a 643 ms response time for the 31°C to 50°C temperature range. This fast temperature response of the HaptiTemp sensor is comparable to the withdrawal reflex response in humans. This is the first time in the robotic community that a sensor can trigger a sensory impulse that mimics a human reflex. The HaptiTemp sensor can also do simultaneous temperature sensing and image classification using a machine vision camera, the OpenMV Cam H7 Plus. This capability of simultaneous sensing and image classification has not been reported or demonstrated by any tactile sensor.
The HaptiTemp sensor can be used in teleoperation because it can communicate or transmit tactile analysis and image classification results using wireless communication. The HaptiTemp sensor is the closest thing to the human skin in tactile sensing, tactile pattern recognition, and rapid temperature response. In order to feel what the HaptiTemp sensor is touching from a distance, a corresponding output device, an untethered multimodal haptic hand wearable, is developed to actuate the haptic primary colours sensed by the HaptiTemp sensor. This wearable can communicate wirelessly and has fine-grained cutaneous feedback to feel the edges or surfaces of the tactile images captured by the HaptiTemp sensor. This untethered multimodal haptic hand wearable has gradient kinesthetic force feedback that can restrict finger movements based on the force estimated by the HaptiTemp sensor. A retractable string from an ID badge holder equipped with mini-servos that control the stiffness of the wire is attached to each fingertip to restrict finger movements. Vibrations detected by the HaptiTemp sensor can be actuated by the tapping motion of the tactile pins or by a buzzing mini-vibration motor. There is also a tiny annular Peltier device, or ThermoElectric Generator (TEG), with a mini-vibration motor, forming thermo-vibro feedback in the palm area that can be activated by a ‘hot’ or ‘cold’ signal from the HaptiTemp sensor. The haptic primary colours can also be embedded in a VR environment that can be actuated by the multimodal hand wearable. A VR application was developed to demonstrate rapid tactile actuation of edges, allowing the user to feel the contours of virtual objects. Collision detection scripts were embedded to activate the corresponding actuator in the multimodal haptic hand wearable whenever the tactile matrix simulator or hand avatar in VR collides with a virtual object. The TEG also gets warm or cold depending on the virtual object the participant has touched.
Tests were conducted to explore virtual objects in 2D and 3D environments using Leap Motion control and a VR headset (Oculus Quest 2). Moreover, fine-grained cutaneous feedback was developed to feel the edges or surfaces of a tactile image, such as the tactile images captured by the HaptiTemp sensor, or actuate tactile patterns in 2D or 3D virtual objects. The prototype is like an exoskeleton glove with 16 tactile actuators (tactors) on each fingertip, 80 tactile pins in total, made from commercially available P20 Braille cells. Each tactor can be controlled individually to enable the user to feel the edges or surfaces of images, such as the high-resolution tactile images captured by the HaptiTemp sensor. This hand wearable can be used to enhance the immersive experience in a virtual reality environment. The tactors can be actuated in a tapping manner, creating a distinct form of vibration feedback as compared to the buzzing vibration produced by a mini-vibration motor. The tactile pin height can also be varied, creating a gradient of pressure on the fingertip. Finally, the integration of the high-resolution HaptiTemp sensor and the untethered multimodal, fine-grained haptic hand wearable is presented, forming a visuotactile system for sensing and actuating haptic primary colours. Force, vibration, and temperature sensing tests with corresponding force, vibration, and temperature actuating tests have demonstrated a unified visual-haptic system. Aside from sensing and actuating haptic primary colours, touching the edges or surfaces of the tactile images captured by the HaptiTemp sensor was carried out using the fine-grained cutaneous feedback of the haptic hand wearable.

    Artificial Intelligence for the Electron Ion Collider (AI4EIC)

    The Electron-Ion Collider (EIC), a state-of-the-art facility for studying the strong force, is expected to begin commissioning its first experiments in 2028. This is an opportune time for artificial intelligence (AI) to be included from the start at this facility and in all phases that lead up to the experiments. The second annual workshop organized by the AI4EIC working group, which recently took place, centered on exploring all current and prospective application areas of AI for the EIC. This workshop is not only beneficial for the EIC, but also provides valuable insights for the newly established ePIC collaboration at EIC. This paper summarizes the different activities and R&D projects covered across the sessions of the workshop and provides an overview of the goals, approaches and strategies regarding AI/ML in the EIC community, as well as cutting-edge techniques currently studied in other experiments.
Comment: 27 pages, 11 figures, AI4EIC workshop, tutorials and hackathon

    Biofuels Production and Processing Technology

    The negative impacts of global warming and global environmental pollution due to fossil fuels mean that the main challenge of modern society is finding alternatives to conventional fuels. In this scenario, biofuels derived from renewable biomass represent the most promising renewable energy sources. Depending on the biomass used by the fermentation technologies, it is possible to obtain first-generation biofuels produced from food crops, second-generation biofuels produced from non-food feedstock, mainly starting from renewable lignocellulosic biomasses, and third-generation biofuels, represented by algae or food waste biomass. Although biofuels appear to be the closest alternative to fossil fuels, it is necessary for them to be produced in competitive quantities and at competitive costs, requiring both improvements to production technologies and the diversification of feedstock. This Special Issue is focused on technological innovations, including the utilization of different feedstocks, with a particular focus on bioethanol production from food waste; different biomass pretreatments; fermentation strategies, such as simultaneous saccharification and fermentation (SSF) or separate hydrolysis and fermentation (SHF); different applied microorganisms used as a monoculture or co-culture; and different setups for biofuel fermentation processes. The manuscripts collected represent a great opportunity for adding new knowledge to the scientific community as well as industry.

    Practical synthesis from real-world oracles

    As software systems become increasingly heterogeneous, the ability of compilers to reason about an entire system has decreased. When components of a system are not implemented as traditional programs, but rather as specialised hardware, optimised architecture-specific libraries, or network services, the compiler is unable to cross these abstraction barriers and analyse the system as a whole. If these components could be modelled or understood as programs, then the compiler would be able to reason about their behaviour without concern for their internal implementation details: a homogeneous view of the entire system would be afforded. However, it is not often the case that such components ever corresponded to an original program. This means that to facilitate this homogeneous analysis, programmatic models of component behaviour must be learned or constructed automatically. Constructing these models is an inductive program synthesis problem, albeit a challenging one that is largely beyond the ability of existing implementations. In order for the problem to be made tractable, information provided by the underlying context (i.e. the real component behaviour to be matched) must be integrated. This thesis presents three program synthesis approaches that integrate contextual information to synthesise programmatic models for real, existing components. The first, Annote, exploits informally-encoded information about a component's interface (e.g. from documentation) by weaving that information into an extended type-and-attribute system for component interfaces. The second, Presyn, learns a pair of cooperating probabilistic models from prior syntheses that aim to predict likely program structure based on a component's interface. Finally, Haze uses observations of common side-effects of component executions to bias the search for programs.
These approaches are each evaluated against comparable synthesisers from the literature, on a set of benchmark problems derived from real components. Learning models for component behaviour is only a partial solution; the compiler must also have some mechanism to use those models for program analysis and transformation. This thesis additionally proposes a novel mechanism for context-sensitive automatic API migration based on synthesised programmatic models, and evaluates the effectiveness of doing so on real application code. In summary, this thesis proposes a new framing for program synthesis problems that target the behaviour of real components, and demonstrates three different potential approaches to synthesis in this spirit. The success of these approaches is evaluated against implementations from the literature, and their results are used to drive a novel API migration technique.