
    Embedded System Optimization of Radar Post-processing in an ARM CPU Core

    Algorithms executed on the radar processor system contribute significantly to the performance bottleneck of the overall radar system. One key performance concern is the latency of target detection in hard-deadline systems. Research has shown software optimization to be a major contributor to radar system performance improvements. This thesis applies both manual and automatic software optimization on an ARM processor system and analyzes the results to inform future decisions. To assess the optimized implementation, the question put forward was whether the algorithms on the ARM processor could handle a 6-antenna configuration without a decline in performance; the answer also helps project how many additional algorithms can be added before performance degrades. The manual optimization was based on a quantitative analysis of software execution time. It used a vectorization strategy built on the NEON vector registers of the ARM CPU to reimplement the initial Constant False Alarm Rate (CFAR) detection algorithm; a further optimization eliminated redundant loops over the range gates and Doppler filters. To determine the best compiler for automatic optimization of the radar algorithms on the ARM processor, both GCC and Clang were used to compile the initial and the optimized implementations of the radar post-processing stage. Analysis of the results showed that the radar post-processing algorithms can run on the ARM processor in the 6-antenna configuration without stressing the system load, with an excellent headroom margin under the defined scenario. The analysis further revealed that the effect of dynamic memory allocation cannot be underrated where performance is a significant concern, and that the GCC and Clang compilers each have their strengths and weaknesses. One limiting factor of the NEON-based optimization is the effect of sample size: although it fits the test samples of the defined scenario, varying window cell sizes may produce varying results that do not necessarily improve the timing constraints
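
    As a concrete illustration of the detection scheme named above, here is a minimal sketch of a one-dimensional cell-averaging CFAR detector in C. The window sizes, scaling factor, and function name are illustrative assumptions rather than the thesis's actual implementation; the two training-window accumulation loops are exactly the kind of inner loops a NEON vectorization would target.

        /* Minimal cell-averaging CFAR sketch (illustrative parameters, not
         * the thesis implementation). For each cell under test (CUT), the
         * noise level is estimated from `train` cells on each side,
         * skipping `guard` cells; a detection is declared when the CUT
         * exceeds alpha times the noise estimate. */
        static void ca_cfar(const float *power, int n, int guard, int train,
                            float alpha, unsigned char *detections)
        {
            for (int cut = guard + train; cut < n - guard - train; ++cut) {
                float noise = 0.0f;
                /* Leading and trailing training windows: these sums are the
                 * natural target for NEON vector instructions. */
                for (int i = cut - guard - train; i < cut - guard; ++i)
                    noise += power[i];
                for (int i = cut + guard + 1; i <= cut + guard + train; ++i)
                    noise += power[i];
                noise /= (float)(2 * train);
                detections[cut] = (power[cut] > alpha * noise);
            }
        }

    Because each cell under test is independent, the loop can also be unrolled across adjacent cells, which is one reason the abstract's observation about window cell size matters: the vector register width fixes how many samples fit per iteration.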

    A comprehensive approach to MPSoC security: achieving network-on-chip security: a hierarchical, multi-agent approach

    Multiprocessor Systems-on-Chip (MPSoCs) are pervading our lives, acquiring ever-increasing relevance in a large number of applications, including even safety-critical ones. MPSoCs are becoming increasingly complex and heterogeneous; the Network-on-Chip (NoC) paradigm has been introduced to support scalable on-chip communication, in some cases even with reconfigurability support. The increased complexity as well as the networking approach in turn make security aspects more critical. In this work we propose and implement a hierarchical multi-agent approach providing solutions to secure NoC-based MPSoCs at different levels of design. We develop a flexible, scalable and modular structure that integrates protection of different elements in the MPSoC (e.g. memory, processors) against different attack scenarios. Rather than focusing on protection strategies devised for an individual attack or a particular core, this work aims at providing a comprehensive, system-level protection strategy: this constitutes its main methodological contribution. We prove the feasibility of the concepts via a prototype realization in FPGA technology

    A survey on run-time power monitors at the edge

    Effectively managing energy and power consumption is crucial to the success of any computing system design, helping mitigate the efficiency obstacles that come with system downsizing while also being a valuable step towards green and sustainable computing. The quality of energy and power management is strongly affected by the prompt availability of reliable and accurate information on the power consumption of the different parts composing the monitored system. At the same time, effective energy and power management is even more critical for devices at the edge, which have proliferated over the past decade with the digital revolution brought by the Internet of Things. This manuscript provides a comprehensive conceptual framework for classifying the approaches to implementing run-time power monitors for edge devices that have appeared in the literature, leading the reader toward the solutions that best fit their application needs and the requirements and constraints of their target computing platforms. Run-time power monitors at the edge are analyzed according to both power modeling and monitoring implementation aspects, with specific quality metrics identified for each, creating a consistent and detailed taxonomy that encompasses the vast existing literature and provides a sound reference for the interested reader
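
    Many of the software power monitors the survey classifies reduce to a linear model over activity counters. The sketch below shows this generic form in C; the counter set, the weights, and the static term are hypothetical placeholders that a real monitor would fit offline against a reference power meter.

        /* Generic linear power model: P = P_static + sum_i w[i]*activity[i].
         * Counters, weights, and P_static are hypothetical; in practice the
         * weights are calibrated against reference power measurements. */
        #define N_COUNTERS 4

        static double estimate_power(const double w[N_COUNTERS],
                                     double p_static,
                                     const unsigned long long activity[N_COUNTERS])
        {
            double p = p_static;
            for (int i = 0; i < N_COUNTERS; ++i)
                p += w[i] * (double)activity[i]; /* e.g. cycles, cache misses */
            return p; /* estimated power in watts */
        }

    The taxonomy's quality metrics (accuracy, overhead, responsiveness) then largely reflect which counters are sampled, how often, and where the model is evaluated.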

    Aqueous organic redox-flow-batteries: from electrolyte development to detailed stability studies

    To reach the political aims of the Kyoto Protocol and the Paris Agreement, energy production must shift from fossil fuels to sustainable sources like water, sun, and wind. But the fluctuating output of these sources must be compensated by a smart grid, in which the integration of large-scale energy storage systems is inevitable. The possibility of using cheap organic molecules in “green”, non-toxic electrolytes makes the emerging technology of organic redox flow batteries a promising candidate for this mission. Nevertheless, this battery concept suffers from many problems, one of the major ones being the insufficient stability of the active materials. Overcoming this technical teething trouble requires a combination of material and sensor development to investigate and mitigate decomposition processes. Currently, however, research focuses more on developing new redox-active molecules with intrinsically higher stabilities than on investigating, understanding, and improving the underlying processes. Apart from some exceptions, most research groups concentrate on demonstrating stability by cycling the material as often as possible. In addition, the influence of the used state of charge (SOC) of the battery is mostly uninvestigated; a battery management system that limits the used SOC could improve the device lifespan. Therefore, precise, fast, and cheap monitoring systems for SOC and state-of-health (SOH) determination are needed. The commonly applied techniques suffer from major drawbacks, such as expensive equipment, material limitations, temperature dependency, and the need for (re)calibration, to name only a few. A protocol capable of monitoring these parameters at the electrolyte level rather than the battery level could unlock a deeper understanding of the ongoing processes inside the device or electrolyte. Device lifespan improvements could then be addressed not only through the material itself but also through the cycling conditions or other external factors
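
    For context, the simplest of the commonly applied SOC monitoring techniques alluded to above is coulomb counting, which integrates the cell current over time. The sketch below illustrates it in C under assumed names and units; it is not the protocol developed in this work, and its drift and recalibration needs are precisely the drawbacks the abstract mentions.

        /* Coulomb counting: SOC(t) = SOC(0) + (1/Q) * integral of I dt.
         * `capacity_c` is usable capacity in coulombs; names and units are
         * illustrative, not this work's monitoring protocol. */
        static double update_soc(double soc, double current_a, double dt_s,
                                 double capacity_c)
        {
            soc += (current_a * dt_s) / capacity_c; /* charging positive */
            if (soc > 1.0) soc = 1.0;               /* clamp to [0, 1]   */
            if (soc < 0.0) soc = 0.0;
            return soc;
        }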

    Hardware acceleration using FPGAs for adaptive radiotherapy

    Adaptive radiotherapy (ART) seeks to improve the accuracy of radiotherapy by adapting the treatment based on up-to-date images of the patient's anatomy captured at the time of treatment delivery. The amount of image data, combined with the clinical time requirements for ART, necessitates automatic image analysis to adapt the treatment plan. Currently, the computational effort of the image processing and plan adaptation means they cannot be completed in a clinically acceptable timeframe. This thesis investigates the use of hardware acceleration on Field Programmable Gate Arrays (FPGAs) to accelerate algorithms for segmenting bony anatomy in Computed Tomography (CT) scans, to reduce the plan adaptation time for ART. An assessment was made of the overhead incurred by transferring image data to an FPGA-based hardware accelerator using the industry-standard DICOM protocol over an Ethernet connection. The transfer rate was found likely to limit the performance of hardware accelerators for ART, highlighting the need for an alternative method of integrating hardware accelerators with existing radiotherapy equipment. A clinically validated segmentation algorithm was adapted for implementation in hardware and shown to process three-dimensional CT images up to 13.81 times faster than the original software implementation, with strong agreement between the segmentations produced by the two implementations. Modifications to the hardware implementation were proposed for segmenting four-dimensional CT scans; this version processed image volumes 14.96 times faster than the original software implementation, and the segmentations produced by the two implementations showed strong agreement in most cases. A second, novel method for segmenting four-dimensional CT data was also proposed. Its hardware implementation executed 1.95 times faster than the software implementation; however, the algorithm was found to be unsuitable for the global segmentation task examined here, although it may be suitable as a refining segmentation in the context of a larger ART algorithm
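
    As a point of reference for the kind of per-voxel work being moved into FPGA logic, the sketch below shows the simplest conceivable bone segmentation of a CT volume in C: a fixed Hounsfield-unit threshold. The threshold value and the function are illustrative assumptions, not the clinically validated algorithm adapted in the thesis.

        #include <stdint.h>
        #include <stddef.h>

        /* Naive bone segmentation by Hounsfield-unit threshold. Cortical
         * bone typically sits well above ~300 HU; the exact cutoff is an
         * illustrative assumption, not the thesis's validated algorithm. */
        static void segment_bone(const int16_t *hu, uint8_t *mask,
                                 size_t n_voxels, int16_t threshold_hu)
        {
            for (size_t i = 0; i < n_voxels; ++i)
                mask[i] = (hu[i] >= threshold_hu) ? 1 : 0;
            /* Every voxel is independent, which is why this class of
             * algorithm maps so well onto a deeply pipelined FPGA datapath. */
        }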

    Black-, grey-, and white-box side-channel programming for software integrity checking

    Doctor of Philosophy, Department of Computing and Information Sciences, Eugene Vasserman
    Checking software integrity is a fundamental problem of system security. Many approaches have been proposed to enforce that a device runs the original code. Software-based methods such as hypervisors, separation kernels, and control-flow integrity checking often rely on processors to provide some form of separation, such as operation modes and memory protection. Hardware-based methods such as remote attestation, secure boot, and watchdog coprocessors rely on trusted hardware to execute attestation code, for example verifying memory content or examining signatures appearing on buses. However, many embedded systems do not possess such sophisticated capabilities due to prohibitive hardware costs, unacceptably high power consumption, or the inability to update fielded components. Further, security assumptions may become invalid over time. For Systems-on-Chip (SoCs) in particular, internal activities cannot be observed directly, while in non-SoCs, sniffing bus traffic between constituent components may suffice for integrity checking. A promising approach to checking software integrity for resource-constrained SoCs is through side-channels. Side-channels have been used mostly for attacks, such as eavesdropping from the vibration of glass or plant leaves, fingerprinting machines from traffic patterns, or extracting secret key material from cryptographic routines using power consumption measurements. In this work, side-channels are used to enhance rather than undercut security. First, we study the relationships between the internal states of a target device and side-channel information. We use the uncovered relationships to monitor the internal state of a running device and determine whether the internal state is an expected one. An unexpected state may be a sign of incorrect execution or malicious activity. To further explore the possibilities inherent in side-channel-based software integrity checking, we investigate various hardware platforms representative of different degrees of knowledge of the hardware from the side-channel profiling point of view. In other words, side-channel information is extracted by black-, grey-, and white-box analysis. Each involves unique challenges requiring different techniques to successfully derive “side-channel profiles”. We can use these profiles to detect unexpected states with extremely high probability, even when an adversary knows that their code may be subject to side-channel analysis, i.e., the methodology is robust to side-channel-aware adversaries. The research includes: (1) constructing systematic approaches for black- and grey-box profiling of side channels (and comparing them to white-box analysis); (2) designing custom measurement instrumentation; and (3) developing techniques for monitoring and enforcing software integrity utilizing side-channel profiles. We introduce the term “side-channel programming” to refer to techniques we design in which developers explicitly utilize side-channel characteristics of existing hardware to optimize run-time software integrity checking, creating executable code that is more conducive to side-channel-based monitoring. Compared with other software integrity checking techniques, our approach has numerous benefits. Among them are that the measurement process is non-invasive, non-interruptive, and backward-compatible in that it requires no hardware modification, meaning our approach works with processors that do not include security features. Our method can even be used to augment existing protection mechanisms, as it works even when all security mechanisms internal to the device fail
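
    One concrete way to check a captured trace against a stored side-channel profile is a normalized correlation with a decision threshold, as sketched below in C. The trace format, the threshold, and the idea of one template per expected state are assumptions for illustration, not the profiling methodology of the dissertation (which must also handle alignment and black-/grey-/white-box differences).

        #include <math.h>

        /* Compare a measured side-channel trace against a stored template
         * using Pearson correlation; a value below `threshold` flags an
         * unexpected internal state. Trace capture and alignment are
         * assumed to have happened elsewhere. */
        static int state_matches(const double *trace, const double *tmpl,
                                 int n, double threshold)
        {
            double mt = 0.0, mm = 0.0;
            for (int i = 0; i < n; ++i) { mt += trace[i]; mm += tmpl[i]; }
            mt /= n; mm /= n;

            double num = 0.0, dt = 0.0, dm = 0.0;
            for (int i = 0; i < n; ++i) {
                num += (trace[i] - mt) * (tmpl[i] - mm);
                dt  += (trace[i] - mt) * (trace[i] - mt);
                dm  += (tmpl[i] - mm) * (tmpl[i] - mm);
            }
            double r = num / (sqrt(dt) * sqrt(dm) + 1e-12);
            return r >= threshold; /* 1 = consistent with expected state */
        }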

    Circuits and Systems Advances in Near Threshold Computing

    Modern society is witnessing a sea change in ubiquitous computing, in which people have embraced computing systems as an indispensable part of day-to-day existence. Computation, storage, and communication abilities of smartphones, for example, have undergone monumental changes over the past decade. At the same time, global emphasis on creating and sustaining green environments is leading to a rapid and ongoing proliferation of edge computing systems and applications. As a broad spectrum of healthcare, home, and transport applications shift to the edge of the network, near-threshold computing (NTC) is emerging as one of the promising low-power computing platforms. An NTC device sets its supply voltage close to its threshold voltage, dramatically reducing the energy consumption. Despite showing substantial promise in terms of energy efficiency, NTC is yet to see wide-scale commercial adoption. This is because circuits and systems operating with NTC suffer from several problems, including increased sensitivity to process variation, reliability problems, performance degradation, and security vulnerabilities. To realize its potential, we need designs, techniques, and solutions to overcome these challenges associated with NTC circuits and systems. The readers of this book will be able to familiarize themselves with recent advances in electronics systems, focusing on near-threshold computing
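
    The energy argument behind NTC can be made concrete with the standard first-order model of dynamic switching energy, where α is the activity factor and C_eff the effective switched capacitance; the voltages below are an illustrative example, not figures from the book:

        E_{\mathrm{dyn}} \approx \alpha\, C_{\mathrm{eff}}\, V_{dd}^{2}
        \quad\Longrightarrow\quad
        \frac{E_{\mathrm{dyn}}(0.5\,\mathrm{V})}{E_{\mathrm{dyn}}(1.0\,\mathrm{V})}
            = \left(\frac{0.5}{1.0}\right)^{2} = 0.25

    In practice, leakage energy and the longer cycle time at low voltage claw back part of this factor-of-four saving, which is why the minimum-energy operating point sits near, rather than below, the threshold voltage.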

    A Methodology for Implementing RF BiSTs in Production Testing to Replace RF Conventional Tests

    Production testing of Radio Frequency (RF) devices is challenging due to the complex nature of the tests that must be performed to verify functionality. This dissertation details a methodology to replace complex and expensive RF functional tests with defect-oriented Built-in Self-Tests (BiSTs). If a design has sufficient margin to its RF specifications, RF tests can be replaced with structural tests using a new data analysis technique, presented here, called quadrant analysis. A Texas Instruments 65 nm RF SoC with a Bluetooth and an FM core was used as a case study; data and results from the analysis of over one million production units of this System-on-Chip (SoC) are presented, and the BiST techniques used, as well as the defect models behind them, are discussed. The scenario in which a design does not have sufficient margin to specification is also covered; the data analysis required in that case is a regression analysis, and data from such an analysis is shown. The results prove that it is possible to replace expensive conventional RF tests with structural tests, enabled by modern RFCMOS process technology and design advances such as the Digital Radio Processor (DRP™) technology. The Defective Parts Per Million (DPPM) impact of making this replacement is 27 units, which is acceptable for RFCMOS high-volume products. Finally, data showing a test cost reduction of about 38% from the elimination of conventional RF tests is presented
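
    The quadrant analysis mentioned above can be read as cross-tabulating, per unit, the pass/fail outcome of the conventional RF test against that of the BiST. The sketch below shows that bookkeeping in C under this reading; the quadrant labels are my own shorthand rather than terminology confirmed by the dissertation.

        /* Cross-tabulate conventional RF test vs. BiST outcomes per unit.
         * Labels are illustrative: an "escape" (RF fail, BiST pass) is a
         * defective unit the BiST alone would ship, so escapes drive the
         * DPPM impact of dropping the RF test; "overkill" is yield loss. */
        struct quadrants { long both_pass, both_fail, escapes, overkill; };

        static void tally(struct quadrants *q, const int *rf_pass,
                          const int *bist_pass, long n_units)
        {
            for (long i = 0; i < n_units; ++i) {
                if (rf_pass[i] && bist_pass[i])        q->both_pass++;
                else if (!rf_pass[i] && !bist_pass[i]) q->both_fail++;
                else if (!rf_pass[i] && bist_pass[i])  q->escapes++;
                else                                   q->overkill++;
            }
        }

        /* DPPM contribution of escapes if the RF test is eliminated. */
        static double escape_dppm(const struct quadrants *q, long n_units)
        {
            return 1e6 * (double)q->escapes / (double)n_units;
        }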

    High-level services for networks-on-chip

    Future technology trends envision that next-generation Multiprocessor Systems-on-Chip (MPSoCs) will combine a large number of processing and storage elements interconnected by complex communication architectures. Communication and interconnection between these basic blocks play a crucially important role as the number of elements increases, so enabling reliable communication channels between cores becomes a challenge for system designers. Networks-on-Chip (NoCs) appeared as a strategy for connecting and managing the communication between the many design elements and IP blocks required in complex Systems-on-Chip (SoCs). The topic can be considered a multidisciplinary synthesis of the multiprocessing, parallel computing, networking, and on-chip communication domains. Beyond standard communication services, Networks-on-Chip can be employed to support the implementation of system-level services. This dissertation demonstrates how high-level services can be added to an MPSoC platform by embedding appropriate hardware/software support in the network interfaces (NIs) of the NoC. It proposes and evaluates innovative modules that act in parallel with protocol translation and data transmission in the NIs, supporting the execution of high-level services in the NoC at a relatively low cost in terms of area and energy consumption. Three types of services are addressed and discussed: security, monitoring, and fault tolerance. With respect to security, the dissertation discusses an innovative data protection mechanism for detecting and preventing illegal accesses to protected memory blocks and/or memory-mapped peripherals. The second aspect is addressed by a monitoring system based on programmable multipurpose probes aimed at detecting NoC internal events and run-time characteristics. Finally, new architectural solutions for the design of fault-tolerant network interfaces are presented and discussed
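
    A minimal model of the data-protection service described above is an address-range firewall checked in the network interface before a request is injected into the NoC. The table layout, the initiator-ID bitmask, and the function below are illustrative assumptions about such a mechanism, not the dissertation's actual design.

        #include <stdint.h>

        /* Toy NI firewall: each rule protects an address range and encodes
         * which initiators may access it as a bitmask. Layout and semantics
         * are illustrative assumptions, not the dissertation's design. */
        struct fw_rule {
            uint32_t base, limit;  /* protected range [base, limit]        */
            uint32_t allowed_mask; /* bit i set => initiator i may access  */
        };

        /* Returns 1 if the transaction may proceed, 0 if it is blocked. */
        static int fw_check(const struct fw_rule *rules, int n_rules,
                            uint32_t addr, unsigned initiator_id)
        {
            for (int i = 0; i < n_rules; ++i) {
                if (addr >= rules[i].base && addr <= rules[i].limit)
                    return (rules[i].allowed_mask >> initiator_id) & 1u;
            }
            return 1; /* addresses outside protected ranges are unrestricted */
        }

    Placing the check in the NI, rather than in each core, is what lets such a service run in parallel with protocol translation at low area and energy cost.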