686 research outputs found

    Reliable and Fault-Resilient Schemes for Efficient Radix-4 Complex Division

    Get PDF
    Complex division is commonly used in various applications in signal processing and control theory including astronomy and nonlinear RF measurements. Nevertheless, unless reliability and assurance are embedded into the architectures of such structures, the suboptimal (and thus erroneous) results could undermine the objectives of such applications. As such, in this thesis, we present schemes to provide complex number division architectures based on (Sweeney, Robertson, and Tocher) SRT-division with fault diagnosis mechanisms. Different fault resilient architectures are proposed in this thesis which can be tailored based on the eventual objectives of the designs in terms of area and time requirements, among which we pinpoint carefully the schemes based on recomputing with shifted operands (RESO) to be able to detect both natural and malicious faults and with proper modification achieve high throughputs. The design also implements a minimized look up table approach which favors in error detection based designs and provides high fault coverage with relatively-low overhead. Additionally, to benchmark the effectiveness of the proposed schemes, extensive fault diagnosis assessments are performed for the proposed designs through fault simulations and FPGA implementations; the design is implemented on Xilinx Spartan-VI and Xilinx Virtex-VI FPGA families

    Concepts for on-board satellite image registration. Volume 3: Impact of VLSI/VHSIC on satellite on-board signal processing

    Get PDF
    Anticipated major advances in integrated circuit technology in the near future are described as well as their impact on satellite onboard signal processing systems. Dramatic improvements in chip density, speed, power consumption, and system reliability are expected from very large scale integration. Improvements are expected from very large scale integration enable more intelligence to be placed on remote sensing platforms in space, meeting the goals of NASA's information adaptive system concept, a major component of the NASA End-to-End Data System program. A forecast of VLSI technological advances is presented, including a description of the Defense Department's very high speed integrated circuit program, a seven-year research and development effort

    Reliable Low-Latency and Low-Complexity Viterbi Architectures Benchmarked on ASIC and FPGA

    Get PDF
    The Viterbi algorithm is commonly applied in a number of sensitive usage models including decoding convolutional codes used in communications such as satellite communication, cellular relay, and wireless local area networks. Moreover, the algorithm has been applied to automatic speech recognition and storage devices. In this thesis, efficient error detection schemes for architectures based on low-latency, low-complexity Viterbi decoders are presented. The merit of the proposed schemes is that reliability requirements, overhead tolerance, and performance degradation limits are embedded in the structures and can be adapted accordingly. We also present three variants of recomputing with encoded operands and its modifications to detect both transient and permanent faults, coupled with signature-based schemes. The instrumented decoder architecture has been subjected to extensive error detection assessments through simulations, and application-specific integrated circuit (ASIC) [32nm library] and field-programmable gate array (FPGA) [Xilinx Virtex-6 family] implementations for benchmark. The proposed fine-grained approaches can be utilized based on reliability objectives and performance/implementation metrics degradation tolerance

    Integrating emerging cryptographic engineering research and security education

    Get PDF
    Unlike traditional embedded systems such as secure smart cards, emerging secure deeply embedded systems, e.g., implantable and wearable medical devices, have larger “attack surface”. A security breach in such systems which are embedded deeply in human bodies or objects would be life-threatening, for which adopting traditional solutions might not be practical due to tight constraints of these often-battery-powered systems. Unfortunately, although emerging cryptographic engineering research mechanisms have started solving this critical problem, university education (at both graduate and undergraduate level) lags comparably. One of the pivotal reasons for such a lag is the multi-disciplinary nature of the emerging security bottlenecks (mathematics, engineering, science, and medicine, to name a few). Based on the aforementioned motivation, in this paper, we present an effective research and education integration strategy to overcome this issue at Rochester Institute of Technology. Moreover, we present the results of more than one year implementation of the presented strategy at graduate level through “side-channel analysis attacks” case studies. The results of the presented work show the success of the presented methodology while pinpointing the challenges encountered compared to traditional embedded system security research/teaching integration

    Education and Research Integration of Emerging Multidisciplinary Medical Devices Security

    Get PDF
    Traditional embedded systems such as secure smart cards and nano-sensor networks have been utilized in various usage models. Nevertheless, emerging secure deeply-embedded systems, e.g., implantable and wearable medical devices, have comparably larger “attack surface”. Specifically, with respect to medical devices, a security breach can be life-threatening (for which adopting traditional solutions might not be practical due to tight constraints of these often-battery-powered systems), and unlike traditional embedded systems, it is not only a matter of financial loss. Unfortunately, although emerging cryptographic engineering research mechanisms for such deeply-embedded systems have started solving this critical, vital problem, university education (at both graduate and undergraduate level) lags comparably. One of the pivotal reasons for such a lag is the multi-disciplinary nature of the emerging security bottlenecks. Based on the aforementioned motivation, in this work, at Rochester Institute of Technology, we present an effective research and education integration strategy to overcome this issue in one of the most critical deeply-embedded systems, i.e., medical devices. Moreover, we present the results of two years of implementation of the presented strategy at graduate-level through fault analysis attacks, a variant of side-channel attacks. We note that the authors also supervise an undergraduate student and the outcome of the presented work has been assessed for that student as well; however, the emphasis is on graduate-level integration. The results of the presented work show the success of the presented methodology while pinpointing the challenges encountered compared to traditional embedded system security research/teaching integration of medical devices security. We would like to emphasize that our integration approaches are general and scalable to other critical infrastructures as well

    Design of ALU and Cache Memory for an 8 bit ALU

    Get PDF
    The design of an ALU and a Cache memory for use in a high performance processor was examined in this thesis. Advanced architectures employing increased parallelism were analyzed to minimize the number of execution cycles needed for 8 bit integer arithmetic operations. In addition to the arithmetic unit, an optimized SRAM memory cell was designed to be used as cache memory and as fast Look Up Table. The ALU consists of stand alone units for bit parallel computation of basic integer arithmetic operations. Addition and subtraction were performed using Kogge Stone parallel prefix hardware operating at 330MHz. A high performance multiplier was built using Radix 4 Modified Booth Encoder (MBE) and a Wallace Tree summation array. The multiplier requires single clock cycle for 8 bit integer multiplication and operates at a maximum frequency of 100MHz. Multiplicative division hardware was built for executing both integer division and square root. The division hardware computes 8-bit division and square root in 4 clock cycles. Multiplier forms the basic building block of all these functional units, making high level of resource sharing feasible with this architecture. The optimal operating frequency for the arithmetic unit is 70MHz. A 6T CMOS SRAM cell measuring 90 µm2 was designed using minimum size transistors. The layout allows for horizontal overlap resulting in effective area of 76 µm2 for an 8x8 array. By substituting equivalent bit line capacitance of P4 L1 Cache, the memory was simulated to have a read time of 3.27ns. An optimized set of test vectors were identified to enable high fault coverage without the need for any additional test circuitry. Sixteen test cases were identified that would toggle all the nodes and provide all possible inputs to the sub units of the multiplier. A correlation based semi automatic method was investigated to facilitate test case identification for large multipliers. This method of testability eliminates performance and area overhead associated with conventional testability hardware. Bottom up design methodology was employed for the design. The performance and area metrics are presented along with estimated power consumption. A set of Monte Carlo analysis was carried out to ensure the dependability of the design under process variations as well as fluctuations in operating conditions. The arithmetic unit was found to require a total die area of 2mm2 (approx.) in 0.35 micron process

    Multidisciplinary Approaches and Challenges in Integrating Emerging Medical Devices Security Research and Education

    Get PDF
    Traditional embedded systems such as secure smart cards and nano-sensor networks have been utilized in various usage models. Nevertheless, emerging secure deeply-embedded systems, e.g., implantable and wearable medical devices, have comparably larger “attack surface”. Specifically, with respect to medical devices, a security breach can be life-threatening (for which adopting traditional solutions might not be practical due to tight constraints of these often-battery-powered systems), and unlike traditional embedded systems, it is not only a matter of financial loss. Unfortunately, although emerging cryptographic engineering research mechanisms for such deeply-embedded systems have started solving this critical, vital problem, university education (at both graduate and undergraduate level) lags comparably. One of the pivotal reasons for such a lag is the multi-disciplinary nature of the emerging security bottlenecks. Based on the aforementioned motivation, in this work, at Rochester Institute of Technology, we present an effective research and education integration strategy to overcome this issue in one of the most critical deeply-embedded systems, i.e., medical devices. Moreover, we present the results of two years of implementation of the presented strategy at graduate-level through fault analysis attacks, a variant of side-channel attacks. We note that the authors also supervise an undergraduate student and the outcome of the presented work has been assessed for that student as well; however, the emphasis is on graduate-level integration. The results of the presented work show the success of the presented methodology while pinpointing the challenges encountered compared to traditional embedded system security research/teaching integration of medical devices security. We would like to emphasize that our integration approaches are general and scalable to other critical infrastructures as well

    Driving the Network-on-Chip Revolution to Remove the Interconnect Bottleneck in Nanoscale Multi-Processor Systems-on-Chip

    Get PDF
    The sustained demand for faster, more powerful chips has been met by the availability of chip manufacturing processes allowing for the integration of increasing numbers of computation units onto a single die. The resulting outcome, especially in the embedded domain, has often been called SYSTEM-ON-CHIP (SoC) or MULTI-PROCESSOR SYSTEM-ON-CHIP (MP-SoC). MPSoC design brings to the foreground a large number of challenges, one of the most prominent of which is the design of the chip interconnection. With a number of on-chip blocks presently ranging in the tens, and quickly approaching the hundreds, the novel issue of how to best provide on-chip communication resources is clearly felt. NETWORKS-ON-CHIPS (NoCs) are the most comprehensive and scalable answer to this design concern. By bringing large-scale networking concepts to the on-chip domain, they guarantee a structured answer to present and future communication requirements. The point-to-point connection and packet switching paradigms they involve are also of great help in minimizing wiring overhead and physical routing issues. However, as with any technology of recent inception, NoC design is still an evolving discipline. Several main areas of interest require deep investigation for NoCs to become viable solutions: • The design of the NoC architecture needs to strike the best tradeoff among performance, features and the tight area and power constraints of the onchip domain. • Simulation and verification infrastructure must be put in place to explore, validate and optimize the NoC performance. • NoCs offer a huge design space, thanks to their extreme customizability in terms of topology and architectural parameters. Design tools are needed to prune this space and pick the best solutions. • Even more so given their global, distributed nature, it is essential to evaluate the physical implementation of NoCs to evaluate their suitability for next-generation designs and their area and power costs. This dissertation performs a design space exploration of network-on-chip architectures, in order to point-out the trade-offs associated with the design of each individual network building blocks and with the design of network topology overall. The design space exploration is preceded by a comparative analysis of state-of-the-art interconnect fabrics with themselves and with early networkon- chip prototypes. The ultimate objective is to point out the key advantages that NoC realizations provide with respect to state-of-the-art communication infrastructures and to point out the challenges that lie ahead in order to make this new interconnect technology come true. Among these latter, technologyrelated challenges are emerging that call for dedicated design techniques at all levels of the design hierarchy. In particular, leakage power dissipation, containment of process variations and of their effects. The achievement of the above objectives was enabled by means of a NoC simulation environment for cycleaccurate modelling and simulation and by means of a back-end facility for the study of NoC physical implementation effects. Overall, all the results provided by this work have been validated on actual silicon layout

    A Study on Efficient Designs of Approximate Arithmetic Circuits

    Get PDF
    Approximate computing is a popular field where accuracy is traded with energy. It can benefit applications such as multimedia, mobile computing and machine learning which are inherently error resilient. Error introduced in these applications to a certain degree is beyond human perception. This flexibility can be exploited to design area, delay and power efficient architectures. However, care must be taken on how approximation compromises the correctness of results. This research work aims to provide approximate hardware architectures with error metrics and design metrics analyzed and their effects in image processing applications. Firstly, we study and propose unsigned array multipliers based on probability statistics and with approximate 4-2 compressors, full adders and half adders. This work deals with a new design approach for approximation of multipliers. The partial products of the multiplier are altered to introduce varying probability terms. Logic complexity of approximation is varied for the accumulation of altered partial products based on their probability. The proposed approximation is utilized in two variants of 16-bit multipliers. Synthesis results reveal that two proposed multipliers achieve power savings of 72% and 38% respectively compared to an exact multiplier. They have better precision when compared to existing approximate multipliers. Mean relative error distance (MRED) figures are as low as 7.6% and 0.02% for the proposed approximate multipliers, which are better than the previous state-of-the-art works. Performance of the proposed multipliers is evaluated with geometric mean filtering application, where one of the proposed models achieves the highest peak signal to noise ratio (PSNR). Second, approximation is proposed for signed Booth multiplication. Approximation is introduced in partial product generation and partial product accumulation circuits. In this work, three multipliers (ABM-M1, ABM-M2, and ABM-M3) are proposed in which the modified Booth algorithm is approximated. In all three designs, approximate Booth partial product generators are designed with different variations of approximation. The approximations are performed by reducing the logic complexity of the Booth partial product generator, and the accumulation of partial products is slightly modified to improve circuit performance. Compared to the exact Booth multiplier, ABM-M1 achieves up to 15% reduction in power consumption with an MRED value of 7.9 Ă— 10-4. ABM-M2 has power savings of up to 60% with an MRED of 1.1 Ă— 10-1. ABM-M3 has power savings of up to 50% with an MRED of 3.4 Ă— 10-3. Compared to existing approximate Booth multipliers, the proposed multipliers ABM-M1 and ABM-M3 achieve up to a 41% reduction in power consumption while exhibiting very similar error metrics. Image multiplication and matrix multiplication are used as case studies to illustrate the high performance of the proposed approximate multipliers. Third, distributed arithmetic based sum of products units approximation is analyzed. Sum of products units are key elements in many digital signal processing applications. Three approximate sum of products models which are based on distributed arithmetic are proposed. They are designed for different levels of accuracy. First model of approximate sum of products achieves an improvement up to 64% on area and 70% on power, when compared to conventional unit. Other two models provide an improvement of 32% and 48% on area and 54% and 58% on power, respectively, with a reduced error rate compared to the first model. Third model achieves MRED and normalized mean error distance (NMED) as low as 0.05% and 0.009%. Performance of approximate units is evaluated with a noisy image smoothing application, where the proposed models are capable of achieving higher PSNR than existing state of the art techniques. Fourth, approximation is applied in division architecture. Two approximation models are proposed for restoring divider. In the first design, approximation is performed at circuit level, where approximate divider cells are utilized in place of exact ones by simplifying the logic equations. In the second model, restoring divider is analyzed strategically and number of restoring divider cells are reduced by finding the portions of divisor and dividend with significant information. An approximation factor pp is used in both designs. In model 1, the design with p=8 has a 58% reduction in both area and power consumption compared to exact design, with a Q-MRED of 1.909 Ă— 10-2 and Q-NMED of 0.449 Ă— 10-2. The second model with an approximation factor p=4 has 54% area savings and 62% power savings compared to exact design. The proposed models are found to have better error metrics compared to existing designs, with better performance at similar error values. A change detection image processing application is used for real time assessment of proposed and existing approximate dividers and one of the models achieves a PSNR of 54.27 dB
    • …
    corecore