Search CORE

559 research outputs found

Analytical Modeling of High Performance Reconfigurable Computers: Prediction and Analysis of System Performance.

Author: Smith Melissa C.
Publication venue: TRACE: Tennessee Research and Creative Exchange
Publication date: 01/01/2002
Field of study

The use of a network of shared, heterogeneous workstations each harboring a Reconfigurable Computing (RC) system offers high performance users an inexpensive platform for a wide range of computationally demanding problems. However, effectively using the full potential of these systems can be challenging without the knowledge of the system’s performance characteristics. While some performance models exist for shared, heterogeneous workstations, none thus far account for the addition of Reconfigurable Computing systems. This dissertation develops and validates an analytic performance modeling methodology for a class of fork-join algorithms executing on a High Performance Reconfigurable Computing (HPRC) platform. The model includes the effects of the reconfigurable device, application load imbalance, background user load, basic message passing communication, and processor heterogeneity. Three fork-join class of applications, a Boolean Satisfiability Solver, a Matrix-Vector Multiplication algorithm, and an Advanced Encryption Standard algorithm are used to validate the model with homogeneous and simulated heterogeneous workstations. A synthetic load is used to validate the model under various loading conditions including simulating heterogeneity by making some workstations appear slower than others by the use of background loading. The performance modeling methodology proves to be accurate in characterizing the effects of reconfigurable devices, application load imbalance, background user load and heterogeneity for applications running on shared, homogeneous and heterogeneous HPRC resources. The model error in all cases was found to be less than five percent for application runtimes greater than thirty seconds and less than fifteen percent for runtimes less than thirty seconds. The performance modeling methodology enables us to characterize applications running on shared HPRC resources. Cost functions are used to impose system usage policies and the results of vii the modeling methodology are utilized to find the optimal (or near-optimal) set of workstations to use for a given application. The usage policies investigated include determining the computational costs for the workstations and balancing the priority of the background user load with the parallel application. The applications studied fall within the Master-Worker paradigm and are well suited for a grid computing approach. A method for using NetSolve, a grid middleware, with the model and cost functions is introduced whereby users can produce optimal workstation sets and schedules for Master-Worker applications running on shared HPRC resources

University of Tennessee, Knoxville: Trace

CiteSeerX

Combining Cubic Dynamical Solvers with Make/Break Heuristics to Solve SAT

Author: Burns Matthew
Huang Michael C.
Sharma Anshujit
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 26th International Conference on Theory and Applications of Satisfiability Testing (SAT 2023)
Publication date: 01/01/2023
Field of study

Dynamical solvers for combinatorial optimization are usually based on 2superscript{nd} degree polynomial interactions, such as the Ising model. These exhibit high success for problems that map naturally to their formulation. However, SAT requires higher degree of interactions. As such, these quadratic dynamical solvers (QDS) have shown poor solution quality due to excessive auxiliary variables and the resulting increase in search-space complexity. Thus recently, a series of cubic dynamical solver (CDS) models have been proposed for SAT and other problems. We show that such problem-agnostic CDS models still perform poorly on moderate to large problems, thus motivating the need to utilize SAT-specific heuristics. With this insight, our contributions can be summarized into three points. First, we demonstrate that existing make-only heuristics perform poorly on scale-free, industrial-like problems when integrated into CDS. This motivates us to utilize break counts as well. Second, we derive a relationship between make/break and the CDS formulation to efficiently recover break counts. Finally, we utilize this relationship to propose a new make/break heuristic and combine it with a state-of-the-art CDS which is projected to solve SAT problems several orders of magnitude faster than existing software solvers

Dagstuhl Research Online Publication Server

Design of Discrete-time Chaos-Based Systems for Hardware Security Applications

Author: Shanta Aysha
Publication venue: TRACE: Tennessee Research and Creative Exchange
Publication date: 01/08/2020
Field of study

Security of systems has become a major concern with the advent of technology. Researchers are proposing new security solutions every day in order to meet the area, power and performance specifications of the systems. The additional circuit required for security purposes can consume significant area and power. This work proposes a solution which utilizes discrete-time chaos-based logic gates to build a system which addresses multiple hardware security issues. The nonlinear dynamics of chaotic maps is leveraged to build a system that mitigates IC counterfeiting, IP piracy, overbuilding, disables hardware Trojan insertion and enables authentication of connecting devices (such as IoT and mobile). Chaos-based systems are also used to generate pseudo-random numbers for cryptographic applications.The chaotic map is the building block for the design of discrete-time chaos-based oscillator. The analog output of the oscillator is converted to digital value using a comparator in order to build logic gates. The logic gate is reconfigurable since different parameters in the circuit topology can be altered to implement multiple Boolean functions using the same system. The tuning parameters are control input, bifurcation parameter, iteration number and threshold voltage of the comparator. The proposed system is a hybrid between standard CMOS logic gates and reconfigurable chaos-based logic gates where original gates are replaced by chaos-based gates. The system works in two modes: logic locking and authentication. In logic locking mode, the goal is to ensure that the system achieves logic obfuscation in order to mitigate IC counterfeiting. The secret key for logic locking is made up of the tuning parameters of the chaotic oscillator. Each gate has 10-bit key which ensures that the key space is large which exponentially increases the computational complexity of any attack. In authentication mode, the aim of the system is to provide authentication of devices so that adversaries cannot connect to devices to learn confidential information. Chaos-based computing system is susceptible to process variation which can be leveraged to build a chaos-based PUF. The proposed system demonstrates near ideal PUF characteristics which means systems with large number of primary outputs can be used for authenticating devices

University of Tennessee, Knoxville: Trace

Practical Techniques for Improving Performance and Evaluating Security on Circuit Designs

Author: Xu Wenbin
Publication venue
Publication date: 20/11/2019
Field of study

As the modern semiconductor technology approaches to nanometer era, integrated circuits (ICs) are facing more and more challenges in meeting performance demand and security. With the expansion of markets in mobile and consumer electronics, the increasing demands require much faster delivery of reliable and secure IC products. In order to improve the performance and evaluate the security of emerging circuits, we present three practical techniques on approximate computing, split manufacturing and analog layout automation. Approximate computing is a promising approach for low-power IC design. Although a few accuracy-configurable adder (ACA) designs have been developed in the past, these designs tend to incur large area overheads as they rely on either redundant computing or complicated carry prediction. We investigate a simple ACA design that contains no redundancy or error detection/correction circuitry and uses very simple carry prediction. The simulation results show that our design dominates the latest previous work on accuracy-delay-power tradeoff while using 39% less area. One variant of this design provides finer-grained and larger tunability than that of the previous works. Moreover, we propose a delay-adaptive self-configuration technique to further improve the accuracy-delay-power tradeoff. Split manufacturing prevents attacks from an untrusted foundry. The untrusted foundry has front-end-of-line (FEOL) layout and the original circuit netlist and attempts to identify critical components on the layout for Trojan insertion. Although defense methods for this scenario have been developed, the corresponding attack technique is not well explored. Hence, the defense methods are mostly evaluated with the k-security metric without actual attacks. We develop a new attack technique based on structural pattern matching. Experimental comparison with existing attack shows that the new attack technique achieves about the same success rate with much faster speed for cases without the k-security defense, and has a much better success rate at the same runtime for cases with the k-security defense. The results offer an alternative and practical interpretation for k-security in split manufacturing. Analog layout automation is still far behind its digital counterpart. We develop the layout automation framework for analog/mixed-signal ICs. A hierarchical layout synthesis flow which works in bottom-up manner is presented. To ensure the qualified layouts for better circuit performance, we use the constraint-driven placement and routing methodology which employs the expert knowledge via design constraints. The constraint-driven placement uses simulated annealing process to find the optimal solution. The packing represented by sequence pairs and constraint graphs can simultaneously handle different kinds of placement constraints. The constraint-driven routing consists of two stages, integer linear programming (ILP) based global routing and sequential detailed routing. The experiment results demonstrate that our flow can handle complicated hierarchical designs with multiple design constraints. Furthermore, the placement performance can be further improved by using mixed-size block placement which works on large blocks in priority

Hardware Intellectual Property Protection Through Obfuscation Methods

Author: Blocklove Jason
Publication venue: RIT Scholar Works
Publication date: 30/06/2020
Field of study

Security is a growing concern in the hardware design world. At all stages of the Integrated Circuit (IC) lifecycle there are attacks which threaten to compromise the integrity of the design through piracy, reverse engineering, hardware Trojan insertion, physical attacks, and other side channel attacks — among other threats. Some of the most notable challenges in this field deal specifically with Intellectual Property (IP) theft and reverse engineering attacks. The IP being attacked can be ICs themselves, circuit designs making up those larger ICs, or configuration information for the devices like Field Programmable Gate Arrays (FPGAs). Custom or proprietary cryptographic components may require specific protections, as successfully attacking those could compromise the security of other aspects of the system. One method by which these concerns can be addressed is by introducing hardware obfuscation to the design in various forms. These methods of obfuscation must be evaluated for effectiveness and continually improved upon in order to match the growing concerns in this area. Several different forms of netlist-level hardware obfuscation were analyzed, on standard benchmarking circuits as well as on two substitution boxes from block ciphers. These obfuscation methods were attacked using a satisfiability (SAT) attack, which is able to iteratively rule out classes of keys at once and has been shown to be very effective against many forms of hardware obfuscation. It was ultimately shown that substitution boxes were naturally harder to break than the standard benchmarks using this attack, but some obfuscation methods still have substantially more security than others. The method which increased the difficulty of the attack the most was one which introduced a modified SIMON block cipher as a One-way Random Function (ORF) to be used for key generation. For a substitution box obfuscated in this way, the attack was found to be completely unsuccessful within a five-day window with a severely round-reduced implementation of SIMON and only a 32-bit obfuscation key

RIT Scholar Works

MaxSAT Evaluation 2020 : Solver and Benchmark Descriptions

Author
Publication venue: University of Helsinki, Department of Computer Science
Publication date: 01/01/2020
Field of study

Helsingin yliopiston digitaalinen arkisto

MaxSAT Evaluation 2020 : Solver and Benchmark Descriptions

Author
Publication venue: Department of Computer Science, University of Helsinki
Publication date: 01/01/2020
Field of study

Non peer reviewe

Helsingin yliopiston digitaalinen arkisto

Adaptive Integrated Circuit Design for Variation Resilience and Security

Author: Wang Jiafan
Publication venue
Publication date: 05/02/2018
Field of study

The past few decades witness the burgeoning development of integrated circuit in terms of process technology scaling. Along with the tremendous benefits coming from the scaling, challenges are also presented in various stages. During the design time, the complexity of developing a circuit with millions to billions of smaller size transistors is extended after the variations are taken into account. The difficulty of analyzing these nondeterministic properties makes the allocation scheme of redundant resource hardly work in a cost-efficient way. Besides fabrication variations, analog circuits are suffered from severe performance degradations owing to their physical attributes which are vulnerable to aging effects. As such, the post-silicon calibration approach gains increasing attentions to compensate the performance mismatch. For the user-end applications, additional system failures result from the pirated and counterfeited devices provided by the untrusted semiconductor supply chain. Again analog circuits show their weakness to this threat due to the shortage of piracy avoidance techniques. In this dissertation, we propose three adaptive integrated circuit designs to overcome these challenges respectively. The first one investigates the variability-aware gate implementation with the consideration of the overhead control of adaptivity assignment. This design improves the variation resilience typically for digital circuits while optimizing the power consumption and timing yield. The second design is implemented as a self-validation system for the calibration of diverse analog circuits. The system is completely integrated on chip to enhance the convenience without external assistance. In the last design, a classic analog component is further studied to establish the configurable locking mechanism for analog circuits. The use of Satisfiability Modulo Theories addresses the difficulty of searching the unique unlocking pattern of non-Boolean variables

Texas A&M Repository

Large-scale parallelism for constraint-based local search: the costas array case study

Author: Abreu Salvador
Caniou Yves
Codognet Philippe
Diaz Daniel
Richoux Florian
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015
Field of study

International audienceWe present the parallel implementation of a constraint-based Local Search algorithm and investigate its performance on several hardware plat-forms with several hundreds or thousands of cores. We chose as the basis for these experiments the Adaptive Search method, an efficient sequential Local Search method for Constraint Satisfaction Problems (CSP). After preliminary experiments on some CSPLib benchmarks, we detail the modeling and solving of a hard combinatorial problem related to radar and sonar applications: the Costas Array Problem. Performance evaluation on some classical CSP bench-marks shows that speedups are very good for a few tens of cores, and good up to a few hundreds of cores. However for a hard combinatorial search problem such as the Costas Array Problem, performance evaluation of the sequential version shows results outperforming previous Local Search implementations, while the parallel version shows nearly linear speedups up to 8,192 cores. The proposed parallel scheme is simple and based on independent multi-walks with no communication between processes during search. We also investigated a cooperative multi-walk scheme where processes share simple information, but this scheme does not seem to improve performance

HAL-ENS-LYON

CiteSeerX

INRIA a CCSD electronic archive server

HAL-Paris1

Repositório Científico da Universidade de Évora

Hal-Diderot