
    Comparing Halton and Sobol Sequences in Integral Evaluation

    Halton and Sobol sequences are two of the most popular low-discrepancy number sets used in quasi-Monte Carlo methods, where they replace pseudo-random numbers in the evaluation of integrals. In this paper, the two sequences are compared in terms of the size of the number sets and the dimensionality. The comparison is implemented in MATLAB for evaluating numerical integrals. The absolute error, i.e., the absolute difference between the exact and estimated values of the integral, is plotted against dimension for different test functions. The practical results show that, except in the first dimension, the Sobol sequence performs better than the Halton sequence, and that the Sobol sequence outputs are also more stable.
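
    Below is a minimal sketch of the kind of comparison the paper describes, written in Python rather than the paper's MATLAB code; the test integrand, dimension, and sample size are illustrative assumptions, and scipy.stats.qmc (SciPy >= 1.7) is assumed to be available.

        import numpy as np
        from scipy.stats import qmc

        def f(x):
            # Separable test integrand: product of 1.5 * (1 - x_i^2); its exact
            # integral over the unit hypercube [0, 1]^d is 1 for any dimension d.
            return np.prod(1.5 * (1.0 - x**2), axis=1)

        dim, n, exact = 6, 2**12, 1.0
        for name, engine in [("Halton", qmc.Halton(d=dim, scramble=False)),
                             ("Sobol ", qmc.Sobol(d=dim, scramble=False))]:
            pts = engine.random(n)            # n low-discrepancy points in [0, 1]^dim
            estimate = f(pts).mean()          # quasi-Monte Carlo estimate of the integral
            print(f"{name} absolute error = {abs(exact - estimate):.2e}")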

    Development of a Quasi-Monte Carlo Method for Thermal Radiation

    Radiative heat transfer in participating media is among the most challenging computational engineering problems due to the complex nonlinear, nonlocal nature of radiation transport. Many approximate methods have been developed to resolve radiative heat transfer in participating media, but approximate methods, by the nature of their approximations, suffer from various shortcomings in both accuracy and robustness. The only methods that can resolve radiative transfer accurately in all configurations are the statistical Monte Carlo-based methods. While the Monte Carlo (MC) method is the most accurate method for resolving radiative heat transfer, it is also notoriously computationally prohibitive in large-scale simulations. To overcome this computational burden, this study details the development of a quasi-Monte Carlo (QMC) method for thermal radiation in participating media with a focus on combustion-related problems. The QMC method employs a low-discrepancy sequence (LDS) in place of the traditional random number sampling mechanism used in Monte Carlo methods to increase computational efficiency. To analyze the performance of the QMC method, a systematic comparison of accuracy and computational expense was performed. The QMC method was validated against formal solutions of radiative heat transfer in several one-dimensional configurations and extended to three practical combustion configurations: a turbulent jet flame, a high-pressure industrial gas turbine, and a high-pressure spray combustion chamber. The results from QMC and traditional Monte Carlo are compared against benchmark solutions for each case. It is shown that the accuracy of the radiation field predicted by QMC is comparable to that of MC at lower computational cost. Three different low-discrepancy sequences – Sobol, Halton, and Niederreiter – were examined as part of this work. Finally, recommendations are made on the choice of sequence and the number of dimensions of the LDS for combustion-relevant configurations. In conclusion, the significant improvements in computational cost and accuracy seen in the QMC method make it a viable alternative to traditional Monte Carlo methods in high-fidelity simulations.
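
    The core mechanism described above, swapping the pseudo-random sampler for a low-discrepancy sequence, can be illustrated with a small sketch; this is not the dissertation's solver, and the isotropic emission-direction sampling and the use of scipy.stats.qmc are assumptions made for illustration only.

        import numpy as np
        from scipy.stats import qmc

        def emission_directions(n, sampler):
            """Map n points in [0, 1)^2 to unit vectors on the sphere (isotropic emission)."""
            u = sampler(n)                    # (n, 2) array, pseudo- or quasi-random
            mu = 1.0 - 2.0 * u[:, 0]          # cos(theta), uniform on [-1, 1]
            phi = 2.0 * np.pi * u[:, 1]
            s = np.sqrt(1.0 - mu**2)
            return np.stack([s * np.cos(phi), s * np.sin(phi), mu], axis=1)

        rng = np.random.default_rng(0)
        mc_dirs = emission_directions(2**10, lambda n: rng.random((n, 2)))            # traditional MC sampling
        qmc_dirs = emission_directions(2**10, qmc.Sobol(d=2, scramble=True).random)   # LDS-driven QMC sampling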

    Low-Impact Profiling of Streaming, Heterogeneous Applications

    Computer engineers are continually faced with the task of translating improvements in fabrication process technology (i.e., Moore's Law) into architectures that allow computer scientists to accelerate application performance. As feature size continues to shrink, architects of commodity processors are designing increasingly more cores on a chip. While additional cores can operate independently on some tasks (e.g., the OS and user tasks), many applications see little to no improvement from adding more processor cores alone. For many applications, heterogeneous systems offer a path toward higher performance. Significant performance and power gains have been realized by combining specialized processors (e.g., Field-Programmable Gate Arrays, Graphics Processing Units) with general-purpose multi-core processors. Heterogeneous applications need to be programmed differently than traditional software. One approach, stream processing, fits these systems particularly well because of the segmented memories and explicit expression of parallelism. Unfortunately, debugging and performance tools that support streaming, heterogeneous applications do not exist. This dissertation presents TimeTrial, a performance measurement system that enables performance optimization of streaming applications by profiling the application deployed on a heterogeneous system. TimeTrial performs low-impact measurements by dedicating computing resources to monitoring and by aggressively compressing performance traces into statistical summaries guided by user specification of the performance queries of interest.
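
    As a purely hypothetical illustration of the trace-compression idea (not TimeTrial's actual implementation), the sketch below folds each measurement into a constant-size running summary instead of storing the full trace, which is what keeps the monitoring overhead low.

        from dataclasses import dataclass

        @dataclass
        class StreamSummary:
            """Constant-size summary of a stream of measurements (Welford's method)."""
            count: int = 0
            mean: float = 0.0
            m2: float = 0.0                   # running sum of squared deviations
            minimum: float = float("inf")
            maximum: float = float("-inf")

            def add(self, x: float) -> None:
                self.count += 1
                delta = x - self.mean
                self.mean += delta / self.count
                self.m2 += delta * (x - self.mean)
                self.minimum = min(self.minimum, x)
                self.maximum = max(self.maximum, x)

            @property
            def variance(self) -> float:
                return self.m2 / self.count if self.count else 0.0

        summary = StreamSummary()
        for latency_us in (12.0, 15.5, 11.2, 40.3, 13.1):   # e.g., per-item queue latencies
            summary.add(latency_us)
        print(summary.count, round(summary.mean, 2), round(summary.variance, 2))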

    Automated Dynamic Error Analysis Methods for Optimization of Computer Arithmetic Systems

    Computer arithmetic is one of the more important topics within computer science and engineering. The earliest computer systems were designed to perform arithmetic operations, and most if not all digital systems are required to perform some sort of arithmetic as part of their normal operation. This reliance on the arithmetic operations of computers means the accurate representation of real numbers within digital systems is vital, and an understanding of how these systems are implemented and their possible drawbacks is essential in order to design and implement modern high-performance systems. At present the most widely implemented system for computer arithmetic is the IEEE 754 floating-point standard; while this system is deemed to be the best available implementation, it has several features that can result in serious errors of computation if not implemented correctly. Lack of understanding of these errors and their effects has led to real-world disasters on several occasions. Systems for the detection of these errors are therefore highly important, and fast, efficient, and easy-to-use implementations of these detection systems are a high priority. Detection of floating-point rounding errors normally requires run-time analysis in order to be effective. Several systems have been proposed for the analysis of floating-point arithmetic, including Interval Arithmetic, Affine Arithmetic, and Monte Carlo Arithmetic. While these systems have been well studied using theoretical and software-based approaches, implementations that can be applied to real-world situations have been limited by issues with implementation, performance, and scalability. The majority of implementations have been software-based and have not taken advantage of the performance gains associated with hardware-accelerated computer arithmetic systems. This is especially problematic when it is considered that systems requiring high accuracy will often also require high performance. The aim of this thesis and the associated research is to increase understanding of error and error analysis methods through the development of easy-to-use and easy-to-understand implementations of these techniques.
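
    A minimal software sketch of the Monte Carlo Arithmetic idea is given below; it is illustrative only (the thesis targets hardware-accelerated implementations), and the perturbation size, test function, and run count are assumptions. Each operation's result is perturbed by a random relative error on the order of the unit roundoff, and the spread across repeated runs indicates how many digits of the final result can be trusted.

        import math
        import random

        EPS = 2.0**-52  # double-precision unit roundoff (assumed for this sketch)

        def perturb(v):
            """Inject a random rounding-style error of roughly one ulp into a result."""
            return v * (1.0 + random.uniform(-EPS, EPS))

        def cancellation_prone(x):
            """(1 - cos(x)) / x^2 loses significant digits to cancellation for small x."""
            c = perturb(math.cos(x))
            return perturb(perturb(1.0 - c) / perturb(x * x))

        runs = [cancellation_prone(1.0e-6) for _ in range(1000)]
        mean = sum(runs) / len(runs)
        std = math.sqrt(sum((r - mean) ** 2 for r in runs) / len(runs))
        print(f"mean={mean:.6f}  spread={std:.2e}  ~{math.log10(abs(mean) / std):.1f} reliable digits")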

    Methodology for complex dataflow application development

    This thesis addresses problems inherent to the development of complex applications for reconfigurable systems. Many projects fail to complete or take much longer than originally estimated by relying on traditional iterative software development processes typically used with conventional computers. Even though designer productivity can be increased by abstract programming and execution models, e.g., dataflow, development methodologies considering the specific properties of reconfigurable systems do not exist. The first contribution of this thesis is a design methodology to facilitate systematic development of complex applications using reconfigurable hardware in the context of High-Performance Computing (HPC). The proposed methodology is built upon a careful analysis of the original application, a software model of the intended hardware system, an analytical prediction of performance and on-chip area usage, and an iterative architectural refinement to resolve identified bottlenecks before writing a single line of code targeting the reconfigurable hardware. It is successfully validated using two real applications, and both achieve state-of-the-art performance. The second contribution extends this methodology to provide portability between devices in two steps. First, additional tool support for contemporary multi-die Field-Programmable Gate Arrays (FPGAs) is developed. An algorithm to automatically map logical memories to heterogeneous physical memories with special attention to die boundaries is proposed. As a result, only the proposed algorithm managed to successfully place and route all designs used in the evaluation, while the second-best algorithm failed on one third of all large applications. Second, best practices for performance portability between different FPGA devices are collected and evaluated on a financial use case, showing efficient resource usage on five different platforms. The third contribution applies the extended methodology to a real, highly demanding emerging application from the radiotherapy domain. A Monte Carlo-based simulation of dose accumulation in human tissue is accelerated using the proposed methodology to meet the real-time requirements of adaptive radiotherapy.

    An Adaptive Modular Redundancy Technique to Self-regulate Availability, Area, and Energy Consumption in Mission-critical Applications

    As reconfigurable devices' capacities and the complexity of applications that use them increase, the need for self-reliance of deployed systems becomes increasingly prominent. A Sustainable Modular Adaptive Redundancy Technique (SMART) composed of a dual-layered organic system is proposed, analyzed, implemented, and experimentally evaluated. SMART relies upon a variety of self-regulating properties to control availability, energy consumption, and area used in dynamically-changing environments that require a high degree of adaptation. The hardware layer is implemented on a Xilinx Virtex-4 Field Programmable Gate Array (FPGA) to provide self-repair using a novel approach called a Reconfigurable Adaptive Redundancy System (RARS). The software layer supervises the organic activities within the FPGA and extends the self-healing capabilities through application-independent, intrinsic, evolutionary repair techniques to leverage the benefits of dynamic Partial Reconfiguration (PR). A SMART prototype is evaluated using a Sobel edge detection application. This prototype is shown to provide sustainability under stressful transient and permanent fault injection procedures while still reducing energy consumption and area requirements. An Organic Genetic Algorithm (OGA) technique is shown capable of consistently repairing hard faults while maintaining correct edge detector outputs, by exploiting spatial redundancy in the reconfigurable hardware. A Monte Carlo-driven Continuous-Time Markov Chain (CTMC) simulation is conducted to compare SMART's availability to industry-standard Triple Modular Redundancy (TMR) techniques. Based on nine use cases, parameterized with realistic fault and repair rates acquired from publicly available sources, the results indicate that availability is significantly enhanced by the adoption of fast repair techniques targeting aging-related hard faults. Under harsh environments, SMART is shown to improve system availability from 36.02% with lengthy repair techniques to 98.84% with fast ones. This value increases to five nines (99.9998%) under relatively more favorable conditions. Lastly, SMART is compared to twenty-eight standard TMR benchmarks generated by the widely accepted BL-TMR tools. Results show that in seven out of nine use cases, SMART is the recommended technique, with power savings ranging from 22% to 29% and area savings ranging from 17% to 24%, while still maintaining the same level of availability.
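
    To make the availability comparison concrete, the sketch below runs a Monte Carlo simulation of a two-state continuous-time Markov chain (working/failed); the failure and repair rates are hypothetical placeholders, not the dissertation's nine parameterized use cases.

        import random

        def simulated_availability(failure_rate, repair_rate, horizon=1.0e7, seed=0):
            """Fraction of the simulated horizon spent in the 'working' state."""
            rng = random.Random(seed)
            t, up_time, working = 0.0, 0.0, True
            while t < horizon:
                rate = failure_rate if working else repair_rate
                dwell = rng.expovariate(rate)        # exponential sojourn time in the current state
                if working:
                    up_time += min(dwell, horizon - t)
                t += dwell
                working = not working
            return up_time / horizon

        lam = 1.0e-3                                  # hypothetical failure rate (per hour)
        for label, mu in [("lengthy repair", 2.0e-3), ("fast repair", 1.0e-1)]:
            a = simulated_availability(lam, mu)
            print(f"{label:14s} availability ~ {a:.4f} (analytic {mu / (lam + mu):.4f})")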

    Reconfigurable Antennas for Beam-Space MIMO Transmission with a Single Radio

    MIMO techniques allow remarkable improvements in the reliability and/or transmission rate of wireless communication systems. However, there are several major challenges to implementing the conventional MIMO concept in terminals with size, cost, and power constraints. First, insufficient space impedes the design of efficient and decorrelated MIMO antennas. Second, MIMO traditionally demands that each antenna be fed by its own RF chain, which in turn results in greater hardware complexity, larger power consumption, and higher implementation cost. Among all the reduced-complexity and antenna-decoupling schemes proposed so far, the so-called beam-space MIMO has attracted a great deal of interest as a potential solution for addressing both problems concurrently. The key idea is to engineer the radiation pattern of a single-feed antenna structure in each symbol period, such that multiple independent symbols directly modulate a predefined set of orthogonal virtual patterns in the far field, therefore allowing true MIMO transmission using a single RF chain and a compact antenna structure. More importantly for practice, the transmitted information can be retrieved using a conventional MIMO receiver. However, the transformation of this idea into reality entails dealing with various practical aspects that are commonly overlooked in theoretical and conceptual developments. This dissertation explores the beam-space MIMO concept from the perspective of antenna engineering and aims at addressing several key issues associated with the actual design and implementation of beam-space MIMO systems. The early developments of beam-space MIMO concerned switched parasitic arrays. However, the requirement of utilizing several physically separate radiators is inconvenient for practical implementation in compact portable devices. To solve this problem, a single-radiator load-modulated antenna solution is proposed in this dissertation. Another primary challenge is emulating high-order modulation schemes such as PSK with realistic hardware. Here, an efficient beam-space MIMO strategy is developed that allows transmitting PSK data streams of any modulation order using only purely reactive reconfigurable loads, without the need for a symbol-rate dynamic matching network. The approach is illustrated by the design and fabrication of a realistic antenna for QPSK signaling. The performance of a beam-space MIMO system that utilizes the fabricated antenna is then investigated through over-the-air experiments and compared with conventional MIMO in realistic environments. Because they embed information in the radiation patterns, beam-space MIMO systems are expected to be inherently prone to multiplexing performance degradation in the presence of external field perturbation. This makes the study of near-field interaction effects on beam-space MIMO distinct from those carried out for conventional systems, and the issue is considered for the first time in this dissertation. Moreover, like any reconfigurable system, a beam-space MIMO system may suffer from bandwidth expansion of the transmitted signals. The final part of the work is directed towards this important issue: to reduce out-of-band radiation, a solution based on shaping the time-domain response of the reconfigurable components is presented. The studies presented in this thesis constitute a crucial step towards MIMO with simpler and cheaper hardware for real-life terminals.