533 research outputs found

    Fast Modular Reduction for Large-Integer Multiplication

    Get PDF
    The work contained in this thesis is a representation of the successful attempt to speed-up the modular reduction as an independent step of modular multiplication, which is the central operation in public-key cryptosystems. Based on the properties of Mersenne and Quasi-Mersenne primes, four distinct sets of moduli have been described, which are responsible for converting the single-precision multiplication prevalent in many of today\u27s techniques into an addition operation and a few simple shift operations. A novel algorithm has been proposed for modular folding. With the backing of the special moduli sets, the proposed algorithm is shown to outperform (speed-wise) the Modified Barrett algorithm by 80% for operands of length 700 bits, the least speed-up being around 70% for smaller operands, in the range of around 100 bits

    Digital PLL for ISM applications

    Get PDF
    In modern transceivers, a low power PLL is a key block. It is known that with the evolution of technology, lower power and high performance circuitry is a challenging demand. In this thesis, a low power PLL is developed in order not to exceed 2mW of total power consumption. It is composed by small area blocks which is one of the main demands. The blocks that compose the PLL are widely abridged and the final solution is shown, showing why it is employed. The VCO block is a Current-Starved Ring Oscillator with a frequency range from 400MHz to 1.5GHz, with a 300μW to approximately 660μW power consumption. The divider is composed by six TSPC D Flip-Flop in series, forming a divide-by-64 divider. The Phase-Detector is a Dual D Flip-Flop detector with a charge pump. The PLL has less than a 2us lock time and presents a output oscillation of 1GHz, as expected. It also has a total power consumption of 1.3mW, therefore fulfilling all the specifications. The main contributions of this thesis are that this PLL can be applied in ISM applications due to its covering frequency range and low cost 130nm CMOS technology

    Synthetic Aperture Radar (SAR) data processing

    Get PDF
    The available and optimal methods for generating SAR imagery for NASA applications were identified. The SAR image quality and data processing requirements associated with these applications were studied. Mathematical operations and algorithms required to process sensor data into SAR imagery were defined. The architecture of SAR image formation processors was discussed, and technology necessary to implement the SAR data processors used in both general purpose and dedicated imaging systems was addressed

    DSP IMPLEMENTATION OF A DIGITAL NON-LINEAR INTERVAL CONTROL ALGORITHM FOR A QUASI-KEYHOLE PLASMA ARC WELDING PROCESS

    Get PDF
    The Quasi-Keyhole plasma arc welding (PAW) process is a relatively simple concept, which provides a basis for controlling the weld quality of a subject work piece by cycling the arc current between a static base and variable peak level. Since the weld quality is directly related to the degree of penetration and amount of heat that is generated and maintained in the system, the Non-Linear Interval Control Algorithm provides a methodology for maintaining these parameters within acceptable limits by controlling the arc current based upon measured peak current times. The Texas Instruments TMS320VC5416 DSK working in conjunction with Signalwares AED-109 Data Converter provides a hardware solution to implement this control algorithm. This study outlines this configuration process and demonstrates its validity

    Enhancing speed and scalability of the ParFlow simulation code

    Full text link
    Regional hydrology studies are often supported by high resolution simulations of subsurface flow that require expensive and extensive computations. Efficient usage of the latest high performance parallel computing systems becomes a necessity. The simulation software ParFlow has been demonstrated to meet this requirement and shown to have excellent solver scalability for up to 16,384 processes. In the present work we show that the code requires further enhancements in order to fully take advantage of current petascale machines. We identify ParFlow's way of parallelization of the computational mesh as a central bottleneck. We propose to reorganize this subsystem using fast mesh partition algorithms provided by the parallel adaptive mesh refinement library p4est. We realize this in a minimally invasive manner by modifying selected parts of the code to reinterpret the existing mesh data structures. We evaluate the scaling performance of the modified version of ParFlow, demonstrating good weak and strong scaling up to 458k cores of the Juqueen supercomputer, and test an example application at large scale.Comment: The final publication is available at link.springer.co
    corecore