Fast Modular Reduction for Large-Integer Multiplication
This thesis presents a successful attempt to speed up modular reduction as an independent step of modular multiplication, the central operation in public-key cryptosystems. Based on the properties of Mersenne and Quasi-Mersenne primes, four distinct sets of moduli are described that convert the single-precision multiplication prevalent in many of today's techniques into an addition and a few simple shift operations. A novel algorithm is proposed for modular folding. Backed by the special moduli sets, the proposed algorithm is shown to outperform the Modified Barrett algorithm in speed by 80% for operands of length 700 bits, with the smallest speed-up being around 70% for smaller operands of around 100 bits.
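The shift-and-add idea behind Mersenne moduli can be illustrated with a minimal sketch. This is the textbook reduction modulo a Mersenne number p = 2^k - 1, not the thesis's folding algorithm or any of its four moduli sets; the function name `mersenne_reduce` and the parameter `k` are assumptions made for illustration only.

```python
def mersenne_reduce(x: int, k: int) -> int:
    """Reduce a non-negative integer x modulo p = 2**k - 1.

    Illustrative sketch only: uses the identity 2**k ≡ 1 (mod p), so the
    high part of x can be added onto the low part instead of dividing.
    """
    if x < 0 or k < 1:
        raise ValueError("x must be non-negative and k must be at least 1")
    p = (1 << k) - 1          # the Mersenne modulus, also usable as a k-bit mask
    while x > p:
        # Split x into low k bits and the remaining high bits, then add them.
        x = (x & p) + (x >> k)
    # After folding, x is in [0, p]; p itself is congruent to 0.
    return 0 if x == p else x


if __name__ == "__main__":
    k = 127                   # 2**127 - 1 is a Mersenne prime
    a = 0x1234567890ABCDEF1234567890ABCDEF
    b = 0x0FEDCBA0987654321FEDCBA098765432
    product = a * b
    # Compare against Python's built-in remainder as a sanity check.
    print(mersenne_reduce(product, k) == product % ((1 << k) - 1))
```

No multiplication or division appears in the reduction itself: each pass is one mask, one shift, and one addition. For a modulus of the form 2^k - c with small c, the same fold would need an extra multiplication of the high part by c.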
Digital PLL for ISM applications
In modern transceivers, a low-power PLL is a key block. As technology
evolves, combining lower power with high performance is an increasingly
challenging demand.
In this thesis, a low-power PLL is developed with a total power budget of 2 mW.
It is composed of small-area blocks, small area being one of the main requirements.
The blocks that compose the PLL are briefly described and the final solution is
presented, along with the reasons for choosing it. The VCO block is a Current-Starved
Ring Oscillator with a frequency range from 400 MHz to 1.5 GHz and a power consumption
of 300 μW to approximately 660 μW. The divider is composed of six TSPC D flip-flops in
series, forming a divide-by-64 divider. The phase detector is a dual D flip-flop detector
with a charge pump. The PLL has a lock time of less than 2 μs and presents an output
oscillation of 1 GHz, as expected. It also has a total power consumption of 1.3 mW,
therefore fulfilling all the specifications.
The main contribution of this thesis is a PLL that can be applied in ISM applications,
owing to the frequency range it covers and its low-cost 130 nm CMOS technology.
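The divide-by-64 figure follows from chaining six divide-by-2 stages (2^6 = 64). A minimal behavioural sketch is given below, assuming each flip-flop is wired as a toggle stage clocked by the rising edge of the previous stage's output (a ripple arrangement). The abstract does not describe the wiring, so this arrangement and the name `count_output_rising_edges` are assumptions for illustration.

```python
def count_output_rising_edges(input_edges: int, stages: int = 6) -> int:
    """Count rising edges at the output of a chain of divide-by-2 stages.

    Behavioural sketch only (assumed ripple wiring): stage 0 toggles on every
    input rising edge, and stage i+1 toggles whenever stage i rises.
    """
    states = [0] * stages     # all flip-flop outputs start low
    output_rising = 0
    for _ in range(input_edges):
        for i in range(stages):
            states[i] ^= 1                    # this stage sees a rising clock edge
            if states[i] == 0:
                break                         # falling output: next stage is not clocked
            if i == stages - 1:
                output_rising += 1            # last stage produced a rising edge
    return output_rising


if __name__ == "__main__":
    # 640 input edges through six stages should give 640 / 64 output edges.
    print(count_output_rising_edges(640))
```

Under these assumptions, a 1 GHz output divided by 64 would give a 15.625 MHz signal at the phase detector; the abstract itself does not state the reference frequency.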
Synthetic Aperture Radar (SAR) data processing
The available and optimal methods for generating SAR imagery for NASA applications were identified. The SAR image quality and data processing requirements associated with these applications were studied. Mathematical operations and algorithms required to process sensor data into SAR imagery were defined. The architecture of SAR image formation processors was discussed, and the technology necessary to implement the SAR data processors used in both general-purpose and dedicated imaging systems was addressed.
DSP IMPLEMENTATION OF A DIGITAL NON-LINEAR INTERVAL CONTROL ALGORITHM FOR A QUASI-KEYHOLE PLASMA ARC WELDING PROCESS
The Quasi-Keyhole plasma arc welding (PAW) process is a relatively simple concept that provides a basis for controlling the weld quality of a workpiece by cycling the arc current between a static base level and a variable peak level. Since the weld quality is directly related to the degree of penetration and the amount of heat generated and maintained in the system, the Non-Linear Interval Control Algorithm provides a methodology for keeping these parameters within acceptable limits by controlling the arc current based on measured peak current times. The Texas Instruments TMS320VC5416 DSK, working in conjunction with Signalware's AED-109 Data Converter, provides a hardware solution to implement this control algorithm. This study outlines the configuration process and demonstrates its validity.
Enhancing speed and scalability of the ParFlow simulation code
Regional hydrology studies are often supported by high resolution simulations
of subsurface flow that require expensive and extensive computations. Efficient
usage of the latest high performance parallel computing systems becomes a
necessity. The simulation software ParFlow has been demonstrated to meet this
requirement and shown to have excellent solver scalability for up to 16,384
processes. In the present work we show that the code requires further
enhancements in order to fully take advantage of current petascale machines. We
identify ParFlow's way of parallelization of the computational mesh as a
central bottleneck. We propose to reorganize this subsystem using fast mesh
partition algorithms provided by the parallel adaptive mesh refinement library
p4est. We realize this in a minimally invasive manner by modifying selected
parts of the code to reinterpret the existing mesh data structures. We evaluate
the scaling performance of the modified version of ParFlow, demonstrating good
weak and strong scaling up to 458k cores of the Juqueen supercomputer, and test
an example application at large scale.
Comment: The final publication is available at link.springer.co