19 research outputs found
Asymmetrical three-phase fault evaluation in a distribution network using the genetic algorithm and the particle swarm optimisation
Abstract: Modern electric power systems are made up of three main sub-systems: generation; transmission; and distribution. The most common faults in distribution sub-systems are asymmetrical three-phase short circuit faults due to the fact that asymmetrical three-phase faults can be: line-to-line faults; two lines-to-earth faults; and single line-to-earth faults. This increases their probability of occurrence, unlike symmetrical three-phase faults which can only occur when all the three phases have been simultaneously shorted. Standard IEC 60909 and IEC 61363 provide all the basic information that is used for the detection of short circuit faults. However, the two standards use numerous estimates in their faults evaluation procedures. They estimate voltage factors (c), impedance correction factors (k), resistance to reactance ratios (R/X), resistance to impedance ratios (R/Z) and various other scaling factors for rotating machines. These IEC estimates are not evenly distributed throughout the 550kV and as such, they do not sufficiently cater for every nominal voltage. When the need arises, the user has to estimate these values accordingly. This research presents a genetic algorithm (GA) and a particle swarm optimisation (PSO) for the detection of asymmetrical three-phase short circuit faults within electric distribution networks of power systems with nominal voltages less than 550kV. GA and PSO are nature-inspired optimisation techniques. Although PSO has quick convergence, it suffers from partial optimism and premature stagnation. Some innovative coding adjustments were made in the creation of initial positions and particle distribution within the swarm. The GA struggles with: survival rates of individuals; stalling during optimisation; and proper gene replacements. Coding adjustments were also made to GA with regards to: strategic gene replacements; crossover when combining the properties of parents; and the arrangement of scores and expectation. Pattern search and Fmincon algorithms were also added to both algorithms as minimisation functions that commence after the evolutionary algorithms (EAs) terminate. The EAs were initially tested on the Rastrigin and Rosenbrock functions to ensure their efficiencies. During fault detection, the developed EAs were used to stochastically determine some of the most crucial estimates (R/X and R/Z ratios). The proposed methodology would compute these values on a case-to-case basis for every optimisation case with regards to the parameters and unique specifications of the power system. The EAs were tested on a nominal voltage that is properly catered for by Standard IEC. They obtained ratios, impedances and currents that were within an approximate range to the IEC values for that nominal voltage. This further implies that EAs can be reliably used to: stochastically determine these ratios; compute impedances; and detect fault currents for all the nominal voltages including those that are not sufficiently catered for by Standard IEC. Since R/X and R/Z ratios play a key role in determining the upstream and fault point impedances, the proposed methodology can be used to compute much more precise fault magnitudes at various network levels thereby setting up and repairing power systems sufficiently.M.Ing. (Electrical and Electronic Engineering Science
High performance reconfigurable architectures for biological sequence alignment
Bioinformatics and computational biology (BCB) is a rapidly developing
multidisciplinary field which encompasses a wide range of domains, including genomic
sequence alignments. It is a fundamental tool in molecular biology in searching for
homology between sequences. Sequence alignments are currently gaining close attention due
to their great impact on the quality aspects of life such as facilitating early disease diagnosis,
identifying the characteristics of a newly discovered sequence, and drug engineering. With
the vast growth of genomic data, searching for a sequence homology over huge databases
(often measured in gigabytes) is unable to produce results within a realistic time, hence the
need for acceleration. Since the exponential increase of biological databases as a result of the
human genome project (HGP), supercomputers and other parallel architectures such as the
special purpose Very Large Scale Integration (VLSI) chip, Graphic Processing Unit (GPUs)
and Field Programmable Gate Arrays (FPGAs) have become popular acceleration platforms.
Nevertheless, there are always trade-off between area, speed, power, cost, development time
and reusability when selecting an acceleration platform. FPGAs generally offer more
flexibility, higher performance and lower overheads. However, they suffer from a relatively
low level programming model as compared with off-the-shelf microprocessors such as
standard microprocessors and GPUs. Due to the aforementioned limitations, the need has
arisen for optimized FPGA core implementations which are crucial for this technology to
become viable in high performance computing (HPC).
This research proposes the use of state-of-the-art reprogrammable system-on-chip
technology on FPGAs to accelerate three widely-used sequence alignment algorithms; the
Smith-Waterman with affine gap penalty algorithm, the profile hidden Markov model
(HMM) algorithm and the Basic Local Alignment Search Tool (BLAST) algorithm. The
three novel aspects of this research are firstly that the algorithms are designed and
implemented in hardware, with each core achieving the highest performance compared to the
state-of-the-art. Secondly, an efficient scheduling strategy based on the double buffering
technique is adopted into the hardware architectures. Here, when the alignment matrix
computation task is overlapped with the PE configuration in a folded systolic array, the
overall throughput of the core is significantly increased. This is due to the bound PE
configuration time and the parallel PE configuration approach irrespective of the number of
PEs in a systolic array. In addition, the use of only two configuration elements in the PE optimizes hardware resources and enables the scalability of PE systolic arrays without
relying on restricted onboard memory resources. Finally, a new performance metric is
devised, which facilitates the effective comparison of design performance between different
FPGA devices and families. The normalized performance indicator (speed-up per area per
process technology) takes out advantages of the area and lithography technology of any
FPGA resulting in fairer comparisons.
The cores have been designed using Verilog HDL and prototyped on the Alpha Data
ADM-XRC-5LX card with the Virtex-5 XC5VLX110-3FF1153 FPGA. The implementation
results show that the proposed architectures achieved giga cell updates per second (GCUPS)
performances of 26.8, 29.5 and 24.2 respectively for the acceleration of the Smith-Waterman
with affine gap penalty algorithm, the profile HMM algorithm and the BLAST algorithm. In
terms of speed-up improvements, comparisons were made on performance of the designed
cores against their corresponding software and the reported FPGA implementations. In the
case of comparison with equivalent software execution, acceleration of the optimal
alignment algorithm in hardware yielded an average speed-up of 269x as compared to the
SSEARCH 35 software. For the profile HMM-based sequence alignment, the designed core
achieved speed-up of 103x and 8.3x against the HMMER 2.0 and the latest version of
HMMER (version 3.0) respectively. On the other hand, the implementation of the gapped
BLAST with the two-hit method in hardware achieved a greater than tenfold speed-up
compared to the latest NCBI BLAST software. In terms of comparison against other reported
FPGA implementations, the proposed normalized performance indicator was used to
evaluate the designed architectures fairly. The results showed that the first architecture
achieved more than 50 percent improvement, while acceleration of the profile HMM
sequence alignment in hardware gained a normalized speed-up of 1.34. In the case of the
gapped BLAST with the two-hit method, the designed core achieved 11x speed-up after
taking out advantages of the Virtex-5 FPGA. In addition, further analysis was conducted in
terms of cost and power performances; it was noted that, the core achieved 0.46 MCUPS per
dollar spent and 958.1 MCUPS per watt. This shows that FPGAs can be an attractive
platform for high performance computation with advantages of smaller area footprint as well
as represent economic ‘green’ solution compared to the other acceleration platforms. Higher
throughput can be achieved by redeploying the cores on newer, bigger and faster FPGAs
with minimal design effort
Microarchitecture Choices and Tradeoffs for Maximizing Processing Efficiency.
This thesis is concerned with hardware approaches for maximizing the number of independent instructions in the execution core and thereby maximizing the processing efficiency for a given amount of compute bandwidth. Compute bandwidth is the number of parallel execution units multiplied by the pipelining of those units in the processor. Keeping those computing elements busy is key to maximize processing efficiency and therefore power efficiency.
While some applications have many independent instructions that can be issued in parallel without inefficiencies due to branch behavior, cache behavior, or instruction dependencies, most applications have limited parallelism and plenty of stalling conditions.
This thesis presents two approaches to this problem, which in combination greatly increases the efficiency of the processor utilization of resources. The first approach addresses the problem of small basic blocks that arise when code has frequent branches. We introduce algorithms and mechanisms to predict multiple branches simultaneously and to fetch multiple non-continuous basic blocks every cycle along a predicted branch path. This makes what was previously an inherently serial process into a parallelized instruction fetch approach. For integer applications, the result is an increase in useful instruction fetch capacity of 40% when two basic blocks are fetched per cycle and 63% for three blocks per cycle. For floating point benchmarks, the associated improvement is 27% and 59%.
The second approach addresses increasing the number of independent instructions to the execution core through simultaneous multi-threading (SMT). We compare to another multithreading approach, Switch-on-Event multithreading, and show that SMT is far superior. Intel Pentium 4 SMT microarchitecture algorithms are analyzed, and we look at the impact of SMT on power efficiency of the Pentium 4 Processor. A new metric, the SMT Energy Benefit is defined. Not only do we show that the SMT Energy Benefit for a given workload with SMT can be quite significant, we also generalize the results and build a model for what other future processors’ SMT Energy Benefit would be. We conclude that while SMT will continue to be an energy-efficient feature, as processors get more energy-efficient in general the relative SMT Energy Benefit may be reduced.Ph.D.Computer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/61740/1/dtmarr_1.pd
Recommended from our members
SCALE: A modular code system for performing standardized computer analyses for licensing evaluation
This Manual represents Revision 5 of the user documentation for the modular code system referred to as SCALE. The history of the SCALE code system dates back to 1969 when the current Computational Physics and Engineering Division at Oak Ridge National Laboratory (ORNL) began providing the transportation package certification staff at the U.S. Atomic Energy Commission with computational support in the use of the new KENO code for performing criticality safety assessments with the statistical Monte Carlo method. From 1969 to 1976 the certification staff relied on the ORNL staff to assist them in the correct use of codes and data for criticality, shielding, and heat transfer analyses of transportation packages. However, the certification staff learned that, with only occasional use of the codes, it was difficult to become proficient in performing the calculations often needed for an independent safety review. Thus, shortly after the move of the certification staff to the U.S. Nuclear Regulatory Commission (NRC), the NRC staff proposed the development of an easy-to-use analysis system that provided the technical capabilities of the individual modules with which they were familiar. With this proposal, the concept of the Standardized Computer Analyses for Licensing Evaluation (SCALE) code system was born. This manual covers an array of modules written for the SCALE package, consisting of drivers, system libraries, cross section and materials properties libraries, input/output routines, storage modules, and help files
Recommended from our members
SCALE: A modular code system for performing standardized computer analyses for licensing evaluation: Control modules C4, C6
This Manual represents Revision 5 of the user documentation for the modular code system referred to as SCALE. The history of the SCALE code system dates back to 1969 when the current Computational Physics and Engineering Division at Oak Ridge National Laboratory (ORNL) began providing the transportation package certification staff at the U. S. Atomic Energy Commission with computational support in the use of the new KENO code for performing criticality safety assessments with the statistical Monte Carlo method. From 1969 to 1976 the certification staff relied on the ORNL staff to assist them in the correct use of codes and data for criticality, shielding, and heat transfer analyses of transportation packages. However, the certification staff learned that, with only occasional use of the codes, it was difficult to become proficient in performing the calculations often needed for an independent safety review. Thus, shortly after the move of the certification staff to the U.S. Nuclear Regulatory Commission (NRC), the NRC staff proposed the development of an easy-to-use analysis system that provided the technical capabilities of the individual modules with which they were familiar. With this proposal, the concept of the Standardized Computer Analyses for Licensing Evaluation (SCALE) code system was born. This volume is part of the manual related to the control modules for the newest updated version of this computational package