14 research outputs found

    Data Structures & Algorithm Analysis in C++

    Get PDF
    This is the textbook for CSIS 215 at Liberty University.

    Improving performance of genetic algorithms by using novel fitness functions

    Get PDF
    This thesis introduces Intelligent Fitness Functions and Partial Fitness Functions, both of which can improve the performance of a genetic algorithm that is limited to a fixed run time. An Intelligent Fitness Function is defined as a fitness function with a memory. The memory is used to store information about individuals so that duplicate individuals do not need to have their fitness tested. Different types of memory (long and short term) and different storage strategies (fitness-based, time-based and frequency-based) have been tested. The results show that an intelligent fitness function with a time-based long-term memory improves the efficiency of a genetic algorithm the most. A Partial Fitness Function is defined as a fitness function that only partially tests the fitness of an individual at each generation, so only promising individuals are fully tested. Using a partial fitness function gives the genetic algorithm more evolutionary steps in the same length of time as a genetic algorithm using a normal fitness function. The results show that a genetic algorithm using a partial fitness function can achieve higher fitness levels than one using a normal fitness function. Finally, a genetic algorithm designed to solve a substitution cipher is compared to one equipped with an intelligent fitness function and another equipped with a partial fitness function. Both the genetic algorithm with the intelligent fitness function and the one with the partial fitness function show a significant improvement over the genetic algorithm with a conventional fitness function.
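    A minimal Python sketch of the memoisation idea behind an Intelligent Fitness Function, for illustration only: the `raw_fitness` evaluator, the cache size, and the time-based eviction policy are assumptions, not the thesis's actual implementation.

```python
# Sketch of an "intelligent" (memoising) fitness function for a GA.
# raw_fitness() stands in for the expensive evaluation described in the
# abstract; names and the eviction policy are illustrative assumptions.
import time

class IntelligentFitness:
    def __init__(self, raw_fitness, max_entries=10_000):
        self.raw_fitness = raw_fitness   # expensive fitness evaluation
        self.memory = {}                 # genotype -> (fitness, last_used)
        self.max_entries = max_entries

    def __call__(self, individual):
        key = tuple(individual)          # genotypes must be hashable
        if key in self.memory:           # duplicate: reuse stored fitness
            fitness, _ = self.memory[key]
            self.memory[key] = (fitness, time.monotonic())
            return fitness
        fitness = self.raw_fitness(individual)
        if len(self.memory) >= self.max_entries:
            # time-based long-term memory: evict the least recently used entry
            oldest = min(self.memory, key=lambda k: self.memory[k][1])
            del self.memory[oldest]
        self.memory[key] = (fitness, time.monotonic())
        return fitness

# Example with a dummy fitness (sum of genes); the second call is served
# from memory instead of re-evaluating the duplicate individual.
fit = IntelligentFitness(lambda ind: sum(ind))
print(fit([1, 0, 1]), fit([1, 0, 1]))
```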

    Fast Parallel Machine Learning Algorithms for Large Datasets Using Graphic Processing Unit

    Get PDF
    This dissertation deals with developing parallel processing algorithms for the Graphic Processing Unit (GPU) in order to solve machine learning problems for large datasets. In particular, it contributes to the development of fast GPU-based algorithms for calculating the distance (i.e. similarity, affinity, closeness) matrix. It also presents the algorithm and implementation of a fast parallel Support Vector Machine (SVM) using the GPU. These application tools are developed using the Compute Unified Device Architecture (CUDA), a popular software framework for General Purpose Computing on GPU (GPGPU). Distance calculation is the core part of all machine learning algorithms because the closer a query is to some samples (i.e. observations, records, entries), the more likely the query belongs to the class of those samples. K-Nearest Neighbors Search (k-NNS) is a popular and powerful distance-based tool for solving classification problems, and it is the prerequisite for training local-model-based classifiers. Fast distance calculation can significantly improve the speed of these classifiers, and GPUs can be very handy for accelerating them. Several GPU-based sorting algorithms are also included to sort the distance matrix and find the k nearest neighbors; their speed varies depending upon the input sequences. The GPUKNN proposed in this dissertation uses the GPU-based distance computation algorithm and automatically picks the most suitable sorting algorithm according to the characteristics of the input datasets. Every machine learning tool has its own pros and cons. The advantage of SVM is its high classification accuracy, which makes it possibly the best classification tool; however, as with many other machine learning algorithms, SVM's training slows down as the size of the input dataset increases. The GPU version of parallel SVM based on parallel Sequential Minimal Optimization (SMO) implemented in this dissertation is proposed to reduce the time cost of both the training and predicting phases. This implementation of GPUSVM is original. It uses many parallel processing techniques to accelerate and minimize kernel evaluations, which are considered the most time-consuming operations in SVM. Although the many-core architecture of the GPU performs best with data-level parallelism, multi-task (i.e. task-level) parallelism is also integrated into the application to improve the speed of tasks such as multiclass classification and cross-validation. Furthermore, the procedure of finding worst violators is distributed to multiple blocks on the CUDA model, which reduces the time cost of each SMO iteration during the training phase. These violators are shared among different tasks in multiclass classification and cross-validation to reduce duplicate kernel computations. The results show that the achieved speedups of both the training and predicting phases range from one to three orders of magnitude compared to the state-of-the-art LIBSVM software on some well-known benchmarking datasets.
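    A small NumPy sketch of the distance-matrix and k-NN step described above; the dissertation's own code runs as CUDA kernels, but the same squared-distance decomposition applies. The function names and the toy data here are illustrative assumptions, not the GPUKNN implementation.

```python
# Distance matrix + k-NN classification, written with NumPy for clarity.
# Uses the ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b decomposition, which is
# also how GPU implementations typically express the computation as GEMM.
import numpy as np

def pairwise_sq_distances(queries, samples):
    # queries: (m, d), samples: (n, d) -> (m, n) squared Euclidean distances
    q_norms = (queries ** 2).sum(axis=1)[:, None]
    s_norms = (samples ** 2).sum(axis=1)[None, :]
    return q_norms + s_norms - 2.0 * queries @ samples.T

def knn_classify(queries, samples, labels, k=5):
    d = pairwise_sq_distances(queries, samples)
    idx = np.argpartition(d, k, axis=1)[:, :k]   # indices of k nearest samples
    nearest = labels[idx]
    # majority vote over the k nearest neighbours of each query
    return np.array([np.bincount(row).argmax() for row in nearest])

# Toy usage with random data and a linearly separable label
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))
y = (X[:, 0] > 0).astype(int)
print(knn_classify(X[:5], X, y, k=3))
```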

    Search-Based Temporal Testing of Multicore Applications

    Get PDF
    Multicore systems are increasingly common as a modern computing platform. Multicore processors not only offer better performance-to-cost ratios relative to single-core processors but also significantly reduce space, weight, and power (SWaP) demands. Unfortunately, they introduce challenges in verification, as their shared components are potential channels for interference. The potential for interference increases the possibility of concurrency faults at runtime and consequently increases the difficulty of verification. In this thesis, search-based techniques are empirically investigated to determine their effectiveness in temporal testing: searching for test inputs that may lead a task running on an embedded multicore to produce extreme (here, the longest) execution times, which might cause the system to violate its temporal requirements. Overall, the findings suggest that various forms of search-based approaches are effective in generating test inputs exhibiting extreme execution times on the embedded multicore environment. All previous work in temporal testing has evolved test data directly; this is not essential. In this thesis, one novel proposed approach, the use of search to discover high-performing biased random sampling regimes (which we call 'dependent input sampling strategies'), has proved particularly effective. Shifting the target of search from the test data itself to strategies proves particularly well motivated for attaining extreme execution times. Finally, we also present preliminary results on the use of so-called 'hyper-heuristics', which can be used to form optimal hybrids of optimisation techniques. An extensive comparison of direct approaches establishes a baseline and is followed by reports of research into indirect approaches and hyper-heuristics. The shift from direct data to strategies can be thought of as a leap in abstraction level for the underlying temporal test data generation problem; the shift to hyper-heuristics aims to boost the level of abstraction of the optimisation technique. The former is more fully worked out than the latter and has proved a significant success. For the latter only preliminary results are available; as will be seen from this work as a whole, the computational requirements for research experimentation are significant.
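    A rough Python sketch of the 'indirect' idea: searching over a biased random sampling strategy rather than over test inputs directly. Here `measure_execution_time` is a hypothetical surrogate for running the task on the multicore target, and simple hill climbing stands in for whichever metaheuristic the thesis actually employs; all names are assumptions.

```python
# Search over a sampling strategy (per-field bias probabilities) instead of
# over concrete test inputs. A strategy is scored by the longest execution
# time produced by the inputs it samples.
import random

def measure_execution_time(test_input):
    # Hypothetical surrogate: real work would execute the task on the target
    # hardware and time it.
    return sum(v * (i + 1) for i, v in enumerate(test_input)) + random.random()

def sample_inputs(bias, n=20):
    # bias[i] is the probability that input field i takes its "heavy" value 1.
    return [[1 if random.random() < p else 0 for p in bias] for _ in range(n)]

def score(bias):
    return max(measure_execution_time(t) for t in sample_inputs(bias))

def hill_climb(n_fields=8, iterations=200, step=0.1):
    bias = [0.5] * n_fields
    best = score(bias)
    for _ in range(iterations):
        candidate = [min(1.0, max(0.0, p + random.uniform(-step, step)))
                     for p in bias]
        s = score(candidate)
        if s > best:            # keep strategies that expose longer times
            bias, best = candidate, s
    return bias, best

print(hill_climb())
```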

    Efficient fault-injection-based assessment of software-implemented hardware fault tolerance

    Get PDF
    With continuously shrinking semiconductor structure sizes and lower supply voltages, the per-device susceptibility to transient and permanent hardware faults is on the rise. A class of countermeasures with growing popularity is Software-Implemented Hardware Fault Tolerance (SIHFT), which avoids expensive hardware mechanisms and can be applied application-specifically. However, SIHFT can, against intuition, cause more harm than good, because its overhead in execution time and memory space also increases the figurative “attack surface” of the system; it turns out that application-specific configuration of SIHFT is in fact a necessity rather than just an advantage. Consequently, target programs need to be analyzed for particularly critical spots to harden. SIHFT-hardened programs need to be measured and compared throughout all development phases of the program to observe reliability improvements or deteriorations over time. Additionally, SIHFT implementations need to be tested. The contributions of this dissertation focus on Fault Injection (FI) as an assessment technique satisfying all these requirements: analysis, measurement and comparison, and test. I describe the design and implementation of an FI tool, named Fail*, that overcomes several shortcomings in the state of the art and enables research on the general drawbacks of simulation-based FI. As demonstrated in four case studies in the context of SIHFT research, Fail* provides novel fine-grained analysis techniques that exploit the newly gained possibility to analyze FI results from complete fault-space exploration. These analysis techniques aid SIHFT design decisions on the level of program modules, functions, variables, source-code lines, or single machine instructions. Based on the experience from the case studies, I address the problem of the large computational effort that accompanies exhaustive fault-space exploration from two different angles: firstly, I develop a heuristic fault-space pruning technique that allows the total FI-experiment count to be freely traded for result accuracy, while still providing information on all possible fault-space coordinates; secondly, I speed up individual TAP-based FI experiments by improving the fast-forwarding operation by several orders of magnitude for most workloads. Finally, I dissect current practices in FI-based evaluation of SIHFT-hardened programs, identify three widespread pitfalls in the result interpretation, and advance the state of the art by defining a novel comparison metric.
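    A generic Monte Carlo illustration of trading FI-experiment count for result accuracy. This is not Fail*'s pruning heuristic; `run_with_fault` and the toy workload are hypothetical stand-ins for a simulator run with a single injected bit flip.

```python
# Sampling-based fault-space pruning sketch: instead of injecting a fault at
# every (instruction, bit) coordinate, inject into a random subset and
# extrapolate the failure rate to the full fault space.
import random

def run_with_fault(workload, instr, bit):
    # Hypothetical stand-in for one simulated run with a bit flip injected at
    # (instr, bit); returns True if the run crashes or silently corrupts data.
    return workload(instr, bit)

def estimate_failure_rate(workload, n_instructions, n_bits, budget):
    space = [(i, b) for i in range(n_instructions) for b in range(n_bits)]
    sampled = random.sample(space, min(budget, len(space)))
    failures = sum(run_with_fault(workload, i, b) for i, b in sampled)
    return failures / len(sampled)   # extrapolated over all coordinates

# Toy workload: faults in the first quarter of instructions are "critical"
toy = lambda instr, bit: instr < 25
print(estimate_failure_rate(toy, n_instructions=100, n_bits=32, budget=500))
```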

    27th Annual European Symposium on Algorithms: ESA 2019, September 9-11, 2019, Munich/Garching, Germany

    Get PDF