
    Working With Incremental Spatial Data During Parallel (GPU) Computation

    Central to many complex systems, spatial actors require an awareness of their local environment to enable behaviours such as communication and navigation. Complex system simulations represent this behaviour with Fixed Radius Near Neighbours (FRNN) search. This algorithm allows actors to store data at spatial locations and then query the data structure to find all data stored within a fixed radius of the search origin. The work within this thesis answers the question: what techniques can be used to improve the performance of FRNN searches during complex system simulations on Graphics Processing Units (GPUs)? It is generally agreed that Uniform Spatial Partitioning (USP) is the most suitable data structure for providing FRNN search on GPUs. However, due to the architectural complexities of GPUs, performance is constrained such that FRNN search remains one of the most expensive stages common to complex system models. Existing innovations to USP highlight the need to take advantage of recent GPU advances, with reduced divergence and fewer redundant memory accesses as viable routes to improved FRNN search performance. This thesis addresses these with three separate optimisations that can be used simultaneously. Experiments have assessed the impact of the optimisations on the general case of FRNN search found within complex system simulations and demonstrated their impact in practice when applied to full complex system models. The results presented show that the performance of the construction and query stages of FRNN search can be improved by over 2x and 1.3x respectively. These improvements allow complex system simulations to be executed faster, enabling increases in scale and model complexity.
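    A minimal CPU sketch of the underlying idea may help: points are binned into a uniform grid whose cell width equals the search radius, so a query only inspects the 3x3 block of cells around the origin. This illustrates USP-based FRNN in general, not the thesis's GPU implementation; all names are illustrative.

        # Minimal CPU sketch of FRNN search via Uniform Spatial Partitioning.
        # Illustrative only; the thesis targets GPU implementations of this scheme.
        from collections import defaultdict
        from math import floor, hypot

        def build_grid(points, radius):
            """Bin 2-D points into square cells whose side equals the radius."""
            grid = defaultdict(list)
            for p in points:
                cell = (floor(p[0] / radius), floor(p[1] / radius))
                grid[cell].append(p)
            return grid

        def frnn_query(grid, origin, radius):
            """Return all points within `radius` of `origin`.

            Because cell side == radius, candidates can only live in the
            3x3 block of cells surrounding the origin's cell.
            """
            cx, cy = floor(origin[0] / radius), floor(origin[1] / radius)
            hits = []
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    for p in grid.get((cx + dx, cy + dy), ()):
                        if hypot(p[0] - origin[0], p[1] - origin[1]) <= radius:
                            hits.append(p)
            return hits

        points = [(0.1, 0.2), (0.9, 0.9), (0.15, 0.25), (3.0, 3.0)]
        grid = build_grid(points, radius=0.5)
        print(frnn_query(grid, origin=(0.12, 0.22), radius=0.5))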

    Scalable multimedia indexing and similarity search in high dimensionality

    Master's dissertation (Ciência da Computação), Universidade Estadual de Campinas, Instituto de Computação. Advisor: Ricardo da Silva Torres. The spread of large collections of images, videos and music has increased the demand for indexing methods and multimedia information retrieval systems. For images, the most promising search engines are content-based: instead of using textual annotations, they use feature vectors to represent visual properties such as color, texture, and shape. The matching of the feature vectors of a query image against those of database images is implemented by similarity search. Its most common form is the k nearest neighbors search, which aims to find the k closest vectors to the query vector. In large image databases, an index structure is essential to speed up those queries. The problem is that the feature vectors may have many dimensions, which seriously affects the performance of indexing methods. For more than 10 dimensions, it is often necessary to use approximate methods that trade off effectiveness for speed. Among the several solutions proposed, there is an approach based on fractal curves known as space-filling curves. Those curves map a multidimensional space onto a single dimension, so that points near each other on the curve correspond to points near each other in the space. The great problem with that alternative is the existence of discontinuity regions on the curves: points near those regions are not mapped near each other on the curve. The main contribution of this dissertation is an indexing method for high-dimensional feature vectors that uses a single space-filling curve and multiple surrogates for each data point. That method, called MONORAIL, generates the surrogates by exploiting the geometric properties of the curve. The result is a gain in the effectiveness of similarity search when compared to the baseline method. Another non-trivial contribution of this work is the rigorous experimental design used for the comparisons: the experiments were carefully designed to ensure statistically sound results. The scalability of MONORAIL is tested with three databases of different sizes, the largest one with more than 130 million vectors.
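    For illustration, the sketch below indexes 2-D points with a Z-order (Morton) space-filling curve and inserts each point several times under small coordinate offsets, so that a point falling near a discontinuity of the curve is still found near the query under some surrogate. This is an assumed, simplified stand-in: MONORAIL's curve and its geometry-driven surrogate generation are not reproduced here.

        # Illustrative approximate k-NN via a Z-order (Morton) space-filling
        # curve with multiple surrogates per point. MONORAIL's actual curve and
        # surrogate scheme differ; the offset-based surrogates are a stand-in.
        from bisect import bisect_left, insort

        def morton_key(x, y, bits=16):
            """Interleave the bits of two integer coordinates into one key."""
            key = 0
            for i in range(bits):
                key |= ((x >> i) & 1) << (2 * i) | ((y >> i) & 1) << (2 * i + 1)
            return key

        class CurveIndex:
            def __init__(self, offsets=((0, 0), (3, 0), (0, 3))):
                self.entries = []          # sorted (key, point) pairs
                self.offsets = offsets     # each offset yields one surrogate

            def add(self, p):
                for dx, dy in self.offsets:
                    insort(self.entries, (morton_key(p[0] + dx, p[1] + dy), p))

            def knn(self, q, k, window=8):
                """Scan `window` entries around the query key per surrogate."""
                cands = set()
                for dx, dy in self.offsets:
                    i = bisect_left(self.entries,
                                    (morton_key(q[0] + dx, q[1] + dy),))
                    for _, p in self.entries[max(0, i - window):i + window]:
                        cands.add(p)
                dist = lambda p: (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
                return sorted(cands, key=dist)[:k]

        idx = CurveIndex()
        for p in [(5, 5), (6, 4), (40, 40), (7, 30)]:
            idx.add(p)
        print(idx.knn((6, 6), k=2))   # -> [(5, 5), (6, 4)]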

    Using group role assignment to solve Dynamic Vehicle Routing Problem

    The Dynamic Vehicle Routing Problem (DVRP) is a more complex problem than the traditional Vehicle Routing Problem (VRP) in the combinatorial optimisation of operations research. With more degrees of freedom, DVRP introduces new challenges when judging the merit of a given route plan. This thesis utilises a time-slice strategy to solve dynamic and deterministic routing problems. Based on Group Role Assignment (GRA) and two different routing methods (Modified Insertion heuristic routing and Modified Composite Pairing Or-opt routing), a new ridesharing system has been designed to provide services in the real world. Simulation results are presented in this thesis, and a qualitative comparison has been made to outline the advantages and performance of our solution framework. The numerical results show that the proposed method has great potential to be put into operation in the real world and provides a new transit option for the public. Master of Science (MSc) thesis in Computational Science.
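    As a rough illustration of the time-slice strategy (not of the thesis's GRA-based system), the sketch below freezes the requests revealed in each slice and inserts them into the route with a cheapest-insertion heuristic; the metric and data are made up.

        # Minimal sketch of a time-slice strategy for dynamic vehicle routing:
        # requests arriving within a slice are frozen and inserted into the
        # route by a cheapest-insertion heuristic. The thesis pairs the idea
        # with Group Role Assignment; that assignment step is not shown here.

        def dist(a, b):
            return abs(a[0] - b[0]) + abs(a[1] - b[1])  # Manhattan metric

        def route_cost(route):
            return sum(dist(route[i], route[i + 1])
                       for i in range(len(route) - 1))

        def cheapest_insertion(route, stop):
            """Try every insertion position and keep the cheapest one."""
            best = None
            for i in range(1, len(route) + 1):
                cand = route[:i] + [stop] + route[i:]
                if best is None or route_cost(cand) < route_cost(best):
                    best = cand
            return best

        def run_time_slices(depot, slices):
            """`slices` is a list of request batches, one per time slice."""
            route = [depot]
            for batch in slices:
                for stop in batch:   # requests revealed during this slice
                    route = cheapest_insertion(route, stop)
                print("after slice:", route, "cost", route_cost(route))
            return route

        run_time_slices(depot=(0, 0),
                        slices=[[(2, 1), (5, 5)], [(1, 4)], [(6, 2), (3, 3)]])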

    Traveling Salesman Problem

    The idea behind the TSP was conceived by the Austrian mathematician Karl Menger in the mid-1930s, who invited the research community to consider a problem from everyday life from a mathematical point of view. A travelling salesman has to visit each one of a list of m cities exactly once and then return to the home city. He knows the cost of travelling from any city i to any other city j. Which tour of least possible cost can the salesman take? This book considers the problem of finding algorithmic techniques that lead to good or optimal solutions for the TSP (or for some closely related problems). The TSP is a very attractive problem for the research community because it arises as a natural subproblem in many applications concerning everyday life. Indeed, any application in which an optimal ordering of a number of items has to be chosen, such that the total cost of a solution is determined by adding up the costs arising from pairs of successive items, can be modelled as a TSP instance. Thus, studying the TSP can never be considered abstract research with no real importance.
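    A classic construction heuristic makes the "good versus optimal" distinction concrete; the nearest-neighbour sketch below is standard textbook material, not an algorithm taken from this book.

        # Nearest-neighbour TSP heuristic: from the home city, repeatedly
        # travel to the closest unvisited city, then return home. Fast but
        # generally suboptimal -- a "good", not optimal, tour.
        from math import hypot

        def nearest_neighbour_tour(cities, home=0):
            unvisited = set(range(len(cities))) - {home}
            tour, cur = [home], home
            while unvisited:
                cur = min(unvisited,
                          key=lambda j: hypot(cities[j][0] - cities[cur][0],
                                              cities[j][1] - cities[cur][1]))
                tour.append(cur)
                unvisited.remove(cur)
            tour.append(home)  # return to the home city
            return tour

        cities = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 5)]
        print(nearest_neighbour_tour(cities))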

    Path planning algorithms for atmospheric science applications of autonomous aircraft systems

    Among current techniques used to assist the modelling of atmospheric processes is an approach involving the balloon or aircraft launching of radiosondes, which travel along uncontrolled trajectories dependent on wind speed. Radiosondes are launched daily from numerous worldwide locations and the data collected is integral to numerical weather prediction. This thesis proposes an unmanned air system for atmospheric research, consisting of multiple, balloon-launched, autonomous gliders. The trajectories of the gliders are optimised for the uniform sampling of a volume of airspace and the efficient mapping of a particular physical or chemical measure. To accomplish this we have developed a series of algorithms for path planning, driven by the dual objectives of uncertainty and information gain. Algorithms for centralised, discrete path planning, a centralised, continuous planner and finally a decentralised, real-time, asynchronous planner are presented. The continuous heuristics search a look-up table of plausible manoeuvres generated by way of an offline flight dynamics model, ensuring that the optimised trajectories are flyable. Further to this, a greedy heuristic for path growth is introduced alongside a control for search coarseness, establishing a sliding control over the level of allowed global exploration, local exploitation and computational complexity. The algorithm is also integrated with a flight dynamics model, and with communications and flight systems hardware, enabling software- and hardware-in-the-loop simulations. The algorithm outperforms random search in two and three dimensions. We also assess the applicability of the unmanned air system in ‘real’ environments, accounting for the presence of complicated flow fields and boundaries. A case study based on the island of South Georgia is presented and indicates good algorithm performance in strong, variable winds. We also examine the impact of co-operation within this multi-agent system of decentralised, unmanned gliders, investigating the threshold for communication range which allows for optimal search whilst reducing both the cost of individual communication devices and the computational resources associated with the processing of data received by each aircraft. Reductions in communication radius are found to have a significant, negative impact upon the resulting efficiency of the system. To somewhat recover these losses, we utilise a sorting algorithm, determining information priority between any two aircraft in range. Furthermore, negotiation between aircraft is introduced, allowing aircraft to resolve any possible conflicts between selected paths, which helps to counteract any latency in the search heuristic.
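    The sketch below is a toy version of greedy path growth over a manoeuvre look-up table: at each step the planner scores a few precomputed, flyable manoeuvres by a simple uncertainty-driven information gain, and a coarseness parameter thins the candidate set. The objective, manoeuvre table and parameters are assumptions, not the thesis's planner.

        # Toy greedy path growth over a manoeuvre look-up table. The
        # `coarseness` parameter thins the candidate set, trading search
        # quality for speed -- a crude stand-in for the thesis's sliding
        # exploration/exploitation control.
        import math

        # (heading change in radians, forward distance) pairs from a
        # hypothetical offline flight dynamics model; all are flyable.
        MANOEUVRES = [(-0.6, 1.0), (-0.3, 1.2), (0.0, 1.5),
                      (0.3, 1.2), (0.6, 1.0)]

        def info_gain(cell, visit_counts):
            """Toy objective: unvisited grid cells are worth more."""
            return 1.0 / (1.0 + visit_counts.get(cell, 0))

        def grow_path(start, heading, steps, coarseness=1):
            visit_counts, path = {}, [start]
            x, y = start
            for _ in range(steps):
                candidates = MANOEUVRES[::coarseness]  # coarser -> fewer options
                best = max(candidates, key=lambda m: info_gain(
                    (round(x + m[1] * math.cos(heading + m[0])),
                     round(y + m[1] * math.sin(heading + m[0]))), visit_counts))
                heading += best[0]
                x += best[1] * math.cos(heading)
                y += best[1] * math.sin(heading)
                cell = (round(x), round(y))
                visit_counts[cell] = visit_counts.get(cell, 0) + 1
                path.append((round(x, 2), round(y, 2)))
            return path

        print(grow_path(start=(0.0, 0.0), heading=0.0, steps=5, coarseness=2))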

    Order picking optimisation on a unidirectional cyclical picking line

    Thesis (PhD), Stellenbosch University, 2020. The order picking system in a company's distribution centre is the biggest contributor to the operational cost within the DC. Optimisation should thus aim at running this activity as efficiently as possible. The order picking process consists of three main activities, namely walking to the stock, picking stock in fulfilment of a customer order and handling the picked stock for further processing. While the total amount of work for the picking and handling activities remains constant, the minimisation of walking distance becomes the main objective when minimising the total picking effort. The minimisation of walking distance translates into a reduced overall picking time, which can lead to a decrease in the total cost of operating the picking system. The main objective of this dissertation is to optimise the order picking system on a unidirectional cyclical picking line. Order batching is introduced to the picking system, since the operations research literature shows it to be an effective methodology for minimising walking distance. Order batching has been applied to the standard single-block parallel-aisle warehouse layout, but not to the specific layout of a unidirectional cyclical picking line. Additionally, the unidirectional cyclical picking line offers two configuration options that change the physical set-up and thereby influence the way in which pickers walk during the order picking process. Order batching is introduced to the unidirectional cyclical picking line through picking-location-based order-to-route closeness metrics. These metrics are further extended by taking the characteristics of the layout into account. The distribution centre of a prominent South African retailer provides real-life test instances. Introducing the layout-specific stops non-identical spans metric in combination with the greedy smallest entry heuristic results in a reduction of 48.3% in walking distance. Order batching increases the pick density, which may lead to higher levels of picker congestion. A discrete event simulation thus confirms that the decrease in walking distance reduces the overall picking time. On tested sample picking waves, the overall picking time can be reduced by up to 21% per wave. A good number of pickers in the picking system depends on the pick density. The pick density, amongst other explanatory variables, can also be used to predict the reduction in picking time. The effects of different structural options of the unidirectional cyclical picking line, namely the U- and Z-configuration, are investigated. This results in four decision tiers that have to be addressed while optimising the order picking system. The first decision tier assigns stock to picking lines, the second arranges stock around a picking line, the third chooses the configuration and the last sequences the orders to be picked. Order batching is added as an additional layer. An increase in pick density benefits the reduction of walking distance throughout the decision tiers and supports the choice of the U-configuration after evaluating different test instances. The total completion time of a picking wave can thus be reduced by up to 28% when compared to benchmark instances. The dissertation concludes by suggesting further research directions.
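    As a rough illustration of batching by closeness (not the dissertation's stops non-identical spans metric or its greedy smallest entry heuristic), the sketch below greedily merges orders whose sets of picking locations differ least, subject to a capacity limit.

        # Toy greedy order batching for a picking line: each order is merged
        # into the closest existing batch with room, where "closeness" counts
        # stops in one order but not the other. An illustrative stand-in for
        # the dissertation's metrics and heuristic.

        def closeness(a, b):
            """Smaller is closer: stops not shared by the two orders."""
            return len(set(a) ^ set(b))

        def greedy_batch(orders, capacity=6):
            """Assign each order to the closest existing batch with room."""
            batches = []
            for order in sorted(orders, key=len, reverse=True):
                best = None
                for b in batches:
                    if len(set(b) | set(order)) <= capacity and (
                            best is None
                            or closeness(b, order) < closeness(best, order)):
                        best = b
                if best is None:
                    batches.append(list(order))  # open a new batch
                else:
                    best.extend(s for s in order if s not in best)
            return batches

        # stops are picking-location indices along the cyclical line
        orders = [[1, 2, 3], [2, 3, 4], [8, 9], [9, 10, 11], [1, 4]]
        print(greedy_batch(orders))  # -> [[1, 2, 3, 4], [9, 10, 11, 8]]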

    Efficient Domain Partitioning for Stencil-based Parallel Operators

    Partial Differential Equations (PDEs) are used ubiquitously in modelling natural phenomena. It is generally not possible to obtain an analytical solution, and hence they are commonly discretized using schemes such as the Finite Difference Method (FDM) and the Finite Element Method (FEM), converting the continuous PDE into a discrete system of sparse algebraic equations. The solution of this system can be approximated using iterative methods, which are better suited to many sparse systems than direct methods. In this thesis we use the FDM to discretize linear, second-order, elliptic PDEs and consider parallel implementations of standard iterative solvers. The dominant paradigm in this field is distributed-memory parallelism, which requires the FDM grid to be partitioned across the available computational cores. The orthodox approach to domain partitioning aims to minimize only the communication volume and achieve perfect load balance on each core. In this work, we re-examine and challenge this traditional method of domain partitioning and show that for well load-balanced problems, minimizing only the communication volume is insufficient for obtaining optimal domain partitions. To this end we create a high-level, quasi-cache-aware mathematical model that quantifies cache misses at the sub-domain level and minimizes them to obtain families of high-performing domain decompositions. To our knowledge this is the first work that optimizes domain partitioning by analyzing cache misses, establishing a relationship between cache misses and domain partitioning. To place our model in its true context, we identify and qualitatively examine multiple other factors, such as the Least Recently Used policy, cache line utilization and vectorization, that influence the choice of optimal sub-domain dimensions. Since the convergence rate of point iterative methods, such as Jacobi, on uniform meshes is not acceptable at high mesh resolution, we extend the model to Parallel Geometric Multigrid (GMG). GMG is a multilevel, iterative, optimal algorithm for numerically solving elliptic PDEs. Adaptive Mesh Refinement (AMR) is another multilevel technique that allows local refinement of a global mesh based on parameters such as error estimates or geometric importance. We study a massively parallel, multiphysics, multi-resolution AMR framework called BoxLib, and implement and discuss our model on single-level and adaptively refined meshes, respectively. We conclude that “close to 2-D” partitions are optimal for stencil-based codes on structured 3-D domains and that it is necessary to optimize for both minimal cache misses and minimal communication.
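    The sketch below conveys the flavour of the argument: every factorisation of the cores is scored by halo-exchange volume plus a crude cache-spill penalty, and the winner is no longer the cubic decomposition that communication volume alone would pick. The cache model and constants are illustrative assumptions, far simpler than the thesis's quasi-cache-aware model.

        # Toy scorer for 3-D domain decompositions: every factorisation of P
        # cores into (px, py, pz) is rated by halo-exchange volume plus a
        # crude cache model (a 7-point stencil sweep keeps ~3 planes of
        # sx*sy points live and suffers extra misses once they spill out of
        # cache). Constants are illustrative assumptions only.
        from itertools import product

        CACHE_WORDS = 32768  # assumed per-core cache capacity, in grid points

        def decompositions(p):
            for px, py in product(range(1, p + 1), repeat=2):
                if p % (px * py) == 0:
                    yield px, py, p // (px * py)

        def score(n, px, py, pz):
            sx, sy, sz = n // px, n // py, n // pz       # sub-domain dims
            halo = 2 * (sx * sy + sy * sz + sx * sz)     # faces exchanged
            spill = 3 * sx * sy > CACHE_WORDS            # working set spills?
            misses = sx * sy * sz * (4 if spill else 1)  # crude miss count
            return halo + misses

        best = min(decompositions(64), key=lambda d: score(512, *d))
        print("preferred decomposition of 64 cores on a 512^3 grid:", best)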

    Development of tomographic reconstruction methods in materials science with focus on advanced scanning methods


    Algorithms for Game-Theoretic Environments

    Game Theory constitutes an appropriate way of approaching the Internet and modelling situations in which participants interact with each other, such as networking, online auctions and search engines' page ranking. Mechanism Design deals with the design of private-information games and attempts to implement desired social choices in a strategic setting. This thesis studies how the efficiency of a system degrades due to the selfish behaviour of its agents, expressed in terms of the Price of Anarchy (PoA). Our objective is to design mechanisms with improved PoA, or to determine the exact value of the PoA for existing mechanisms, for two well-known problems: Auctions and Network Cost-Sharing Design. We study three different auction settings: combinatorial auctions, multi-unit auctions and bandwidth allocation. The combinatorial auction constitutes a fundamental resource allocation problem that involves the interaction of selfish agents in competition for indivisible goods. Although it is well known that under the VCG mechanism the selfishness of the agents does not affect the efficiency of the system, i.e. the social welfare is maximised, this mechanism generally cannot be run in computationally tractable time. In practice, several simple auctions (lacking some of the nice properties of VCG) are used, such as the generalised second-price auction on AdWords, the simultaneous ascending-price auction for spectrum allocation, and the independent second-price auction on eBay. The latter auction is of particular interest in this thesis. Precisely, we give tight bounds on the PoA when the goods are sold in independent and simultaneous first-price auctions, where the highest bidder gets the item and pays her own bid. We then generalise our results to a class of auctions that we call bid-dependent auctions, where the goods are also sold in independent and simultaneous auctions and, further, the payment of each bidder is a function of her bid, even if she does not get the item. Overall, we show that the first-price auction is optimal among all bid-dependent auctions. The multi-unit auction is a special case of the combinatorial auction where all items are identical. There are many variations: the discriminatory auction, the uniform-price auction and the Vickrey multi-unit auction. In all those auctions the goods are allocated to the highest marginal bids, and their difference lies in the pricing scheme. Our focus is on the discriminatory auction, which can be seen as the variant of the first-price auction adjusted to multi-unit auctions. Bandwidth allocation is equivalent to auctioning divisible resources. Allocating network resources, like bandwidth, among agents is a canonical problem in the network optimisation literature. A traditional model for this problem was proposed by Kelly [1997], where each agent receives a fraction of the resource proportional to her bid and pays her own bid. We complement the PoA bounds known in the literature and give tight bounds for a more general case. We further show that this mechanism is optimal among a wider class of mechanisms.
    We further study design issues for network games: given a rooted undirected graph with nonnegative edge costs, a set of players with terminal vertices need to establish connectivity with the root. Each player selects a path, and the global objective is to minimise the cost of the used edges. The cost of an edge may represent infrastructure cost for establishing connectivity or a renting expense, and needs to be covered by the users. There are several ways to split an edge's cost among its users, and this is dictated by a cost-sharing protocol. Naturally, it is in the players' best interest to choose paths that charge them with small cost. The seminal work of Chen et al. [2010] was the first to address design questions for this game. They thoroughly studied the PoA under the following informational assumptions: i) the designer has full knowledge of the instance, that is, she knows both the network topology and the players' terminals; ii) the designer has no knowledge of the underlying graph. Arguably, there are situations where the former assumption is too optimistic while the latter is too pessimistic. We propose a model that lies in the middle ground: the designer has prior knowledge of the underlying metric, but knows nothing about the positions of the terminals. Her goal is to process the graph and choose a universal cost-sharing protocol that has low PoA against all possible requested subsets. The main question is to what extent prior knowledge of the underlying metric can help in the design. We first demonstrate that there exist graph metrics where knowledge of the underlying metric can dramatically improve the performance of good network cost-sharing design. However, in our main technical result, we show that there exist graph metrics for which knowing the underlying metric does not help, and any universal protocol matches the bound of Chen et al. [2010], which ignores the graph metric. We further study stochastic and Bayesian games where the players choose their terminals according to a probability distribution. We show that in the stochastic setting there exists a priority protocol that achieves constant PoA, whereas the PoA under the Bayesian setting can be very high for any cost-sharing protocol satisfying some natural properties.
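    Kelly's proportional-share mechanism is simple enough to sketch: each agent bids, receives the fraction bid/(total bids) of the resource, and pays her bid. The best-response iteration below approximates a Nash equilibrium for linear valuations and compares its welfare with the optimum; the valuations are made up, and the closed-form best response is a standard calculation, not code from the thesis.

        # Sketch of Kelly's proportional-share mechanism for one divisible
        # resource. Best-response dynamics approximate a Nash equilibrium for
        # linear valuations v_i * x_i; welfare is then compared with the
        # optimum (give everything to the highest-value agent).

        def allocation(bids):
            total = sum(bids)
            return [b / total if total > 0 else 0.0 for b in bids]

        def best_response(v, others):
            """Maximise v * b/(b + others) - b over b >= 0 (closed form:
            setting the derivative v*others/(b+others)^2 - 1 to zero)."""
            return max((v * others) ** 0.5 - others, 0.0)

        values = [1.0, 0.8, 0.5]       # private marginal values (assumed)
        bids = [0.1] * len(values)
        for _ in range(100):           # iterate best responses to a fixed point
            for i, v in enumerate(values):
                others = sum(bids) - bids[i]
                bids[i] = best_response(v, others)

        x = allocation(bids)
        welfare = sum(v * xi for v, xi in zip(values, x))
        print("equilibrium bids:", [round(b, 3) for b in bids])
        print("welfare ratio vs optimum:", round(welfare / max(values), 3))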