22 research outputs found

    Multiobjective hypergraph-partitioning algorithms for cut and maximum subdomain-degree minimization


    Efficient Wave-based Sound Propagation and Optimization for Computer-Aided Design

    Acoustic phenomena have a large impact on our everyday lives, from influencing our enjoyment of music in a concert hall, to affecting our concentration at school or work, to potentially harming our health through deafening noises. The sound that reaches our ears is absorbed, reflected, and filtered by the shape, topology, and materials present in the environment. However, many computer simulation techniques for solving these sound propagation problems are either computationally expensive or inaccurate. Additionally, the costs of some methods increase dramatically in design optimization processes that require several iterations of sound propagation evaluation. The primary goal of this dissertation is to present techniques for efficiently solving the sound propagation problem and related optimization problems for computer-aided design. First, we propose a parallel method for solving large acoustic propagation problems, scalable to tens of thousands of cores. Second, we present two novel techniques for optimizing certain acoustic characteristics, such as reverberation time or sound clarity, using wave-based sound propagation. Finally, we show how hybrid sound propagation algorithms can be used to improve the performance of acoustic optimization problems and present two algorithms for noise minimization and speech intelligibility improvement that use this hybrid approach. All the algorithms we present are evaluated on various benchmarks that are computer models of architectural scenes. These benchmarks include challenging environments for existing sound propagation algorithms, such as large indoor or outdoor scenes, structurally complex scenes, or scenes in which difficult-to-model sound propagation phenomena are prevalent. Using the techniques put forth in this dissertation, we can efficiently solve many challenging sound propagation and optimization problems on these scenes.
We are able to accurately model sound propagation using wave-based approaches up to 10 kHz (the full range of human speech) and for the full range of human hearing (22 kHz) using our hybrid approach. Our noise minimization methods show improvements of up to 13 dB in noise reduction on some scenes, and we show a 71% improvement in speech intelligibility using our algorithm. Doctor of Philosophy

    Partitioning sparse matrices for parallel preconditioned iterative methods

    This paper addresses the parallelization of the preconditioned iterative methods that use explicit preconditioners such as approximate inverses. Parallelizing a full step of these methods requires the coefficient and preconditioner matrices to be well partitioned. We first show that different methods impose different partitioning requirements for the matrices. Then we develop hypergraph models to meet those requirements. In particular, we develop models that enable us to obtain partitionings on the coefficient and preconditioner matrices simultaneously. Experiments on a set of unsymmetric sparse matrices show that the proposed models yield effective partitioning results. A parallel implementation of the right preconditioned BiCGStab method on a PC cluster verifies that the theoretical gains obtained by the models hold in practice. © 2007 Society for Industrial and Applied Mathematics

    MINIMIZATION OF RESOURCE CONSUMPTION THROUGH WORKLOAD CONSOLIDATION IN LARGE-SCALE DISTRIBUTED DATA PLATFORMS

    The rapid increase in the data volumes encountered in many application domains has led to widespread adoption of parallel and distributed data management systems like parallel databases and MapReduce-based frameworks (e.g., Hadoop) in recent years. Use of such parallel and distributed frameworks is expected to accelerate in the coming years, putting further strain on already-scarce resources like compute power, network bandwidth, and energy. To reduce total execution times, there is a trend towards increasing execution parallelism by spreading out data across a large number of machines. However, this often significantly increases the total resource consumption, and especially energy consumption, because of process startup costs and other overheads (e.g., communication overheads). In this dissertation, we develop several data management techniques to minimize resource consumption through workload consolidation. We introduce a key metric called query span, i.e., the number of machines involved in the execution of a query or a job. To minimize the per-query resource consumption, we propose to minimize query span. To that end, we develop several workload-driven data partitioning and replica selection algorithms that attempt to minimize the average query span by exploiting the fact that most distributed environments need to use replication for fault tolerance. Extensive experiments on various datasets show that judicious data placement and replication can dramatically reduce the average query span, resulting in significant reductions in resource consumption. We show our results primarily on two applications: a distributed data warehouse system and distributed information retrieval. In the first case, we show that minimizing average query spans can minimize overall resource consumption for a given workload and can also improve the performance of complex analytical queries.
In the second case, our approach minimizes the overall search cost and effectively trades off search cost against load imbalance. The best case of resource efficiency for any underlying data processing system is achieved when the job or the query can be run efficiently on a single machine (i.e., query span = 1). In the final part of the dissertation, we discuss an in-memory MapReduce system optimized for performing complex analytics tasks on input data sizes that fit in a single machine's memory. We argue that systems like Hadoop that are designed to operate across a large number of machines are not optimal in performance for small and medium-sized complex analytics tasks because of high startup costs, heavy disk activity, and wasteful checkpointing. We have developed a prototype runtime called HONE that is API compatible with standard (distributed) Hadoop. In other words, we can take existing Hadoop code and run it, without modification, on a multi-core shared-memory machine. This allows us to take existing Hadoop algorithms and find the most suitable runtime environment for execution on datasets of varying sizes. Overall, the key contributions of this dissertation include the identification of the key metric query span and its relationship with overall resource consumption in scale-out architectures. We introduce several workload-aware techniques to optimize this metric and demonstrate the effectiveness of query span minimization in different application scenarios. To take advantage of scale-up architectures effectively, we develop HONE, a novel in-memory MapReduce system for a single machine. Our thorough experiments on real and synthetic datasets demonstrate the efficacy of our proposed approaches.
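
    The query span metric defined in this abstract can be illustrated with a small sketch. The abstract does not describe the dissertation's actual algorithms, so the code below is only an assumption-laden illustration: it treats replica selection for one query as a set-cover problem and uses a plain greedy heuristic. The names `greedy_query_span`, `average_query_span`, `replicas` and `workload` are hypothetical.

```python
from typing import Dict, Hashable, Iterable, List, Set


def greedy_query_span(items: Iterable[Hashable],
                      replicas: Dict[Hashable, Set[int]]) -> List[int]:
    """Pick machines for one query with a greedy set-cover heuristic.

    `replicas[item]` is the set of machine ids holding a copy of `item`
    (an assumed data layout). The length of the returned list is the
    query span obtained by this heuristic, not a guaranteed minimum.
    """
    uncovered = set(items)
    chosen: List[int] = []
    while uncovered:
        # Machines that hold at least one still-uncovered item.
        candidates = sorted({m for it in uncovered for m in replicas[it]})
        if not candidates:
            raise ValueError("some requested item has no replica")
        # Take the machine covering the most uncovered items
        # (ties go to the lowest machine id because of the sort).
        best = max(candidates,
                   key=lambda m: sum(m in replicas[it] for it in uncovered))
        chosen.append(best)
        uncovered = {it for it in uncovered if best not in replicas[it]}
    return chosen


def average_query_span(workload: Iterable[Iterable[Hashable]],
                       replicas: Dict[Hashable, Set[int]]) -> float:
    """Average number of machines touched per query in the workload."""
    spans = [len(greedy_query_span(q, replicas)) for q in workload]
    return sum(spans) / len(spans)


# Toy placement: four items replicated over four machines.
replicas = {"a": {0, 1}, "b": {1, 2}, "c": {2, 3}, "d": {3}}
workload = [["a", "b"], ["a", "b", "c"], ["d"]]
print(average_query_span(workload, replicas))
```

    In this toy placement the first query can be served by machine 1 alone, which is the span = 1 case the abstract describes as the most resource-efficient.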

    High-Quality Hypergraph Partitioning

    This dissertation focuses on computing high-quality solutions for the NP-hard balanced hypergraph partitioning problem: Given a hypergraph and an integer k, partition its vertex set into k disjoint blocks of bounded size, while minimizing an objective function over the hyperedges. Here, we consider the two most commonly used objectives: the cut-net metric and the connectivity metric. Since the problem is computationally intractable, heuristics are used in practice - the most prominent being the three-phase multi-level paradigm: During coarsening, the hypergraph is successively contracted to obtain a hierarchy of smaller instances. After applying an initial partitioning algorithm to the smallest hypergraph, contraction is undone and, at each level, refinement algorithms try to improve the current solution. With this work, we give a brief overview of the field and present several algorithmic improvements to the multi-level paradigm. Instead of using a logarithmic number of levels like traditional algorithms, we present two coarsening algorithms that create a hierarchy of (nearly) n levels, where n is the number of vertices. This makes consecutive levels as similar as possible and provides many opportunities for refinement algorithms to improve the partition. This approach is made feasible in practice by tailoring all algorithms and data structures to the n-level paradigm, and developing lazy-evaluation techniques, caching mechanisms and early stopping criteria to speed up the partitioning process. Furthermore, we propose a sparsification algorithm based on locality-sensitive hashing that improves the running time for hypergraphs with large hyperedges, and show that incorporating global information about the community structure into the coarsening process improves quality. Moreover, we present a portfolio-based initial partitioning approach, and propose three refinement algorithms.
Two are based on the Fiduccia-Mattheyses (FM) heuristic, but perform a highly localized search at each level. While one is designed for two-way partitioning, the other is the first FM-style algorithm that can be efficiently employed in the multi-level setting to directly improve k-way partitions. The third algorithm uses max-flow computations on pairs of blocks to refine k-way partitions. Finally, we present the first memetic multi-level hypergraph partitioning algorithm for an extensive exploration of the global solution space. All contributions are made available through our open-source framework KaHyPar. In a comprehensive experimental study, we compare KaHyPar with hMETIS, PaToH, Mondriaan, Zoltan-AlgD, and HYPE on a wide range of hypergraphs from several application areas. Our results indicate that KaHyPar, already without the memetic component, computes better solutions than all competing algorithms for both the cut-net and the connectivity metric, while being faster than Zoltan-AlgD and equally fast as hMETIS. Moreover, KaHyPar compares favorably with the current best graph partitioning system KaFFPa - both in terms of solution quality and running time
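
    The two objectives named in this abstract can be made concrete with a short sketch. It uses the standard textbook definitions (cut-net: total weight of hyperedges spanning more than one block; connectivity: each hyperedge contributes its weight times the number of blocks it touches minus one). It is not KaHyPar code, and the function names and the list/dict data layout are assumptions chosen for illustration.

```python
from typing import Dict, List, Optional, Sequence


def blocks_touched(edge: Sequence[int], block_of: Dict[int, int]) -> int:
    """Number of distinct blocks that contain a vertex of this hyperedge."""
    return len({block_of[v] for v in edge})


def cut_net(hyperedges: List[Sequence[int]],
            block_of: Dict[int, int],
            weights: Optional[List[int]] = None) -> int:
    """Cut-net metric: sum of weights of hyperedges touching > 1 block."""
    weights = weights or [1] * len(hyperedges)
    return sum(w for e, w in zip(hyperedges, weights)
               if blocks_touched(e, block_of) > 1)


def connectivity(hyperedges: List[Sequence[int]],
                 block_of: Dict[int, int],
                 weights: Optional[List[int]] = None) -> int:
    """Connectivity metric: sum of (blocks touched - 1) * weight."""
    weights = weights or [1] * len(hyperedges)
    return sum((blocks_touched(e, block_of) - 1) * w
               for e, w in zip(hyperedges, weights))


# Toy hypergraph with 6 vertices, 5 unit-weight hyperedges, k = 3 blocks.
hyperedges = [[0, 1, 2], [2, 3], [3, 4, 5], [0, 5], [1, 4, 5]]
block_of = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2}

print(cut_net(hyperedges, block_of))       # 4: only [0, 1, 2] is uncut
print(connectivity(hyperedges, block_of))  # 5: [1, 4, 5] touches 3 blocks
```

    The two metrics coincide for two-way partitions and differ only when some hyperedge spans three or more blocks, as the last hyperedge does here.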

    New strategies for the aerodynamic design optimization of aeronautical configurations through soft-computing techniques

    Extraordinary Doctorate Award of the UAH, 2013. Lozano Rodríguez, Carlos, co-director. This thesis deals with the improvement of the optimization process in the aerodynamic design of aeronautical configurations. Nowadays, this topic is of great importance in allowing the European aeronautical industry to reduce its development and operational costs, decrease the time-to-market for new aircraft, improve the quality of its products and therefore maintain its competitiveness. Within this thesis, a study of the state of the art of aerodynamic optimization tools has been performed, and several contributions have been proposed at different levels:
    - One of the main drawbacks for an industrial application of aerodynamic optimization tools is the huge requirement of computational resources. In particular, for complex optimization problems, current methodological approaches would need more than a year to obtain an optimized aircraft. For this reason, one proposed contribution of this work focuses on reducing the computational cost through techniques such as surrogate modelling and control theory, as well as more software-related techniques such as code optimization and proper domain parallelization, all with the goal of decreasing the cost of the aerodynamic design process.
    - Another contribution is related to the consideration of the design process as a global optimization problem and, more specifically, the use of evolutionary algorithms (EAs) to perform a preliminary broad exploration of the design space, due to their ability to obtain global optima. In this regard, EAs have been hybridized with metamodels (or surrogate models) in order to replace expensive CFD simulations. In this thesis, an innovative approach for the global aerodynamic optimization of aeronautical configurations is proposed, consisting of an Evolutionary Programming algorithm hybridized with a Support Vector regression algorithm (SVMr) as a metamodel. Specific issues such as precision, training dataset size, geometry parameterization sensitivity and techniques for design of experiments are discussed, and the potential of the proposed approach to achieve innovative shapes that would not be achieved with traditional methods is assessed.
    - Then, after a broad exploration of the design space, the optimization process is continued with local gradient-based optimization techniques for a finer improvement of the geometry. Here, an automated optimization framework is presented to address aerodynamic shape design problems. Key aspects of this framework include the use of the adjoint methodology to make the computational requirements independent of the number of design variables, and Computer Aided Design (CAD)-based shape parameterization, which uses the flexibility of Non-Uniform Rational B-Splines (NURBS) to handle complex configurations. The approach is applied to the optimization of several test cases, and the study concludes with the improvements obtained by the proposed strategy and its ability to achieve efficient shapes.

    Cross-Layer Rapid Prototyping and Synthesis of Application-Specific and Reconfigurable Many-accelerator Platforms

    Technological advances of recent years have laid the foundation for the informatisation of society, with an impact on its economic, political, cultural and social dimensions. At the peak of this development, more and more everyday devices are connected to the web, giving rise to the term "Internet of Things". The future holds the full connection and interaction of IT and communications systems with the physical world, marking the transition to cyber-physical systems and offering meta-services in the physical world, such as personalized medical care, autonomous transportation and smart energy cities. Given the needs of this dynamically evolving market, computer engineers are required to implement computing platforms that both incorporate increased systemic complexity and cover a wide range of meta-characteristics, such as design cost, design time, reliability and reuse, which are prescribed by a conflicting set of functional, technological and manufacturing constraints. This thesis aims to address these design challenges by developing methodologies and hardware/software co-design tools that enable the rapid implementation and efficient synthesis of architectural solutions that meet the operating meta-characteristics required by the modern market. Specifically, this thesis presents a) methodologies to accelerate the design flow for both reconfigurable and application-specific architectures, b) coarse-grain heterogeneous architectural templates for processing and communication acceleration and c) efficient multi-objective synthesis techniques both at a high abstraction level of programming and at the physical silicon level.
Regarding the acceleration of the design flow, the proposed methodology employs virtual platforms that hide architectural details and thereby drastically reduce simulation time. An extension of this framework introduces systemic co-simulation using reconfigurable acceleration platforms as intermediate co-emulation platforms. Thus, the development cycle of a hardware/software product is accelerated by moving from a vertical serial flow to a circular interactive loop. Moreover, the simulation capabilities are enriched with efficient techniques for detecting and correcting design errors, as well as methods for checking the performance metrics of the system against the desired specifications, during all phases of system development. Orthogonally to this methodological framework, a new architectural template is proposed, aiming at bridging the gap between design complexity and technological productivity using specialized hardware accelerators in heterogeneous system-on-chip and network-on-chip platforms. A novel co-design methodology is presented for the hardware accelerators and their respective software, including the allocation of tasks to the available resources of the system/network. The framework provides implementation techniques for the accelerators using either conventional programming flows with a hardware description language or abstract programming model flows based on high-level synthesis. In either case, the designer has the option of optimizing systemic metrics such as processing speed, throughput, reliability, power consumption and silicon area. Finally, to address the increased complexity of design tools for reconfigurable systems, novel multi-objective evolutionary optimization algorithms are proposed. They exploit modern multicore processors and the coarse-grain nature of multithreaded programming environments (e.g. OpenMP) to reduce the time needed to solve the problem of placing logic resources onto physical ones, while simultaneously grouping the applications based on their intrinsic characteristics in order to explore the design space more effectively.
The efficiency of the proposed architectural templates, design tools and methodology flows is evaluated against existing state-of-the-art solutions, both on standalone applications from typical computing domains, such as digital signal processing, multimedia and problems of arithmetic complexity, and on systemic heterogeneous environments, such as a computer vision system for autonomous robotic space vehicles and a many-accelerator system for workstations and datacenters targeting high-performance computing (HPC) applications. The results strengthen the belief of the author that this thesis provides competitive expertise for addressing complex modern, and projected future, design challenges.