Classical and quantum algorithms for scaling problems
This thesis is concerned with scaling problems, which have a plethora of connections to different areas of mathematics, physics and computer science. Although many structural aspects of these problems are understood by now, we only know how to solve them efficiently in special cases. We give new algorithms for non-commutative scaling problems with complexity guarantees that match the prior state of the art. To this end, we extend the well-known (self-concordance based) interior-point method (IPM) framework to Riemannian manifolds, motivated by its success in the commutative setting. Moreover, the IPM framework does not obviously suffer from the same obstructions to efficiency as previous methods. It also yields the first high-precision algorithms for other natural geometric problems in non-positive curvature. For the (commutative) problems of matrix scaling and balancing, we show that quantum algorithms can outperform the (already very efficient) state-of-the-art classical algorithms. Their time complexity can be sublinear in the input size; in certain parameter regimes they are also optimal, whereas in others we show no quantum speedup over the classical methods is possible. Along the way, we provide improvements over the long-standing state of the art for searching for all marked elements in a list, and computing the sum of a list of numbers. We identify a new application in the context of tensor networks for quantum many-body physics. We define a computable canonical form for uniform projected entangled pair states (as the solution to a scaling problem), circumventing previously known undecidability results. We also show, by characterizing the invariant polynomials, that the canonical form is determined by evaluating the tensor network contractions on networks of bounded size.
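The commutative matrix-scaling problem mentioned above has a classical alternating-normalization algorithm, the Sinkhorn iteration, which is the kind of baseline the quantum algorithms compete against. A minimal illustrative sketch (not the thesis's algorithm) for scaling a positive matrix to doubly stochastic form:

```python
import numpy as np

def sinkhorn_scale(A, iters=500, tol=1e-9):
    """Alternately normalize rows and columns of a positive matrix A
    so that X = diag(r) @ A @ diag(c) becomes doubly stochastic."""
    r = np.ones(A.shape[0])
    c = np.ones(A.shape[1])
    for _ in range(iters):
        r = 1.0 / (A @ c)        # make row sums equal to 1
        c = 1.0 / (A.T @ r)      # make column sums equal to 1
        X = A * np.outer(r, c)
        if (np.abs(X.sum(axis=1) - 1).max() < tol
                and np.abs(X.sum(axis=0) - 1).max() < tol):
            break
    return np.diag(r) @ A @ np.diag(c)

rng = np.random.default_rng(0)
X = sinkhorn_scale(rng.random((4, 4)) + 0.1)
```

For strictly positive matrices this iteration converges linearly; the structural theory of when scaling solutions exist is part of what the thesis studies in the far harder non-commutative setting.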
LIPIcs, Volume 251, ITCS 2023, Complete Volume
Scheduling under memory constraints by taming data locality in a task-based programming model
A now-classical way of meeting the increasing demand for computing speed by HPC applications is the use of GPUs and/or other accelerators. Such accelerators have their own memory, which is usually quite limited, and are connected to the main memory through a bus with bounded bandwidth. Thus, particular care should be devoted to data locality in order to avoid unnecessary data movements. Task-based runtime schedulers have emerged as a convenient and efficient way to use such heterogeneous platforms. When processing an application, the scheduler has knowledge of all tasks available for processing on a GPU, as well as their input data dependencies. Hence, it is possible to produce a task processing order aiming at reducing the total processing time through three objectives: minimizing data transfers, overlapping transfers and computation, and optimizing the eviction of previously-loaded data. In this paper, we focus on how to schedule tasks that share some of their input data (but are otherwise independent) on a single GPU. We provide a formal model of the problem, exhibit an optimal eviction strategy, and show that ordering tasks to minimize data movement is NP-complete. We review and adapt existing ordering strategies to this problem, and propose a new one based on task aggregation. We prove that the underlying problem of this new strategy is NP-complete, and prove the reasonable complexity of our proposed heuristic. These strategies have been implemented in the StarPU runtime system. We present their performance on tasks from tiled 2D and 3D matrix products, Cholesky factorization, randomized task orders, randomized data pairs from the 2D matrix product, as well as a sparse matrix product. We introduce a visual way to understand this performance, and lower bounds on the number of data loads for the 2D and 3D matrix products.
Our experiments demonstrate that using our new strategy together with the optimal eviction policy reduces the amount of data movement as well as the total processing time.
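The optimal eviction strategy referred to above can be illustrated with the classic furthest-next-use (Belady-style) rule for equal-size data; this toy simulation is our own sketch, not the paper's StarPU implementation, and it ignores transfer/computation overlap and unequal data sizes:

```python
def schedule_loads(task_inputs, capacity):
    """Process tasks in the given order on a memory of `capacity` slots,
    evicting the resident datum whose next use is furthest in the future.
    Assumes each task's input set fits in memory (len(inputs) <= capacity).
    Returns the number of data loads performed."""
    resident, loads = set(), 0
    for i, inputs in enumerate(task_inputs):
        for d in inputs:
            if d in resident:
                continue
            if len(resident) >= capacity:
                def next_use(x):
                    # index of the next task that needs x (inf if never reused)
                    for j in range(i, len(task_inputs)):
                        if x in task_inputs[j]:
                            return j
                    return float('inf')
                resident.remove(max(resident, key=next_use))
            resident.add(d)
            loads += 1
    return loads
```

With tasks `[{'a','b'}, {'b','c'}, {'a','c'}]` and two memory slots, the rule loads four data in total; with three slots, no eviction is ever needed and three loads suffice.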
Numerical analysis of beach-profile morphodynamics during episodic events with CFD
Beaches are highly valuable assets from an economic, social and environmental perspective, and understanding them is fundamental to developing effective strategies for coastal management. Morphodynamic processes play a fundamental role in driving the evolution of beaches, being responsible for the morphology changes associated with wave action. However, the ways in which these processes are generated are not sufficiently understood. This thesis contributes to expanding the knowledge of the interplay between hydrodynamics and morphology that brings about morphodynamic processes, particularly during episodic events, and of their main drivers. To tackle this objective, a CFD numerical model for cross-shore morphodynamics is developed and validated, and its results are used as the basis for a detailed analysis of the main cross-shore morphodynamic processes on beach profiles.
A Survey of Graph Pre-processing Methods: From Algorithmic to Hardware Perspectives
Graph-related applications have experienced significant growth in academia
and industry, driven by the powerful representation capabilities of graphs.
However, efficiently executing these applications faces various challenges,
such as load imbalance, random memory access, etc. To address these challenges,
researchers have proposed various acceleration systems, including software
frameworks and hardware accelerators, all of which incorporate graph
pre-processing (GPP). GPP serves as a preparatory step before the formal
execution of applications, involving techniques such as sampling, reordering, etc.
However, GPP execution often remains overlooked, as the primary focus is
directed towards enhancing graph applications themselves. This oversight is
concerning, especially considering the explosive growth of real-world graph
data, where GPP becomes essential and even dominates system running overhead.
Furthermore, GPP methods exhibit significant variations across devices and
applications due to high customization. Unfortunately, no comprehensive work
systematically summarizes GPP. To address this gap and foster a better
understanding of GPP, we present a comprehensive survey dedicated to this area.
We propose a double-level taxonomy of GPP, considering both algorithmic and
hardware perspectives. Through listing relevant works, we illustrate our
taxonomy and conduct a thorough analysis and summary of diverse GPP techniques.
Lastly, we discuss challenges in GPP and potential future directions.
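To make one of the GPP techniques named above concrete, here is a toy degree-based reordering sketch; it is a deliberately simple stand-in for the production orderings surveyed in such work (e.g. Gorder or Rabbit Order), which optimize locality far more aggressively:

```python
def degree_reorder(edges, n):
    """Relabel the vertices of an n-vertex graph so that high-degree
    vertices receive the smallest new ids. Clustering hot (high-degree)
    vertices into a small id range tends to improve cache locality for
    push/pull graph traversals."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    # old id -> new id, highest degree first (stable for ties)
    order = sorted(range(n), key=lambda v: -deg[v])
    newid = {old: new for new, old in enumerate(order)}
    return [(newid[u], newid[v]) for u, v in edges], newid

new_edges, newid = degree_reorder([(0, 1), (1, 2), (1, 3)], 4)
```

In the example, vertex 1 has the highest degree and is relabeled to id 0. Note this relabeling is exactly the kind of pre-processing cost the survey argues should not be overlooked: it touches every edge before the application proper even starts.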
Tailoring structures using stochastic variations of structural parameters.
Imperfections, meaning deviations from an idealized structure, can manifest through unintended variations in a structure’s geometry or material properties. Such imperfections affect the stiffness properties and can change the way structures behave under load. The magnitude of these effects determines how reliable and robust a structure is under loading.
Minor changes in geometry and material properties can also be added intentionally, creating a more beneficial load response or making a more robust structure. Examples of this are variable stiffness composites, which have varying fiber paths, or structures with thickened patches.
The work presented in this thesis aims to introduce a general approach to creating geodesic random fields in finite elements and exploiting these to improve designs. Random fields can be assigned to a material or geometric parameter. Stochastic analysis can then quantify the effects of variations on a structure for a given type of imperfection.
Information extracted from the effects of imperfections can also identify areas critical to a structure's performance. Post-processing the stochastic results by computing the correlation between local changes and the structural performance yields a pattern describing the effects of local changes. Perturbing the ideal deterministic geometry or material distribution of a structure using this pattern of local influences can increase performance. Examples demonstrate the approach by improving the deterministic (without imperfections applied) linear buckling load, fatigue life, and post-buckling path of structures.
Deterministic improvements can have a detrimental effect on the robustness of a structure. Increasing the amplitude of the perturbation applied to the original design can improve the robustness of a structure's response. Robustness analyses on a curved composite panel show that increasing the amplitude of design changes makes the structure less sensitive to variations. The example studied shows that the increase in robustness comes at a relatively small cost in the deterministic improvement.
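The correlation-pattern post-processing described above can be sketched as a toy Monte Carlo experiment; the `performance` function below is a hypothetical stand-in for a finite-element response (e.g. a linear buckling load), not the thesis's model:

```python
import numpy as np

rng = np.random.default_rng(1)

def performance(design):
    # Hypothetical structural response: rewards material concentrated
    # in the first half of the 10-component design vector.
    w = np.linspace(1.0, -1.0, design.size)
    return float(w @ design)

base = np.ones(10)                                    # ideal deterministic design
samples = rng.normal(0.0, 0.05, size=(500, 10))       # random imperfection fields
perf = np.array([performance(base + s) for s in samples])

# Correlate each local perturbation with the overall performance to get
# the pattern of local influences.
pattern = np.array([np.corrcoef(samples[:, i], perf)[0, 1]
                    for i in range(10)])

# Perturb the ideal design along the pattern to raise performance.
improved = base + 0.1 * pattern
```

Because each component of the pattern carries the sign and relative strength of that location's influence, stepping the design along it increases the deterministic performance in this toy setting, mirroring the procedure the abstract describes.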
Heuristics and metaheuristics in the design of sound-absorbing porous materials
Inexact optimisation techniques such as heuristics and metaheuristics that quickly find near-optimal solutions are widely used to solve hard problems. While metaheuristics are well studied on specific problem domains such as travelling salesman, timetabling, vehicle routing, etc., their extension to engineering domains is largely unexplored due to the requirement of domain expertise. In this thesis, we address a specific engineering domain: the design of sound-absorbing porous materials. Porous materials are foams, fibrous materials, woven and non-woven textiles, etc., that are widely used in automotive, aerospace and household applications to isolate and absorb noise to prevent equipment damage, protect hearing or ensure comfort. These materials constitute a significant amount of dead weight in aircraft and space applications, and choosing sub-optimal designs would lead to inefficiency and increased costs. By carefully choosing the material properties and shapes of these materials, favourable resonances can be created, making it possible to improve absorption while also reducing weight. The structure of the optimisation problem is not yet well explored, and few comparison studies are available in this domain. This thesis aims to address the knowledge gap by analysing the performance of existing and novel heuristic and metaheuristic methods. Initially, the problem structure is explored by considering a one-dimensional layered sound package problem. Then, the challenging two-dimensional foam shape and topology optimisation is addressed. Topology optimisation involves optimally distributing a given volume of material in a design region such that a performance measure is maximised. Although extensive studies exist for the compliance minimisation problem domain, studies and comparisons on porous material problems are relatively rare. Firstly, a single-objective absorption maximisation problem with a constraint on the weight is considered.
Then a multi-objective problem of simultaneously maximising absorption and minimising weight is considered. The unique nature of the topology optimisation problem allows it to be solved using combinatorial or continuous, gradient or non-gradient methods. In this work, several optimisation methods are studied, including solid isotropic material with penalisation (SIMP), hill climbing, constructive heuristics, genetic algorithms, tabu search, covariance matrix adaptation evolution strategy (CMA-ES), differential evolution, the non-dominated sorting genetic algorithm (NSGA-II) and hybrid strategies. These approaches are tested on a benchmark of seven acoustics problem instances. The results are used to extract domain-specific insights. The findings highlight that the problem domain is rich with unique varieties of solutions, and that by using domain-specific insights, one can design hybrid gradient and non-gradient methods that consistently outperform state-of-the-art ones.
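One of the simplest methods listed above, hill climbing under a volume constraint, can be sketched on a toy problem; the `absorption` objective below is a hypothetical smooth stand-in for an acoustic finite-element solve, not a real acoustics model:

```python
import numpy as np

rng = np.random.default_rng(0)

def absorption(layout):
    # Hypothetical separable objective standing in for an FE evaluation:
    # each cell of the 0/1 material layout contributes a fixed amount.
    x = np.arange(layout.size)
    return float(np.sum(layout * np.sin(0.3 * x + 1.0)))

def hill_climb(n=30, volume=12, iters=2000):
    """Bit-swap hill climbing under an exact volume constraint: move one
    filled cell to an empty cell, keep the move only if it improves the
    objective. The volume (weight) constraint is preserved by construction."""
    layout = np.zeros(n)
    layout[rng.choice(n, volume, replace=False)] = 1
    best = absorption(layout)
    for _ in range(iters):
        i = rng.choice(np.flatnonzero(layout == 1))
        j = rng.choice(np.flatnonzero(layout == 0))
        layout[i], layout[j] = 0, 1
        score = absorption(layout)
        if score > best:
            best = score
        else:
            layout[i], layout[j] = 1, 0   # revert the swap
    return layout, best

layout, best = hill_climb()
```

The swap neighbourhood keeps every candidate feasible, which is one way single-objective constrained variants like the absorption-maximisation problem above are commonly handled in combinatorial topology optimisation.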
Inferring Covariance Structure from Multiple Data Sources via Subspace Factor Analysis
Factor analysis provides a canonical framework for imposing lower-dimensional
structure such as sparse covariance in high-dimensional data. High-dimensional
data on the same set of variables are often collected under different
conditions, for instance in reproducing studies across research groups. In such
cases, it is natural to seek to learn the shared versus condition-specific
structure. Existing hierarchical extensions of factor analysis have been
proposed, but face practical issues including identifiability problems. To
address these shortcomings, we propose a class of SUbspace Factor Analysis
(SUFA) models, which characterize variation across groups at the level of a
lower-dimensional subspace. We prove that the proposed class of SUFA models
leads to identifiability of the shared versus group-specific components of the
covariance, and study their posterior contraction properties. Taking a Bayesian
approach, these contributions are developed alongside efficient posterior
computation algorithms. Our sampler fully integrates out latent variables, is
easily parallelizable and has complexity that does not depend on sample size.
We illustrate the methods through application to integration of multiple gene
expression datasets relevant to immunology.
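One simple way to encode "group-specific variation within a shared subspace", in the spirit of (but not identical to) the SUFA construction, is to constrain each group's extra covariance to lie in the column span of a shared loading matrix; the notation below is ours, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(2)
p, k, q = 8, 3, 2            # observed dim, shared factors, group subspace dim

Lam = rng.normal(size=(p, k))        # shared loading matrix
psi = 0.5                            # idiosyncratic noise variance

def group_cov(A_s):
    """Covariance for group s: a shared factor part Lam @ Lam.T plus a
    group-specific part Lam @ A_s @ A_s.T @ Lam.T confined to the shared
    subspace, plus diagonal noise."""
    return Lam @ Lam.T + Lam @ A_s @ A_s.T @ Lam.T + psi * np.eye(p)

A1 = rng.normal(size=(k, q))         # group 1's subspace loadings
Sigma1 = group_cov(A1)
```

Restricting the group-specific term to the span of `Lam` is what makes the shared-versus-specific split meaningful: without such a constraint, the decomposition of the covariance is not identifiable, which is the practical issue with earlier hierarchical factor models that the abstract highlights.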
Distributed Memory, GPU Accelerated Fock Construction for Hybrid, Gaussian Basis Density Functional Theory
With the growing reliance of modern supercomputers on accelerator-based
architectures such as GPUs, the development and optimization of electronic
structure methods to exploit these massively parallel resources has become a
recent priority. While significant strides have been made in the development of
GPU accelerated, distributed memory algorithms for many-body (e.g.
coupled-cluster) and spectral single-body (e.g. planewave, real-space and
finite-element density functional theory [DFT]), the vast majority of
GPU-accelerated Gaussian atomic orbital methods have focused on shared memory
systems with only a handful of examples pursuing massive parallelism on
distributed memory GPU architectures. In the present work, we present a set of
distributed memory algorithms for the evaluation of the Coulomb and
exact-exchange matrices for hybrid Kohn-Sham DFT with Gaussian basis sets via
direct density-fitted (DF-J-Engine) and seminumerical (sn-K) methods,
respectively. The absolute performance and strong scalability of the developed
methods are demonstrated on systems ranging from a few hundred to over one
thousand atoms using up to 128 NVIDIA A100 GPUs on the Perlmutter
supercomputer.
Comment: 45 pages, 9 figures
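The density-fitted Coulomb build (the J part of the DF-J-Engine named above) follows a standard contraction pattern, J_mn = (mn|P) (P|Q)^-1 (Q|ls) D_ls; in this sketch random tensors stand in for the real three-center and metric integrals, and none of the paper's screening or distributed-memory machinery is represented:

```python
import numpy as np

rng = np.random.default_rng(3)
nbf, naux = 6, 10                    # basis and auxiliary-basis sizes

# Random stand-ins for the actual integrals, symmetrized where required.
B = rng.normal(size=(naux, nbf, nbf))
B = 0.5 * (B + B.transpose(0, 2, 1))             # (P|mu nu) three-center
M = rng.normal(size=(naux, naux))
V = M @ M.T + naux * np.eye(naux)                # (P|Q) metric, made SPD
D = rng.normal(size=(nbf, nbf))
D = 0.5 * (D + D.T)                              # density matrix

# DF Coulomb build in three steps:
g = np.einsum('qls,ls->q', B, D)     # contract density with 3-center ints
d = np.linalg.solve(V, g)            # solve the metric for fit coefficients
J = np.einsum('pmn,p->mn', B, d)     # assemble the Coulomb matrix
```

The appeal of this factorization on GPUs is that each step is a dense contraction with O(naux * nbf^2) intermediates instead of the O(nbf^4) four-center tensor, which is what makes the massively parallel builds described in the abstract tractable.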