199 research outputs found
Study of the effect of vaccination in epidemic models with stochastic transmission
Unpublished thesis, Universidad Complutense de Madrid, Facultad de Estudios Estadísticos, defended 15-12-2022.

Mathematical epidemic models are frequently used in biology to analyze the transmission dynamics of infectious diseases and to assess control measures for interrupting their spread. To select and develop such models properly, it is necessary to account for the particularities of the epidemic process: the type of disease, the mode of transmission, and the characteristics of the population. In this thesis we focus on infectious diseases with stochastic transmission, including vaccination as a control measure to stop the spread of the pathogen. To that end, we consider populations of constant, moderate size in which individuals mix homogeneously. We assume that the characteristics governing transmission and recovery of the infectious disease follow a common probabilistic behavior for all individuals in the population. To ensure herd immunity, we consider that a percentage of the population is protected against the disease by a vaccine administered before the start of the outbreak. The vaccine is imperfect in the sense that some vaccinated individuals fail to develop antibodies and, in consequence, can still be infected. Pathogenic transmission occurs by direct contact with infected individuals; since the population is not isolated, the disease spreads through direct contacts with infected individuals both inside and outside the population...
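The model class described above can be illustrated with a minimal discrete-time stochastic SIR sketch with an imperfect pre-outbreak vaccine and external infectious contacts. This is not the thesis's model; the population size, rates, coverage, and efficacy values below are illustrative assumptions.

```python
import random

def simulate_sir_vaccination(n=200, coverage=0.7, efficacy=0.9,
                             beta=0.3, gamma=0.1, external=0.001,
                             steps=500, seed=42):
    """Discrete-time stochastic SIR with an imperfect pre-outbreak vaccine.

    Vaccinated individuals are protected with probability `efficacy`;
    `external` models infectious contacts from outside the population.
    """
    rng = random.Random(seed)
    # Vaccinate coverage*n individuals before the outbreak; a dose
    # "takes" (confers immunity) with probability `efficacy`.
    immune = sum(1 for _ in range(int(coverage * n))
                 if rng.random() < efficacy)
    s, i, r = n - immune - 1, 1, immune   # one initial internal case
    for _ in range(steps):
        risk = beta * i / n + external    # per-susceptible infection risk
        new_inf = sum(1 for _ in range(s) if rng.random() < risk)
        new_rec = sum(1 for _ in range(i) if rng.random() < gamma)
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
    return s, i, r

s, i, r = simulate_sir_vaccination()
assert s + i + r == 200                   # population size stays constant
```

Vaccinees for whom the dose fails remain in the susceptible pool, matching the imperfect-vaccine assumption; the `external` term captures transmission from outside the non-isolated population.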
CDOpt: A Python Package for a Class of Riemannian Optimization
Optimization over the embedded submanifold defined by constraints
has attracted much interest over the past few decades due to its wide
applications in various areas. Plenty of related optimization packages have
been developed based on Riemannian optimization approaches, which rely on some
basic geometrical materials of Riemannian manifolds, including retractions,
vector transports, etc. These geometrical materials can be challenging to
determine in general. Existing packages only accommodate a few well-known
manifolds whose geometrical materials are easily accessible. For other
manifolds not included in these packages, the users have to develop the
geometrical materials by themselves. In addition, it is not always tractable
to adopt advanced features from various state-of-the-art unconstrained
optimization solvers to Riemannian optimization approaches.
We introduce CDOpt (available at https://cdopt.github.io/), a user-friendly
Python package for a class of Riemannian optimization problems. Based on constraint
dissolving approaches, Riemannian optimization problems are transformed into
their equivalent unconstrained counterparts in CDOpt. Therefore, solving
Riemannian optimization problems through CDOpt directly benefits from various
existing solvers and the rich expertise gained over decades for unconstrained
optimization. Moreover, all the computations in CDOpt related to any manifold
in question are conducted on its constraints expression, hence users can easily
define new manifolds in CDOpt without any background on differential geometry.
Furthermore, CDOpt extends the neural layers from PyTorch and Flax, thus allowing
users to train manifold-constrained neural networks directly with solvers for
unconstrained optimization. Extensive numerical experiments demonstrate that
CDOpt is highly efficient and robust in solving various classes of Riemannian
optimization problems. Comment: 31 pages.
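The constraint dissolving idea can be sketched outside CDOpt itself: for optimization over the unit sphere, composing the objective with a map back onto the manifold plus a penalty on constraint violation yields an unconstrained function that any general-purpose solver can minimize. The construction below (the map x/||x|| and the weight `beta`) is a hand-rolled illustration of the approach, not CDOpt's actual API.

```python
import numpy as np
from scipy.optimize import minimize

# Rayleigh quotient minimization over the unit sphere:
#   min x^T M x  s.t.  ||x|| = 1,
# whose minimum is the smallest eigenvalue of M.
rng = np.random.default_rng(0)
Q = rng.standard_normal((5, 5))
M = Q @ Q.T                              # symmetric PSD test matrix

def dissolved(x, beta=10.0):
    """Constraint-dissolving-style function: compose the objective with
    A(x) = x/||x|| (a map onto the sphere) and penalize the violation."""
    a = x / np.linalg.norm(x)
    return a @ M @ a + 0.5 * beta * (x @ x - 1.0) ** 2

# Any unconstrained solver now applies; here, plain L-BFGS.
res = minimize(dissolved, rng.standard_normal(5), method="L-BFGS-B")
x_star = res.x / np.linalg.norm(res.x)
assert abs(x_star @ M @ x_star - np.linalg.eigvalsh(M)[0]) < 1e-3
```

No retraction or vector transport appears anywhere: the solver only ever evaluates an ordinary function of an unconstrained variable, which is the point of the paper.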
Structured parallelism discovery with hybrid static-dynamic analysis and evaluation technique
Parallel computer architectures have dominated the computing landscape for the
past two decades; a trend that is only expected to continue and intensify, with increasing specialization and heterogeneity. This creates huge pressure across the software
stack to produce programming languages, libraries, frameworks and tools which will
efficiently exploit the capabilities of parallel computers, not only for new software but
also for revitalizing existing sequential code. Automatic parallelization, despite decades of
research, has had limited success in transforming sequential software to take advantage
of efficient parallel execution. This thesis investigates three approaches that use commutativity analysis as the enabler for parallelization. This has the potential to overcome
limitations of traditional techniques.
We introduce the concept of liveness-based commutativity for sequential loops.
We examine the use of a practical analysis utilizing liveness-based commutativity in a
symbolic execution framework. Symbolic execution represents input values as groups
of constraints, consequently deriving the output as a function of the input and enabling
the identification of further program properties. We employ this feature to develop an
analysis and discern commutativity properties between loop iterations. We study the
application of this approach on loops taken from real-world programs in the OLDEN
and NAS Parallel Benchmark (NPB) suites, and identify its limitations and related
overheads.
Informed by these findings, we develop Dynamic Commutativity Analysis (DCA), a
new technique that leverages profiling information from program execution with specific
input sets. Using profiling information, we track liveness information and detect loop
commutativity by examining the code’s live-out values. We evaluate DCA against almost
1400 loops of the NPB suite, finding 86% of them to be parallelizable. Comparing
our results against dependence-based methods, we match the detection efficacy of two
dynamic and outperform three static approaches, respectively. Additionally, DCA is
able to automatically detect parallelism in loops which iterate over Pointer-Linked
Data Structures (PLDSs), taken from a wide range of benchmarks used in the literature,
where all other techniques we considered failed. Parallelizing the discovered loops, our
methodology achieves an average speedup of 3.6× across NPB (and up to 55×) and up
to 36.9× for the PLDS-based loops on a 72-core host. We also demonstrate that our
methodology, despite relying on specific input values for profiling each program, is able
to correctly identify parallelism that is valid for all potential input sets.
Lastly, we develop a methodology to utilize liveness-based commutativity, as implemented in DCA, to detect latent loop parallelism in the shape of patterns. Our approach
applies a series of transformations which subsequently enable multiple applications
of DCA over the generated multi-loop code section and match its loop commutativity
outcomes against the expected criteria for each pattern. Applying our methodology on
sets of sequential loops, we are able to identify well-known parallel patterns (i.e., maps,
reduction and scans). This extends the scope of parallelism detection to loops, such
as those performing scan operations, which cannot be determined as parallelizable by
simply evaluating liveness-based commutativity conditions on their original form.
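A heavily simplified, single-input version of the dynamic commutativity test can be sketched as follows. The real DCA profiles compiled code and tracks liveness information; this toy merely replays a loop body under permuted iteration orders and compares the live-out state against a sequential run.

```python
import copy
import random

def liveout_commutative(body, state, n_iters, trials=5, seed=0):
    """Single-input dynamic test: replay the loop under permuted
    iteration orders and compare the live-out state with the
    sequential run. Evidence for one input, not a proof."""
    rng = random.Random(seed)
    ref = copy.deepcopy(state)
    for i in range(n_iters):
        body(ref, i)
    for _ in range(trials):
        order = list(range(n_iters))
        rng.shuffle(order)
        cand = copy.deepcopy(state)
        for i in order:
            body(cand, i)
        if cand != ref:                 # live-out values differ: not commutative
            return False
    return True

def reduction(s, i):                    # order-insensitive: a sum reduction
    s["acc"] += i

def recurrence(s, i):                   # order-sensitive loop-carried recurrence
    s["acc"] = 2 * s["acc"] + i

assert liveout_commutative(reduction, {"acc": 0}, 10)
assert not liveout_commutative(recurrence, {"acc": 1}, 10)
```

A loop computing a running prefix (a scan) fails such a check in its original form, which is why the pattern-detection methodology above first applies transformations and only then re-tests commutativity over the generated multi-loop code.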
Efficient GPU implementation of a Boltzmann‑Schrödinger‑Poisson solver for the simulation of nanoscale DG MOSFETs
A previous work (pp. 81–102, 2019) describes an efficient and accurate solver for nanoscale DG MOSFETs
through a deterministic Boltzmann-Schrödinger-Poisson model with seven
electron–phonon scattering mechanisms on a hybrid parallel CPU/GPU platform.
The transport computational phase, i.e. the time integration of the Boltzmann equations,
was ported to the GPU using CUDA extensions, but the computation of the
system’s eigenstates, i.e. the solution of the Schrödinger-Poisson block, was parallelized
only using OpenMP due to its complexity. This work fills the gap by describing
a port to GPU for the solver of the Schrödinger-Poisson block. This new proposal
implements on GPU a Scheduled Relaxation Jacobi method to solve the sparse linear
systems which arise in the 2D Poisson equation. The 1D Schrödinger equation
is solved on GPU by adapting a multi-section iteration and the Newton-Raphson
algorithm to approximate the energy levels, and the Inverse Power Iterative Method
is used to approximate the wave vectors. We want to stress that this solver for the
Schrödinger-Poisson block can be thought as a module independent of the transport
phase (Boltzmann) and can be used for solvers using different levels of description
for the electrons; therefore, it is of particular interest because it can be adapted to
other macroscopic, hence faster, solvers for confined devices exploited at industrial
level.

Project PID2020-117846GB-I00 funded by the Spanish Ministerio de Ciencia e Innovación; Project A-TIC-344-UGR20 funded by the European Regional Development Fund.
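The Scheduled Relaxation Jacobi idea, cycling through a fixed schedule of over- and under-relaxation factors so that the combined cycle damps all error frequencies, can be sketched on a 1D Poisson model problem. The schedule below is illustrative, not a tuned SRJ coefficient set, and the actual GPU solver targets the sparse systems arising from the 2D Poisson equation.

```python
import numpy as np

def srj_poisson_1d(f, h, schedule, sweeps):
    """Scheduled Relaxation Jacobi for -u'' = f on (0,1), u(0)=u(1)=0.
    Each sweep relaxes a plain Jacobi update with the next factor in a
    repeating schedule; factors > 1 (over-relaxation) are unstable on
    their own for some frequencies, but the cycle as a whole contracts."""
    n = len(f)
    u = np.zeros(n)
    b = f * h * h                       # scaled right-hand side
    for k in range(sweeps):
        w = schedule[k % len(schedule)]
        nb = np.zeros(n)                # neighbor sums (zero Dirichlet BCs)
        nb[1:] += u[:-1]
        nb[:-1] += u[1:]
        jac = 0.5 * (nb + b)            # classical Jacobi sweep
        u = (1.0 - w) * u + w * jac     # scheduled relaxation
    return u

n = 49
h = 1.0 / (n + 1)
x = np.linspace(h, 1.0 - h, n)
f = np.pi ** 2 * np.sin(np.pi * x)      # manufactured solution sin(pi x)
u = srj_poisson_1d(f, h, schedule=[1.8, 0.6, 1.0], sweeps=6000)
assert np.max(np.abs(u - np.sin(np.pi * x))) < 1e-3
```

With a one-element schedule this reduces to weighted Jacobi; the appeal for GPUs is that every sweep is an embarrassingly parallel stencil update with no sequential dependencies, unlike Gauss-Seidel-type smoothers.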
Algebraic, Block and Multiplicative Preconditioners based on Fast Tridiagonal Solves on GPUs
This thesis contributes to the field of sparse linear algebra, graph applications, and preconditioners for Krylov iterative solvers of sparse linear equation systems by providing a (block) tridiagonal solver library, a generalized sparse matrix-vector implementation, a linear forest extraction, and a multiplicative preconditioner based on tridiagonal solves. The tridiagonal library, which supports (scaled) partial pivoting, outperforms cuSPARSE's tridiagonal solver by a factor of five while fully utilizing the available GPU memory bandwidth. For performance-optimized solving of multiple right-hand sides, the explicit factorization of the tridiagonal matrix can be computed. The extraction of a weighted linear forest (a union of disjoint paths) from a general graph is used to build algebraic (block) tridiagonal preconditioners and deploys the generalized sparse matrix-vector implementation of this thesis for preconditioner construction. During linear forest extraction, a new parallel bidirectional scan pattern, which can operate on doubly-linked list structures, identifies the path ID and the position of a vertex. The algebraic preconditioner construction is also used to build more advanced preconditioners, containing multiple tridiagonal factors, based on generalized ILU factorizations. Additionally, other preconditioners based on tridiagonal factors are presented and evaluated against ILU and ILU incomplete sparse approximate inverse (ILU-ISAI) preconditioners for the solution of large sparse linear equation systems from the Sparse Matrix Collection. For all problems addressed in this thesis, an efficient parallel algorithm and its CUDA implementation for single-GPU systems are provided.
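The core building block of such a library, a tridiagonal solve, reduces to the O(n) Thomas algorithm when no pivoting is required. The sequential sketch below shows the forward-elimination/back-substitution recurrence that the GPU library parallelizes and augments with (scaled) partial pivoting.

```python
import numpy as np

def thomas(dl, d, du, b):
    """Solve a tridiagonal system by the Thomas algorithm (no pivoting).
    dl: sub-diagonal (n-1), d: diagonal (n), du: super-diagonal (n-1).
    Stable for diagonally dominant systems; general systems need the
    pivoting the thesis library provides."""
    n = len(d)
    c = np.asarray(du, dtype=float)
    dd = np.array(d, dtype=float)
    x = np.array(b, dtype=float)
    for i in range(1, n):                 # forward elimination
        m = dl[i - 1] / dd[i - 1]
        dd[i] -= m * c[i - 1]
        x[i] -= m * x[i - 1]
    x[-1] /= dd[-1]
    for i in range(n - 2, -1, -1):        # back substitution
        x[i] = (x[i] - c[i] * x[i + 1]) / dd[i]
    return x

# Verify against a dense solve on a small diagonally dominant system.
rng = np.random.default_rng(1)
n = 6
dl, du = rng.random(n - 1), rng.random(n - 1)
d = 4.0 + rng.random(n)
A = np.diag(d) + np.diag(dl, -1) + np.diag(du, 1)
b = rng.random(n)
assert np.allclose(thomas(dl, d, du, b), np.linalg.solve(A, b))
```

The two loops are inherently sequential recurrences, which is exactly why GPU tridiagonal solvers restructure the computation (e.g. via cyclic reduction or partitioning) to expose parallelism.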
High-performance SVD partial spectrum computation
We introduce a new singular value decomposition (SVD) solver based on the QR-based Dynamically Weighted Halley (QDWH) algorithm for computing the partial-spectrum SVD (QDWHpartial-SVD). By optimizing the rational function underlying the algorithm in the desired part of the spectrum only, QDWHpartial-SVD efficiently computes a fraction (say 1-20%) of the leading singular values/vectors. We develop a high-performance implementation of QDWHpartial-SVD on distributed-memory manycore systems and demonstrate its numerical robustness. We perform a benchmarking campaign against counterparts from state-of-the-art numerical libraries across various matrix sizes using up to 36K MPI processes. Experimental results show performance speedups for QDWHpartial-SVD of up to 6X and 2X against vendor-optimized PDGESVD from ScaLAPACK and KSVD, respectively, on a Cray XC40 system using 1152 nodes with two-socket 16-core Intel Haswell CPUs. We also port our QDWHpartial-SVD software library to a system of 256 nodes with two-socket 64-core AMD EPYC Milan CPUs and achieve a performance speedup of up to 4X compared to vendor-optimized PDGESVD from ScaLAPACK. We also compare the energy consumption of the two algorithms and demonstrate how QDWHpartial-SVD further outperforms PDGESVD in that regard by performing fewer memory-bound operations.
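The partial-spectrum regime the paper targets, computing only a small leading fraction of the singular triplets rather than the full decomposition, can be illustrated with an off-the-shelf iterative solver. QDWHpartial-SVD itself is a dense, distributed-memory algorithm and is not what runs below; this only shows the problem being solved.

```python
import numpy as np
from scipy.sparse.linalg import svds

rng = np.random.default_rng(0)
m, n, k = 400, 200, 10                  # k/n = 5% of the spectrum
A = rng.standard_normal((m, n))

# Compute only the k leading singular triplets instead of all n.
U, s, Vt = svds(A, k=k)
s = s[::-1]                             # svds returns ascending order

# The leading values agree with the full decomposition.
s_full = np.linalg.svd(A, compute_uv=False)
assert np.allclose(s, s_full[:k], rtol=1e-6)
```

Iterative solvers like this excel on large sparse matrices; the paper's contribution is making the partial-spectrum computation fast for dense matrices at extreme scale, where communication and memory-bound operations dominate.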
Applied Metaheuristic Computing
For decades, Applied Metaheuristic Computing (AMC) has been a prevailing optimization technique for tackling perplexing engineering and business problems, such as scheduling, routing, ordering, bin packing, assignment, and facility layout planning, among others. This is partly because classic exact methods are constrained by prior assumptions, and partly because heuristics are problem-dependent and lack generalization. AMC, by contrast, guides the course of low-level heuristics to search beyond the local optimality that traps traditional computational methods. This topic series has collected quality papers proposing cutting-edge methodology and innovative applications which drive the advances of AMC.
Why diffusion-based preconditioning of Richards equation works: spectral analysis and computational experiments at very large scale
We consider here a cell-centered finite difference approximation of the Richards equation in three dimensions, averaging the hydraulic conductivity, a highly nonlinear function, at interfaces by arithmetic, upstream, and harmonic means. The nonlinearities in the equation can lead to changes in soil conductivity over several orders of magnitude, and discretizations with respect to the space variables often produce stiff systems of differential equations. A fully implicit time discretization is provided by the backward Euler one-step formula; the resulting nonlinear algebraic system is solved by an inexact Newton Armijo-Goldstein algorithm, requiring the solution of a sequence of linear systems involving Jacobian matrices. We prove some new results concerning the distribution of the Jacobians' eigenvalues and the explicit expression of their entries. Moreover, we explore some connections between the saturation of the soil and the ill-conditioning of the Jacobians. The information on the eigenvalues justifies the effectiveness of some preconditioning approaches widely used in the solution of the Richards equation. We also propose a new software framework for experimenting with scalable and robust preconditioners suitable for efficient parallel simulations at very large scales. Performance results on a literature test case show that our framework is very promising for the advance towards realistic simulations at extreme scale.
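The time-stepping and nonlinear-solve structure described above, backward Euler followed by Newton with a backtracking (Armijo-type) line search and arithmetic-mean interface conductivities, can be sketched on a much milder 1D nonlinear diffusion problem. The conductivity K(u) below is an invented mild nonlinearity, far tamer than Richards', and a real solver would exploit the Jacobian's sparsity and precondition the linear solves, which is the abstract's subject.

```python
import numpy as np

def backward_euler_step(u0, dt, h, K, max_newton=30):
    """One backward Euler step for u_t = (K(u) u_x)_x on (0,1) with
    zero-Dirichlet ends, solved by Newton with Armijo backtracking on
    the residual norm. Interface conductivities use arithmetic means."""
    n = len(u0)

    def residual(u):
        ubc = np.concatenate(([0.0], u, [0.0]))
        Ke = 0.5 * (K(ubc[:-1]) + K(ubc[1:]))    # arithmetic-mean interface K
        flux = Ke * np.diff(ubc) / h
        return (u - u0) / dt - np.diff(flux) / h

    u = u0.copy()
    for _ in range(max_newton):
        F = residual(u)
        nF = np.linalg.norm(F)
        if nF < 1e-9:
            break
        # Dense finite-difference Jacobian; a real solver exploits its
        # tridiagonal sparsity and preconditions the linear systems.
        J = np.empty((n, n))
        eps = 1e-7
        for j in range(n):
            up = u.copy()
            up[j] += eps
            J[:, j] = (residual(up) - F) / eps
        d = np.linalg.solve(J, -F)
        lam = 1.0                                # Armijo backtracking
        while np.linalg.norm(residual(u + lam * d)) > (1 - 1e-4 * lam) * nF:
            lam *= 0.5
            if lam < 1e-8:
                break
        u = u + lam * d
    return u

n, h, dt = 30, 1.0 / 31, 1e-3
x = np.linspace(h, 1 - h, n)
u0 = np.sin(np.pi * x)
u1 = backward_euler_step(u0, dt, h, K=lambda u: 0.1 + u * u)
assert u1.max() < u0.max()                       # diffusion damps the peak
```

Each Newton iteration produces exactly the kind of Jacobian linear system whose spectrum, and hence whose amenability to diffusion-based preconditioning, the paper analyzes.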