Machine learning in solar physics
The application of machine learning in solar physics has the potential to
greatly enhance our understanding of the complex processes that take place in
the atmosphere of the Sun. By using techniques such as deep learning, we are
now in a position to analyze large amounts of data from solar observations
and identify patterns and trends that may not have been apparent using
traditional methods. This can help us improve our understanding of explosive
events like solar flares, which can strongly affect the Earth's
environment. Predicting such hazardous events is crucial for our
technological society. Machine learning can also improve our understanding of
the inner workings of the Sun itself by allowing us to go deeper into the data
and to propose more complex models to explain them. Additionally, the use of
machine learning can help to automate the analysis of solar data, reducing the
need for manual labor and increasing the efficiency of research in this field.
Comment: 100 pages, 13 figures, 286 references, accepted for publication as a
Living Review in Solar Physics (LRSP)
A Local-to-Global Theorem for Congested Shortest Paths
Amiri and Wargalla (2020) proved the following local-to-global theorem in
directed acyclic graphs (DAGs): if $G$ is a weighted DAG such that for each
subset $S$ of 3 nodes there is a shortest path containing every node in $S$,
then there exists a pair $(s,t)$ of nodes such that there is a shortest
$st$-path containing every node in $G$.
We extend this theorem to general graphs. For undirected graphs, we prove
that the same theorem holds (up to a difference in the constant 3). For
directed graphs, we provide a counterexample to the theorem (for any constant),
and prove a roundtrip analogue of the theorem which shows there exists a pair
$(s,t)$ of nodes such that every node in $G$ is contained in the union of a
shortest $st$-path and a shortest $ts$-path.
The original theorem for DAGs has an application to the $k$-Shortest Paths
with Congestion $c$ ($(k,c)$-SPC) problem. In this problem, we are given a
weighted graph $G$, together with $k$ node pairs $(s_1,t_1),\dots,(s_k,t_k)$,
and a positive integer $c \le k$. We are tasked with finding paths
$P_1,\dots,P_k$ such that each $P_i$ is a shortest path from $s_i$ to $t_i$,
and every node in the graph is on at most $c$ paths $P_i$, or reporting that
no such collection of paths exists.
When $c = k$ the problem is easily solved by finding shortest paths for each
pair $(s_i,t_i)$ independently. When $c = 1$, the $(k,c)$-SPC problem recovers
the $k$-Disjoint Shortest Paths ($k$-DSP) problem, where the collection of
shortest paths must be node-disjoint. For fixed $k$, $k$-DSP can be solved in
polynomial time on DAGs and undirected graphs. Previous work shows that the
local-to-global theorem for DAGs implies that $(k,c)$-SPC can be solved in
polynomial time on DAGs whenever $k - c$ is constant. In the same way, our work
implies that $(k,c)$-SPC can be solved in polynomial time on undirected graphs
whenever $k - c$ is constant.
Comment: Updated to reflect reviewer comments
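The problem statement above can be made concrete with a small checker. The following is a minimal sketch, not the authors' algorithm: it only verifies a candidate solution, assuming the graph is given as a dictionary of non-negative integer edge weights. The names `dijkstra`, `is_valid_spc_solution`, `adj`, `pairs`, `paths`, and `c` are illustrative choices, not taken from the paper.

```python
import heapq
from collections import Counter


def dijkstra(adj, source):
    """Shortest-path distances from source in a weighted digraph {u: {v: w}}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def is_valid_spc_solution(adj, pairs, paths, c):
    """Check that paths[i] is a shortest path for pairs[i] and that
    no node lies on more than c of the paths.

    Assumes integer weights, so path lengths can be compared exactly.
    """
    if len(paths) != len(pairs):
        return False
    load = Counter()
    for (s, t), path in zip(pairs, paths):
        if not path or path[0] != s or path[-1] != t:
            return False
        length = 0
        for u, v in zip(path, path[1:]):
            if v not in adj.get(u, {}):
                return False  # path uses a non-existent edge
            length += adj[u][v]
        if length != dijkstra(adj, s).get(t):
            return False  # path is not a shortest path
        load.update(set(path))  # each path counts once per node
    return all(count <= c for count in load.values())


# Hypothetical example: two shortest a-c paths sharing only their endpoints.
adj = {"a": {"b": 1, "d": 1, "c": 5}, "b": {"c": 1}, "d": {"c": 1}, "c": {}}
pairs = [("a", "c"), ("a", "c")]
paths = [["a", "b", "c"], ["a", "d", "c"]]
print(is_valid_spc_solution(adj, pairs, paths, 2))
print(is_valid_spc_solution(adj, pairs, paths, 1))
```

With congestion 2 the two paths are accepted; with congestion 1 they are rejected because both use the endpoints `a` and `c`. Finding such paths, rather than checking them, is the hard part addressed by the abstract.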
Using machine learning to predict pathogenicity of genomic variants throughout the human genome
An estimated 6,000 or more diseases are caused by changes in the genome. There are many causes: a genomic variant can stop the translation of a protein, disrupt gene regulation, or favor splicing of the mRNA into a different isoform. All of these processes must be examined to identify the variant that matches the described phenotype. Variant effect models automate this process. Using machine learning and annotations from various sources, these models score genomic variants with respect to their pathogenicity.
Developing a variant effect model requires a series of steps: annotation of the training data, feature selection, training of different models, and selection of a model. Here I present a general workflow for this process. It makes it possible to configure the process, edit model features, and test different annotations. The workflow also covers hyperparameter optimization, validation, and finally application of the model through genome-wide computation of variant scores.
The workflow is used in the development of Combined Annotation Dependent Depletion (CADD), a variant effect model for genome-wide scoring of SNVs and InDels. By establishing the first variant effect model for the human reference genome GRCh38, I demonstrate the resulting ability to take up annotations and train new models. I also show how deep learning scores used as features in a CADD model improve the prediction of RNA splicing. In addition, variant effect models are developed from a new training data set based on allele frequency.
These results show that the developed workflow is a scalable and flexible way to develop variant effect models. All resulting scores are freely available at cadd.gs.washington.edu and cadd.bihealth.org.
More than 6,000 diseases are estimated to be caused by genomic variants. This can happen in many possible ways: a variant may stop the translation of a protein, interfere with gene regulation, or alter splicing of the transcribed mRNA into an unwanted isoform. It is necessary to investigate all of these processes in order to evaluate which variant may be causal for the deleterious phenotype. Variant effect scores are a great help in this regard. Implemented as machine learning classifiers, they integrate annotations from different resources to rank genomic variants in terms of pathogenicity.
Developing a variant effect score requires multiple steps: annotation of the training data, feature selection, model training, benchmarking, and finally deployment for the model's application. Here, I present a generalized workflow of this process. It makes it simple to configure how information is converted into model features, enabling the rapid exploration of different annotations. The workflow further implements hyperparameter optimization, model validation and ultimately deployment of a selected model via genome-wide scoring of genomic variants.
The workflow is applied to train Combined Annotation Dependent Depletion (CADD), a variant effect model that scores SNVs and InDels genome-wide. I show that the workflow can be quickly adapted to novel annotations by porting CADD to the genome reference GRCh38. Further, I demonstrate the integration of deep neural network scores as features into a new CADD model, improving the annotation of RNA splicing events. Finally, I apply the workflow to train multiple variant effect models from training data based on variants selected by allele frequency.
In conclusion, the developed workflow presents a flexible and scalable method to train variant effect scores. All software and developed scores are freely available from cadd.gs.washington.edu and cadd.bihealth.org.
Graph-based Algorithm Unfolding for Energy-aware Power Allocation in Wireless Networks
We develop a novel graph-based trainable framework to maximize the weighted
sum energy efficiency (WSEE) for power allocation in wireless communication
networks. To address the non-convex nature of the problem, the proposed method
consists of modular structures inspired by a classical iterative suboptimal
approach and enhanced with learnable components. More precisely, we propose a
deep unfolding of the successive concave approximation (SCA) method. In our
unfolded SCA (USCA) framework, the originally preset parameters are now
learnable via graph convolutional neural networks (GCNs) that directly exploit
multi-user channel state information as the underlying graph adjacency matrix.
We show the permutation equivariance of the proposed architecture, which is a
desirable property for models applied to wireless network data. The USCA
framework is trained through a stochastic gradient descent approach using a
progressive training strategy. The unsupervised loss is carefully devised to
feature the monotonic property of the objective under maximum power
constraints. Comprehensive numerical results demonstrate its generalizability
across different network topologies of varying size, density, and channel
distribution. Thorough comparisons illustrate the improved performance and
robustness of USCA over state-of-the-art benchmarks.
Comment: Published in IEEE Transactions on Wireless Communications
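Permutation equivariance, mentioned in the abstract above, can be illustrated numerically. The sketch below is not the USCA architecture: it is a single generic graph-filter layer with a ReLU, written in plain Python, and the adjacency entries, node features, and weights are made-up values standing in for channel state information. Relabeling the nodes before the layer should give the same result as relabeling its output.

```python
import math


def graph_layer(adj, x, w_self, w_neigh):
    """One graph-filter layer: ReLU(w_self * x_i + w_neigh * sum_j adj[i][j] * x_j)."""
    n = len(x)
    out = []
    for i in range(n):
        aggregated = sum(adj[i][j] * x[j] for j in range(n))
        out.append(max(0.0, w_self * x[i] + w_neigh * aggregated))
    return out


def permute(adj, x, perm):
    """Relabel nodes: new node i is old node perm[i]."""
    n = len(x)
    x_p = [x[perm[i]] for i in range(n)]
    adj_p = [[adj[perm[i]][perm[j]] for j in range(n)] for i in range(n)]
    return adj_p, x_p


# Hypothetical 3-node example (values are illustrative only).
adj = [[0.0, 0.8, 0.1],
       [0.3, 0.0, 0.5],
       [0.2, 0.6, 0.0]]
x = [1.0, -2.0, 0.5]
perm = [2, 0, 1]

y = graph_layer(adj, x, 0.7, -0.4)
adj_p, x_p = permute(adj, x, perm)
y_p = graph_layer(adj_p, x_p, 0.7, -0.4)

# Equivariance: the layer applied to permuted inputs equals the permuted output.
expected = [y[perm[i]] for i in range(len(y))]
is_equivariant = all(
    math.isclose(a, b, abs_tol=1e-12) for a, b in zip(y_p, expected)
)
print(is_equivariant)
```

The comparison uses a tolerance because the neighbor sum is accumulated in a different order after relabeling, which can change the last floating-point bits.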
Rank-based linkage I: triplet comparisons and oriented simplicial complexes
Rank-based linkage is a new tool for summarizing a collection $S$ of objects
according to their relationships. These objects are not mapped to vectors, and
``similarity'' between objects need be neither numerical nor symmetrical. All
an object needs to do is rank nearby objects by similarity to itself, using a
Comparator which is transitive, but need not be consistent with any metric on
the whole set. Call this a ranking system on $S$. Rank-based linkage is applied
to the $K$-nearest neighbor digraph derived from a ranking system. Computations
occur on a 2-dimensional abstract oriented simplicial complex whose faces are
among the points, edges, and triangles of the line graph of the undirected
$K$-nearest neighbor graph on $S$. In $|S| K^2$ steps it builds an
edge-weighted linkage graph $(S, \mathcal{L}, \sigma)$ where
$\sigma(\{x, y\})$ is called the in-sway between objects $x$ and $y$. Take
$\mathcal{L}_t$ to be the links whose in-sway is at least $t$, and partition
$S$ into components of the graph $(S, \mathcal{L}_t)$, for varying $t$.
Rank-based linkage is a
functor from a category of out-ordered digraphs to a category of partitioned
sets, with the practical consequence that augmenting the set of objects in a
rank-respectful way gives a fresh clustering which does not ``rip apart'' the
previous one. The same holds for single linkage clustering in the metric space
context, but not for typical optimization-based methods. Open combinatorial
problems are presented in the last section.
Comment: 37 pages, 12 figures
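The final partitioning step described above (keep the links whose in-sway meets a threshold, then take connected components) can be sketched with a union-find structure. This is a minimal illustration under stated assumptions: the in-sway weights are made-up numbers supplied directly, computing them from a ranking system is not shown, and the function name `components_at_threshold` is hypothetical.

```python
def components_at_threshold(objects, in_sway, t):
    """Partition objects into connected components of the graph that keeps
    only links whose in-sway weight is at least t.

    in_sway maps an unordered pair, given as a tuple (x, y), to its weight.
    """
    parent = {x: x for x in objects}

    def find(x):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (x, y), weight in in_sway.items():
        if weight >= t:
            parent[find(x)] = find(y)  # merge the two components

    groups = {}
    for x in objects:
        groups.setdefault(find(x), set()).add(x)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical linkage graph on four objects.
objects = ["a", "b", "c", "d"]
in_sway = {("a", "b"): 5, ("b", "c"): 2, ("c", "d"): 4}

for t in (1, 3, 6):
    print(t, components_at_threshold(objects, in_sway, t))
```

Raising the threshold can only split components, never merge them, so the partitions for varying thresholds are nested, as in single linkage clustering.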
cglasso: An R Package for Conditional Graphical Lasso Inference with Censored and Missing Values
Sparse graphical models have revolutionized multivariate inference. With the advent of high-dimensional multivariate data in many applied fields, these methods are able to detect a much lower-dimensional structure, often represented via a sparse conditional independence graph. There have been numerous extensions of such methods in the past decade. Many practical applications have additional covariates or suffer from missing or censored data. Despite the development of these extensions of sparse inference methods for graphical models, there have so far been no implementations for, e.g., conditional graphical models. Here we present the general-purpose package cglasso for estimating sparse conditional Gaussian graphical models with potentially missing or censored data. The method employs an efficient expectation-maximization estimation of an ℓ1-penalized likelihood via a block-coordinate descent algorithm. The package has a user-friendly data manipulation interface. It estimates a solution path and includes various automatic selection algorithms for the two ℓ1 tuning parameters, associated with the sparse precision matrix and sparse regression coefficients, respectively. The package pays particular attention to the visualization of the results, both by means of marginal tables and figures, and of the inferred conditional independence graphs. This package provides a unique and computationally efficient implementation of a conditional Gaussian graphical model that is able to deal with the additional complications of missing and censored data. As such it constitutes an important contribution for empirical scientists wishing to detect sparse structures in high-dimensional data.
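As background for the ℓ1-penalized, coordinate-descent estimation mentioned above, the sketch below shows the soft-thresholding operator, the standard closed-form update for a single coordinate under an ℓ1 penalty. It is written in Python for illustration and is not the cglasso package's R interface or its internal algorithm; the function names and the one-coefficient example are assumptions made here.

```python
def soft_threshold(z, lam):
    """Soft-thresholding: shrink z toward zero by lam, and set it to zero
    when its magnitude does not exceed lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_single_coefficient(x, y, lam):
    """Minimizer b of 0.5 * sum((y_i - b * x_i)**2) + lam * abs(b).

    This is the update a coordinate-descent lasso solver applies to one
    coefficient at a time (here with no other coefficients present).
    """
    xy = sum(xi * yi for xi, yi in zip(x, y))
    xx = sum(xi * xi for xi in x)
    return soft_threshold(xy, lam) / xx


# Hypothetical data where y = 2 * x exactly.
x = [1.0, 2.0]
y = [2.0, 4.0]
for lam in (0.0, 5.0, 20.0):
    print(lam, lasso_single_coefficient(x, y, lam))
```

A larger penalty shrinks the estimate and eventually sets it exactly to zero, which is the mechanism that produces sparse coefficient and precision-matrix estimates.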
Dielectric-barrier discharge plasma actuators for turbulent friction-drag manipulation via spanwise oscillations
A plasma actuator is driven in unsteady operating modes to generate near-wall
fluid oscillations. The aim is to mimic spanwise oscillating walls in order to
reduce turbulent friction drag. Because the actuator has no moving parts, it
could serve as a non-mechanical substitute for the oscillating wall. The
combination of operating mode and underlying electrode arrangement is a
novelty that improves the spanwise homogeneity of the flow of such virtual
wall oscillations. The mechanical characterization is carried out with a
planar field measurement technique to reveal both the induced flow topologies
and the effects of body force and "virtual wall velocity", i.e. the response
of the fluid. From this, a universal diagram in terms of actuator-specific
parameters is derived for evaluating and optimizing the performance of the
actuator. Since the computed body force reflects the nature of the force
application well, it can serve as a model for improved numerical simulations
of the actuation. Furthermore, a new procedure is provided for determining the
electrical power of actuators with multiple high-voltage electrodes, which
makes it possible to estimate the potential net gain in active control
scenarios. Finally, the immediate effect of the oscillatory forcing on
friction drag is investigated in the cross-plane of a fully developed
turbulent channel flow using a stereoscopic field measurement technique.
Essentially, the flow remains in a developing state and experiences an
increase in friction drag on the actuator, while the drag decreases
downstream of the actuator.
Limit theorems for non-Markovian and fractional processes
This thesis examines various non-Markovian and fractional processes---rough volatility models, stochastic Volterra equations, Wiener chaos expansions---through the prism of asymptotic analysis.
Stochastic Volterra systems serve as a conducive framework encompassing most rough volatility models used in mathematical finance. In Chapter 2, we provide a unified treatment of pathwise large and moderate deviations principles for a general class of multidimensional stochastic Volterra equations with singular kernels, not necessarily of convolution form. Our methodology is based on the weak convergence approach by Budhiraja, Dupuis and Ellis.
This powerful approach also enables us to investigate the pathwise large deviations of families of white noise functionals characterised by their Wiener chaos expansion.
In Chapter 3, we provide sufficient conditions for the large deviations principle to hold in path space, thereby revisiting a problem left open by Pérez-Abreu (1993). Hinging on analysis on Wiener space, the proof involves describing, controlling and identifying the limit of perturbed multiple stochastic integrals.
In Chapter 4, we come back to mathematical finance via the route of Malliavin calculus. We present explicit small-time formulae for the at-the-money implied volatility, skew and curvature in a large class of models, including rough volatility models and their multi-factor versions. Our general setup encompasses both European options on a stock and VIX options. In particular, we develop a detailed analysis of the two-factor rough Bergomi model.
Finally, in Chapter 5, we consider the large-time behaviour of affine stochastic Volterra equations, an under-developed area in the absence of Markovianity.
We leverage a measure-valued Markovian lift introduced by Cuchiero and Teichmann and the associated notion of generalised Feller property.
This setting allows us to prove the existence of an invariant measure for the lift and hence of a stationary distribution for the affine Volterra process, featuring in the rough Heston model.
Open Access
Identification of patterns for space-time event networks
This paper provides new tools for analyzing spatio-temporal event networks. We build time series of directed event networks for a set of spatial distances and, based on scan statistics, choose the spatial distance that generates the strongest change in event network connections. In addition, we propose an empirical random network event generator to detect significant motifs over time. This generator preserves the spatial configuration but randomizes the order in which events occur. To prevent the large number of links from masking the count of motifs, we propose using standardized counts of motifs at each time slot. Our methodology is able to detect the interaction radius in space, build time series of networks, and describe changes in their topology over time by identifying different types of motifs, which allows for the understanding of the spatio-temporal dynamics of the phenomena. We illustrate our methodology by analyzing thefts that occurred in Medellín (Colombia) between the years 2003 and 2015.
Work supported by Red de Violencia y Criminalidad - Universidad Nacional Abierta y a Distancia UNAD, Bogotá Colombia and Universidad Nacional de Colombia sede Bogotá.
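The ideas above (a directed event network for a given spatial distance, a random generator that keeps locations but shuffles the order of events, and standardized motif counts) can be sketched as follows. This is a simplified illustration, not the paper's procedure: it uses a single motif type (a directed chain of three events), omits time slots and the scan-statistic choice of distance, and all function names and the example coordinates are assumptions made here.

```python
import math
import random


def build_event_network(events, radius):
    """events: list of (x, y, time). Add a directed edge i -> j when event i
    occurred strictly before event j and the two lie within radius."""
    edges = set()
    for i, (xi, yi, ti) in enumerate(events):
        for j, (xj, yj, tj) in enumerate(events):
            if ti < tj and math.hypot(xi - xj, yi - yj) <= radius:
                edges.add((i, j))
    return edges


def count_chains(edges):
    """Count directed chains i -> j -> k on three distinct events."""
    successors = {}
    for i, j in edges:
        successors.setdefault(i, set()).add(j)
    return sum(1 for i, j in edges for k in successors.get(j, ()) if k != i)


def standardized_count(events, radius, n_random=200, seed=0):
    """Standardize the observed chain count against randomized networks that
    keep the locations but shuffle the occurrence times."""
    rng = random.Random(seed)
    observed = count_chains(build_event_network(events, radius))
    coords = [(x, y) for x, y, _ in events]
    times = [t for _, _, t in events]
    null_counts = []
    for _ in range(n_random):
        shuffled = times[:]
        rng.shuffle(shuffled)
        randomized = [(x, y, t) for (x, y), t in zip(coords, shuffled)]
        null_counts.append(count_chains(build_event_network(randomized, radius)))
    mean = sum(null_counts) / len(null_counts)
    var = sum((c - mean) ** 2 for c in null_counts) / len(null_counts)
    return (observed - mean) / math.sqrt(var) if var > 0 else 0.0


# Hypothetical events: three nearby events in sequence and one far away.
events = [(0, 0, 1), (1, 0, 2), (2, 0, 3), (10, 10, 4)]
edges = build_event_network(events, radius=1.5)
print(sorted(edges))
print(count_chains(edges))
print(standardized_count(events, radius=1.5))
```

Repeating the construction over a grid of radii and over successive time slots would give the time series of networks described in the abstract.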