Machine learning in solar physics
The application of machine learning in solar physics has the potential to greatly enhance our understanding of the complex processes that take place in the atmosphere of the Sun. Using techniques such as deep learning, we are now in a position to analyze large amounts of data from solar observations and identify patterns and trends that may not have been apparent with traditional methods. This can improve our understanding of explosive events like solar flares, which can strongly affect the Earth's environment, making the prediction of such hazardous events crucial for our technological society. Machine learning can also deepen our understanding of the inner workings of the Sun itself by allowing us to dig deeper into the data and to propose more complex models to explain them. Additionally, machine learning can help automate the analysis of solar data, reducing the need for manual labor and increasing the efficiency of research in this field.
Comment: 100 pages, 13 figures, 286 references, accepted for publication as a Living Review in Solar Physics (LRSP)
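As a flavor of the kind of pipeline such a review covers, below is a minimal sketch of flare prediction cast as binary classification over active-region magnetic summary features. The synthetic data, feature count, and model choice are illustrative assumptions, not taken from the review.

```python
# Hedged sketch: flare prediction from active-region magnetic features.
# The data here are synthetic stand-ins for SHARP-like summary features
# (total unsigned flux, current helicity, ...), not real observations.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))                     # 8 hypothetical features
y = (X[:, 0] + 0.5 * X[:, 1]                       # imbalanced flare labels
     + rng.normal(scale=0.5, size=1000) > 1.5).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("ROC AUC:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
```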
Beam scanning by liquid-crystal biasing in a modified SIW structure
A fixed-frequency beam-scanning 1D antenna based on Liquid Crystals (LCs) is designed for application in 2D scanning with lateral alignment. The 2D array environment requires full decoupling of adjacent 1D antennas, which often conflicts with the LC requirement of DC biasing; the proposed design accommodates both. The LC medium is placed inside a Substrate Integrated Waveguide (SIW), modified to work as a Groove Gap Waveguide with radiating slots etched on the upper broad wall, so that it radiates as a Leaky-Wave Antenna (LWA). This allows effective application of the DC bias voltage needed for tuning the LCs while the RF field remains laterally confined, making it possible to place several antennas in parallel and achieve 2D beam scanning. The design is validated by simulation employing the actual properties of a commercial LC medium.
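To make the scanning mechanism concrete, the sketch below evaluates the textbook leaky-wave relation sin(theta) ~ beta/k0 for a dielectric-filled guide as the LC permittivity is tuned by the bias. The frequency, guide width, and permittivity range are assumed values for illustration, not the paper's design parameters.

```python
# Hedged sketch: fixed-frequency beam scanning as the LC permittivity is
# tuned. Dimensions and the LC tuning range are assumptions, not the
# paper's design; the dispersion is the TE10-like textbook relation.
import numpy as np

c = 3e8
f = 28e9                                   # assumed operating frequency
k0 = 2 * np.pi * f / c
a = 3.6e-3                                 # assumed effective guide width
for eps_r in np.linspace(2.4, 3.2, 5):     # plausible commercial-LC range
    beta = np.sqrt(eps_r * k0**2 - (np.pi / a)**2)   # guided phase constant
    theta = np.degrees(np.arcsin(min(beta / k0, 1.0)))
    print(f"eps_r = {eps_r:.2f} -> beam angle ~ {theta:.1f} deg")
```

With these assumed numbers the beam sweeps from roughly 25 to 83 degrees as eps_r goes from 2.4 to 3.2, which is the fixed-frequency scanning effect the abstract describes.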
Using machine learning to predict pathogenicity of genomic variants throughout the human genome
More than 6,000 diseases are estimated to be caused by genomic variants. This can happen in many ways: a variant may stop the translation of a protein, interfere with gene regulation, or alter splicing of the transcribed mRNA into an unwanted isoform. All of these processes must be investigated to evaluate which variant may be causal for the deleterious phenotype. Variant effect scores are a great help in this regard: implemented as machine learning classifiers, they integrate annotations from different resources to rank genomic variants in terms of pathogenicity.
Developing a variant effect score requires multiple steps: annotation of the training data, feature selection, model training, benchmarking, and finally deployment for the model's application. Here, I present a generalized workflow of this process. It makes it simple to configure how information is converted into model features, enabling the rapid exploration of different annotations. The workflow further implements hyperparameter optimization, model validation and ultimately deployment of a selected model via genome-wide scoring of genomic variants.
The workflow is applied to train Combined Annotation Dependent Depletion (CADD), a variant effect model that scores SNVs and InDels genome-wide. I show that the workflow can be quickly adapted to novel annotations by porting CADD to the genome reference GRCh38. Further, I demonstrate the integration of deep neural network scores as features into a new CADD model, improving the annotation of RNA splicing events. Finally, I apply the workflow to train multiple variant effect models from training data based on variants selected by allele frequency.
In conclusion, the developed workflow presents a flexible and scalable method to train variant effect scores. All software and developed scores are freely available from cadd.gs.washington.edu and cadd.bihealth.org.
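A minimal sketch of the workflow's shape (annotate, build features, tune, then score) using synthetic annotations and a logistic-regression stand-in. The feature names and proxy-label construction are simplified assumptions, not the actual CADD features or pipeline.

```python
# Hedged sketch of the workflow: annotations -> features -> model selection
# -> genome-wide scores. Features and data are illustrative placeholders.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(1)
variants = pd.DataFrame({                      # hypothetical per-variant annotations
    "conservation": rng.normal(size=5000),
    "splice_score": rng.uniform(size=5000),    # e.g., a deep-learning feature
    "distance_tss": rng.exponential(size=5000),
})
# Stand-in for CADD-style proxy labels (derived vs. simulated variants)
labels = (0.8 * variants["conservation"] + variants["splice_score"]
          + rng.normal(scale=0.8, size=5000) > 1.0).astype(int)

search = GridSearchCV(LogisticRegression(max_iter=1000),
                      {"C": [0.01, 0.1, 1.0]}, cv=5, scoring="roc_auc")
search.fit(variants, labels)                   # hyperparameter optimization
scores = search.best_estimator_.predict_proba(variants)[:, 1]  # scoring step
print(search.best_params_, scores[:5])
```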
Spectral methods for solving elliptic PDEs on unknown manifolds
In this paper, we propose a mesh-free numerical method for solving elliptic
PDEs on unknown manifolds, identified with randomly sampled point cloud data.
The PDE solver is formulated as a spectral method where the test function space
is the span of the leading eigenfunctions of the Laplacian operator, which are
approximated from the point cloud data. While the framework is flexible for any
test function space, we will consider the eigensolutions of a weighted
Laplacian obtained from a symmetric Radial Basis Function (RBF) method induced
by a weak approximation of a weighted Laplacian on an appropriate Hilbert
space. In particular, we consider a test function space that encodes the geometry
of the data yet does not require us to identify and use the sampling density of
the point cloud. To attain a more accurate approximation of the expansion
coefficients, we adopt a second-order tangent space estimation method to
improve the RBF interpolation accuracy in estimating the tangential
derivatives. This spectral framework allows us to efficiently solve the PDE
many times subject to different parameters, which reduces the computational
cost in the related inverse problem applications. In a well-posed elliptic PDE
setting with randomly sampled point cloud data, we provide a theoretical
analysis to demonstrate the convergence of the proposed solver as the sample
size increases. We also report some numerical studies that show the convergence
of the spectral solver on simple manifolds and unknown, rough surfaces. Our
numerical results suggest that the proposed method is more accurate than a
graph Laplacian-based solver on smooth manifolds. On rough manifolds, these two
approaches are comparable. Due to the flexibility of the framework, we
empirically found improved accuracy in both smoothed and unsmoothed Stanford bunny domains by blending the graph-Laplacian eigensolutions and the RBF interpolator.
Comment: 8 figures
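A minimal sketch of the spectral mechanics on a known manifold (the unit circle), using the graph-Laplacian baseline mentioned above in place of the paper's RBF-approximated weighted Laplacian: solve (c - Laplacian)u = f by expanding in the leading eigenfunctions, a_k = f_k / (c + lambda_k). Bandwidth and normalization here are illustrative only; proper scaling would be needed for quantitative convergence.

```python
# Hedged sketch of the spectral solve on randomly sampled circle data.
import numpy as np

rng = np.random.default_rng(2)
theta = np.sort(rng.uniform(0, 2 * np.pi, 400))
pts = np.column_stack([np.cos(theta), np.sin(theta)])   # point cloud

# Graph Laplacian from a Gaussian kernel (bandwidth chosen by eye; the
# normalization needed for quantitative accuracy is omitted here)
d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
W = np.exp(-d2 / 0.01)
L = np.diag(W.sum(axis=1)) - W

lam, phi = np.linalg.eigh(L)          # ascending: smoothest modes first
K, c = 30, 1.0
f = np.cos(2 * theta)                 # right-hand side of (c - Laplacian) u = f
coeff = phi[:, :K].T @ f / (c + lam[:K])   # a_k = f_k / (c + lambda_k)
u = phi[:, :K] @ coeff                # solution in the span of leading modes
```

Because the eigendecomposition is computed once, re-solving for a new right-hand side or parameter c costs only the cheap coefficient update, which is the efficiency the abstract highlights for parametric and inverse problems.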
Network inference combining mutual information rate and statistical tests
In this paper, we present a method that combines information-theoretical and statistical approaches to infer connectivity in complex networks using time-series data. The method is based on estimations of the Mutual Information Rate for pairs of time series and on statistical significance tests for connectivity acceptance using the false discovery rate method for multiple hypothesis testing. We provide the mathematical background on the Mutual Information Rate and discuss the statistical significance tests and the false discovery rate. We then present results for correlated normal-variates data, coupled circle and coupled logistic maps, coupled Lorenz systems, and coupled stochastic Kuramoto phase oscillators. Following up, we study the effect of noise on the presented methodology in networks of coupled stochastic Kuramoto phase oscillators, and of the degree of coupling heterogeneity in networks of coupled circle maps. We show that the method can infer the correct number and pairs of connected nodes by means of receiver operating characteristic curves. In the more realistic case of stochastic data, we demonstrate its ability to infer the structure of the initial connectivity matrices. The method is also shown to recover the initial connectivity matrices for dynamics on the nodes of Erdős-Rényi and small-world networks with varying coupling heterogeneity in their connections. The highlight of the proposed methodology is its ability to infer the underlying network connectivity based solely on the recorded datasets.
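A hedged sketch of the pipeline's shape: plain binned mutual information stands in for the Mutual Information Rate estimator, a shuffle surrogate gives one-sided p-values, and Benjamini-Hochberg controls the false discovery rate. The estimator, surrogate count, and FDR level are illustrative choices.

```python
# Hedged sketch: MI + surrogate test + Benjamini-Hochberg link acceptance.
import numpy as np
from itertools import combinations

def mutual_info(x, y, bins=16):
    """Plug-in mutual information from a 2D histogram (nats)."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    px, py = pxy.sum(1, keepdims=True), pxy.sum(0, keepdims=True)
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(3)
T, n = 2000, 5
X = rng.normal(size=(T, n))
X[:, 1] += 0.8 * X[:, 0]             # one genuinely coupled pair: (0, 1)

pairs = list(combinations(range(n), 2))
pvals = []
for i, j in pairs:                   # shuffle surrogate -> one-sided p-value
    mi = mutual_info(X[:, i], X[:, j])
    null = [mutual_info(rng.permutation(X[:, i]), X[:, j]) for _ in range(200)]
    pvals.append((np.sum(np.array(null) >= mi) + 1) / 201)

# Benjamini-Hochberg acceptance at FDR level q
q, m = 0.05, len(pairs)
order = np.argsort(pvals)
below = np.nonzero(np.array(pvals)[order] <= q * np.arange(1, m + 1) / m)[0]
accepted = {pairs[k] for k in order[: below[-1] + 1]} if below.size else set()
print(accepted)                      # expect {(0, 1)}
```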
Factorized Fusion Shrinkage for Dynamic Relational Data
Modern data science applications often involve complex relational data with
dynamic structures. An abrupt change in such dynamic relational data is
typically observed in systems that undergo regime changes due to interventions.
In such a case, we consider a factorized fusion shrinkage model in which all
decomposed factors are dynamically shrunk towards group-wise fusion structures,
where the shrinkage is obtained by applying global-local shrinkage priors to
the successive differences of the row vectors of the factorized matrices. The
proposed priors enjoy many favorable properties in comparison and clustering of
the estimated dynamic latent factors. Comparing estimated latent factors
involves both adjacent and long-term comparisons, with the time range of
comparison considered as a variable. Under certain conditions, we demonstrate
that the posterior distribution attains the minimax optimal rate up to
logarithmic factors. In terms of computation, we present a structured
mean-field variational inference framework that balances optimal posterior
inference with computational scalability, exploiting both the dependence among
components and across time. The framework can accommodate a wide variety of
models, including dynamic matrix factorization, latent space models for
networks and low-rank tensors. The effectiveness of our methodology is
demonstrated through extensive simulations and real-world data analysis.
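To fix ideas, here is a schematic of the model class in notation suggested by the abstract, with the horseshoe as one concrete global-local prior on the successive differences of the factor rows; the paper's exact prior and parameterization may differ.

```latex
% Schematic of factorized fusion shrinkage; horseshoe is one concrete
% global-local prior, and the paper's exact choices may differ.
\[
Y_t = U_t V_t^\top + E_t, \qquad
u_{t,i} - u_{t-1,i} \mid \lambda_{t,i}, \tau
  \;\sim\; \mathcal{N}\!\big(0,\; \tau^2 \lambda_{t,i}^2\, I\big),
\]
\[
\lambda_{t,i} \sim \mathrm{C}^{+}(0,1) \ \text{(local)}, \qquad
\tau \sim \mathrm{C}^{+}(0,1) \ \text{(global)},
\]
% where u_{t,i} is the i-th row of U_t: a small tau^2 lambda_{t,i}^2 shrinks
% u_{t,i} toward its previous value, producing group-wise fusion over time.
```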
Learning strategies for improving neural networks for image segmentation under class imbalance
This thesis aims to improve convolutional neural networks (CNNs) for image segmentation under class imbalance, i.e., the problem of training on datasets whose class distributions are unequal. We particularly focus on medical image segmentation because of its imbalanced nature and clinical importance.
Based on our observations of model behaviour, we argue that CNNs cannot generalize well on imbalanced segmentation tasks, mainly for two counterintuitive reasons. CNNs are prone to overfitting the under-represented foreground classes, as they tend to memorize the regions of interest (ROIs) in the training data precisely because they are so rare. Moreover, CNNs can underfit the heterogeneous background classes, as it is difficult to learn from samples with such diverse and complex characteristics. These behaviours of CNNs are not limited to specific loss functions.
To address these limitations, we first propose novel asymmetric variants of popular loss functions and regularization techniques, explicitly designed to increase the variance of foreground samples and thereby counter overfitting under class imbalance. Secondly, we propose context label learning (CoLab) to tackle background underfitting by automatically decomposing the background class into several subclasses; this is achieved by optimizing an auxiliary task generator to produce context labels such that the main network achieves good ROI segmentation performance. We then propose a meta-learning-based automatic data augmentation framework that balances foreground and background samples to alleviate class imbalance. Specifically, we learn class-specific training-time data augmentation (TRA) and jointly optimize TRA with test-time data augmentation (TEA), effectively aligning the training and test data distributions for better generalization. Finally, we explore how to estimate model performance under domain shift when training on imbalanced datasets, proposing class-specific variants of existing confidence-based model evaluation methods that adapt separate parameters per class, enabling class-wise calibration to reduce model bias against the minority classes.
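As one representative instance of such an asymmetric loss (not the thesis's exact formulation), a Tversky-style soft overlap loss with beta > alpha penalizes false negatives on the rare foreground more heavily than false positives:

```python
# Illustrative asymmetric segmentation loss; a representative instance of
# the idea, not the thesis's proposed variant.
import numpy as np

def asymmetric_tversky_loss(pred, target, alpha=0.3, beta=0.7, eps=1e-7):
    """pred: foreground probabilities; target: binary labels (same shape)."""
    tp = (pred * target).sum()
    fp = (pred * (1 - target)).sum()   # false positives, weighted by alpha
    fn = ((1 - pred) * target).sum()   # false negatives, weighted by beta
    return 1.0 - (tp + eps) / (tp + alpha * fp + beta * fn + eps)

pred = np.array([0.9, 0.2, 0.7, 0.1])
target = np.array([1, 0, 1, 0])
print(asymmetric_tversky_loss(pred, target))   # ~0.19 for this toy case
```

Setting beta above alpha trades precision for recall on the foreground, which is usually the desirable direction when ROIs are rare.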
On the Principles of Evaluation for Natural Language Generation
Natural language processing is concerned with the ability of computers to understand natural language texts, which is, arguably, one of the major bottlenecks in the pursuit of the holy grail of general Artificial Intelligence. Given the unprecedented success of deep learning technology, the natural language processing community has focused almost entirely on practical applications, with state-of-the-art systems emerging and competing for human-parity performance at an ever-increasing pace. For that reason, fair and adequate evaluation and comparison, responsible for ensuring trustworthy, reproducible, and unbiased results, have long fascinated the scientific community, not only in natural language processing but also in other fields. A popular example is the ISO-9126 evaluation standard for software products, which outlines a wide range of evaluation concerns, such as cost, reliability, scalability, and security. The European project EAGLES-1996, an acclaimed extension of ISO-9126, set out the fundamental principles specifically for evaluating natural language technologies, which underpin succeeding methodologies in the evaluation of natural language.
Natural language processing encompasses an enormous range of applications, each with its own evaluation concerns, criteria and measures. This thesis cannot hope to be comprehensive but particularly addresses the evaluation in natural language generation (NLG), which touches on, arguably, one of the most human-like natural language applications. In this context, research on quantifying day-to-day progress with evaluation metrics lays the foundation of the fast-growing NLG community. However, previous works have failed to address high-quality metrics in multiple scenarios such as evaluating long texts and when human references are not available, and, more prominently, these studies are limited in scope, given the lack of a holistic view sketched for principled NLG evaluation.
In this thesis, we aim for a holistic view of NLG evaluation from three complementary perspectives, driven by the evaluation principles in EAGLES-1996: (i) high-quality evaluation metrics, (ii) rigorous comparison of NLG systems for properly tracking the progress, and (iii) understanding evaluation metrics. To this end, we identify the current state of challenges derived from the inherent characteristics of these perspectives, and then present novel metrics, rigorous comparison approaches, and explainability techniques for metrics to address the identified issues.
We hope that our work on evaluation metrics, system comparison and explainability for metrics inspires more research towards principled NLG evaluation, and contributes to fair and adequate evaluation and comparison in natural language processing.
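One standard tool for the rigorous-comparison perspective is a paired bootstrap test on per-example metric scores of two systems; this is a common technique offered here as an illustration, not necessarily the procedure developed in the thesis.

```python
# Hedged sketch: paired bootstrap significance test for comparing two NLG
# systems on per-example metric scores (synthetic scores for illustration).
import numpy as np

def paired_bootstrap(scores_a, scores_b, n_boot=2000, seed=0):
    """Fraction of resamples in which system A does not beat system B."""
    rng = np.random.default_rng(seed)
    diffs = np.asarray(scores_a) - np.asarray(scores_b)
    n = len(diffs)
    wins = sum(diffs[rng.integers(0, n, n)].mean() > 0 for _ in range(n_boot))
    return 1.0 - wins / n_boot      # small value -> A significantly better

rng = np.random.default_rng(1)
a = rng.normal(0.62, 0.1, 500)      # per-example metric scores, system A
b = rng.normal(0.60, 0.1, 500)      # per-example metric scores, system B
print(paired_bootstrap(a, b))
```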