Ciguatoxins
Ciguatoxins (CTXs), which are responsible for ciguatera fish poisoning (CFP), are liposoluble toxins produced by microalgae of the genera Gambierdiscus and Fukuyoa. This book presents 18 scientific papers that offer new information and scientific evidence on: (i) CTX occurrence in aquatic environments, with an emphasis on edible aquatic organisms; (ii) analysis methods for the determination of CTXs; (iii) advances in research on CTX-producing organisms; (iv) environmental factors involved in the presence of CTXs; and (v) the assessment of public health risks related to the presence of CTXs, as well as risk management and mitigation strategies.
Performance of Regularization for Sparse Convex Optimization
Despite widespread adoption in practice, guarantees for the LASSO and Group
LASSO are strikingly lacking in settings beyond statistical problems, and these
algorithms are usually considered heuristics in the context of sparse
convex optimization on deterministic inputs. We give the first recovery
guarantees for the Group LASSO for sparse convex optimization with
vector-valued features. We show that if a sufficiently large Group LASSO
regularization is applied when minimizing a strictly convex function, then
the minimizer is a sparse vector supported on vector-valued features with the
largest norm of the gradient. Thus, repeating this procedure selects
the same set of features as the Orthogonal Matching Pursuit algorithm, which
admits recovery guarantees for any function with restricted strong
convexity and smoothness via weak submodularity arguments. This answers open
questions of Tibshirani et al. and Yasuda et al. Our result is the first to
theoretically explain the empirical success of the Group LASSO for convex
functions under general input instances assuming only restricted strong
convexity and smoothness. Our result also generalizes provable guarantees for
the Sequential Attention algorithm, which is a feature selection algorithm
inspired by the attention mechanism proposed by Yasuda et al.
As an application of our result, we give new results for the column subset
selection problem, which is well-studied when the loss is the Frobenius norm or
other entrywise matrix losses. We give the first result for general loss
functions for this problem that requires only restricted strong convexity and
smoothness.
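The core mechanism here, a regularizer that zeroes out entire blocks of coordinates, can be made concrete with a short sketch. Below is a minimal proximal gradient implementation of the Group LASSO for a least-squares toy problem; the data, group layout, step size, and regularization strength are invented for illustration, and this is a generic stand-in rather than the paper's procedure.

```python
import numpy as np

def group_soft_threshold(v, thresh):
    # Prox of thresh * ||v||_2: shrink the whole block toward zero,
    # zeroing it out entirely if its norm falls below the threshold.
    norm = np.linalg.norm(v)
    return np.zeros_like(v) if norm <= thresh else (1.0 - thresh / norm) * v

def group_lasso(A, y, groups, lam, lr=0.1, iters=1000):
    # Minimize (1/2n) * ||Ax - y||^2 + lam * sum_g ||x_g||_2 by proximal gradient.
    n, p = A.shape
    x = np.zeros(p)
    for _ in range(iters):
        z = x - lr * (A.T @ (A @ x - y) / n)   # gradient step on the smooth part
        for g in groups:                        # prox acts block by block
            x[g] = group_soft_threshold(z[g], lr * lam)
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 9))
groups = [slice(0, 3), slice(3, 6), slice(6, 9)]   # three vector-valued features
x_true = np.zeros(9); x_true[3:6] = [1.0, -2.0, 0.5]
y = A @ x_true + 0.05 * rng.standard_normal(100)
print(np.round(group_lasso(A, y, groups, lam=0.1), 2))  # only the middle group survives
```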
Exclusive Group Lasso for Structured Variable Selection
A structured variable selection problem is considered in which the
covariates, divided into predefined groups, activate according to sparse
patterns with few nonzero entries per group. Capitalizing on the concept of
atomic norm, a composite norm can be properly designed to promote such
exclusive group sparsity patterns. The resulting norm lends itself to efficient
and flexible regularized optimization algorithms for support recovery, like the
proximal algorithm. Moreover, an active set algorithm is proposed that builds
the solution by successively including structure atoms into the estimated
support. It is also shown that such an algorithm can be tailored to match more
rigid structures than plain exclusive group sparsity. Asymptotic consistency
analysis (with both the number of parameters as well as the number of groups
growing with the observation size) establishes the effectiveness of the
proposed solution in terms of signed support recovery under conventional
assumptions. Finally, a set of numerical simulations further corroborates the
results.
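As a concrete, hedged illustration of the composite norm, the sketch below solves a toy regression with the exclusive group sparsity penalty, the sum over groups of squared within-group l1 norms, using the generic CVXPY modelling library; the data and regularization weight are made up, and this is a stand-in for, not a reproduction of, the paper's proximal and active set algorithms.

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
n, p = 60, 12
A = rng.standard_normal((n, p))
groups = [list(range(0, 4)), list(range(4, 8)), list(range(8, 12))]
# Ground truth with exclusive group sparsity: one nonzero entry per group.
x_true = np.zeros(p)
x_true[[0, 5, 10]] = [1.5, -2.0, 1.0]
y = A @ x_true + 0.1 * rng.standard_normal(n)

x = cp.Variable(p)
# Exclusive group lasso penalty: sum over groups of squared l1 norms,
# which spreads the support across groups (few nonzeros per group).
penalty = sum(cp.square(cp.norm1(x[g])) for g in groups)
loss = cp.sum_squares(A @ x - y) / (2 * n)
cp.Problem(cp.Minimize(loss + 0.05 * penalty)).solve()
print(np.round(x.value, 2))
```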
Flexible estimation of temporal point processes and graphs
Handling complex data types with spatial structures, temporal dependencies, or discrete values is generally a challenge in statistics and machine learning. In recent years, there has been an increasing need for methodological and theoretical work to analyse non-standard data types, for instance, data collected on protein structures, gene interactions, social networks, or physical sensors. In this thesis, I will propose a methodology and provide theoretical guarantees for analysing two general types of discrete data emerging from interactive phenomena, namely temporal point processes and graphs.
On the one hand, temporal point processes are stochastic processes used to model event data, i.e., data that comes as discrete points in time or space where some phenomenon occurs. Some of the most successful applications of these discrete processes include online messages, financial transactions, earthquake strikes, and neuronal spikes. The popularity of these processes notably comes from their ability to model unobserved interactions and dependencies between temporally and spatially distant events. However, statistical methods for point processes generally rely on estimating a latent, unobserved, stochastic intensity process. In this context, designing flexible models and consistent estimation methods is often a challenging task.
On the other hand, graphs are structures made of nodes (or agents) and edges (or links), where an edge represents an interaction or relationship between two nodes. Graphs are ubiquitous in modelling real-world social, transport, and mobility networks, where edges can correspond to virtual exchanges, physical connections between places, or migrations across geographical areas. Graphs are also used to represent correlations and lead-lag relationships between time series, and local dependence between random objects. Graphs are typical examples of non-Euclidean data, for which adequate distance measures, similarity functions, and generative models need to be formalised. In the deep learning community, graphs have become particularly popular within the field of geometric deep learning.
Structure and dependence can both be modelled by temporal point processes and graphs, although predominantly the former act on the temporal domain while the latter conceptualise spatial interactions. Nonetheless, some statistical models combine graphs and point processes in order to account for both spatial and temporal dependencies. For instance, temporal point processes have been used to model the birth times of edges and nodes in temporal graphs. Moreover, some multivariate point process models have a latent graph parameter governing the pairwise causal relationships between the components of the process. In this thesis, I will notably study such a model, called the Hawkes model, as well as graphs evolving in time.
This thesis aims at designing inference methods that provide flexibility in the contexts of temporal point processes and graphs. This manuscript is presented in an integrated format, with four main chapters and two appendices. Chapters 2 and 3 are dedicated to the study of Bayesian nonparametric inference methods in the generalised Hawkes point process model. While Chapter 2 provides theoretical guarantees for existing methods, Chapter 3 also proposes, analyses, and evaluates a novel variational Bayes methodology. The other main chapters introduce and study model-free inference approaches for two estimation problems on graphs, namely spectral methods for the signed graph clustering problem in Chapter 4, and a deep learning algorithm for the network change point detection task on temporal graphs in Chapter 5.
Additionally, Chapter 1 provides an introduction and background preliminaries on point processes and graphs. Chapter 6 concludes this thesis with a summary and critical reflection on the work in this manuscript, and proposals for future research. Finally, the appendices contain two supplementary papers. The first one, in Appendix A, initiated after the COVID-19 outbreak in March 2020, is an application of a discrete-time Hawkes model to COVID-related death counts during the first wave of the pandemic. The second work, in Appendix B, was conducted during an internship at Amazon Research in 2021, and proposes an explainability method for anomaly detection models acting on multivariate time series.
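To fix ideas on the Hawkes model studied in Chapters 2 and 3, here is a minimal simulation sketch of a univariate Hawkes process with an exponential kernel, using Ogata's thinning algorithm; the parameter values are arbitrary, and nothing here is specific to the thesis's Bayesian methodology.

```python
import numpy as np

def simulate_hawkes(mu, alpha, beta, T, seed=0):
    """Simulate a univariate Hawkes process on [0, T] with conditional
    intensity lambda(t) = mu + sum_{t_i < t} alpha * exp(-beta * (t - t_i)),
    using Ogata's thinning algorithm."""
    rng = np.random.default_rng(seed)
    t, events = 0.0, []
    while True:
        # The intensity decays between events, so its current value
        # upper-bounds lambda(s) for all s until the next accepted event.
        lam_bar = mu + sum(alpha * np.exp(-beta * (t - ti)) for ti in events)
        t += rng.exponential(1.0 / lam_bar)
        if t > T:
            break
        lam_t = mu + sum(alpha * np.exp(-beta * (t - ti)) for ti in events)
        if rng.uniform() < lam_t / lam_bar:   # accept with probability lambda(t)/lam_bar
            events.append(t)
    return np.array(events)

events = simulate_hawkes(mu=0.5, alpha=0.8, beta=1.2, T=100.0)
print(len(events), "events; the process is stationary since alpha/beta < 1")
```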
Examining university student satisfaction and barriers to taking online remote exams
Recent years have seen a surge in the popularity of online exams at universities, due to the greater convenience and flexibility they offer both students and institutions. Driven by the dearth of empirical data on distance learning students' satisfaction levels and the difficulties they face when taking online exams, a survey of 562 students at The Open University (UK) was conducted to gain insights into their experiences with this type of exam. Students reported satisfaction with the exam environment and the exams themselves, while work commitments and technical difficulties presented the greatest barriers. Gender, race, and disability were also associated with different levels of satisfaction and barriers. This study adds to the growing number of studies on online exams, demonstrating how this type of exam can still have a substantial effect even on students experienced with online learning systems and technologies.
Solar irradiance forecast from all-sky images using machine learning
The novel method presented here comprises techniques for cloud coverage percentage forecasts, cloud movement forecasting, and the subsequent prediction of the global horizontal irradiance (GHI) using all-sky images and machine learning techniques. Such models are employed to forecast the GHI, which is necessary for more accurate time series forecasts for photovoltaic systems, such as “island solutions” for power production, or for energy exchange, as in virtual power plants. All images were recorded by a hemispheric sky imager (HSI) at the Institute of Meteorology and Climatology (IMuK) of the Leibniz University Hannover, Hannover, Germany.
This thesis is composed of three parts. First, a model was developed to forecast the total cloud cover five minutes ahead by training an autoregressive neural network with backpropagation. The prediction results showed a reduction of both the Root Mean Square Error (RMSE) and the Mean Absolute Error (MAE) by approximately 30% compared to the reference persistence solar model for various cloud conditions. Second, a model was developed to predict the GHI up to one hour ahead by training a Levenberg-Marquardt backpropagation neural network. This novel method reduced both the RMSE and the MAE of the one-hour prediction by approximately 40% under various weather conditions. Third, for forecasting the cloud movement up to two minutes ahead, a high-resolution deep learning method using convolutional neural networks (CNNs) was created. Using real cloud shapes, produced by correcting hazy areas based on the green signal counts of the pixels, the cloud shapes predicted by the proposed algorithm were compared with those of the persistence solar model using the Sørensen-Dice similarity coefficient (SDC). The proposed method achieved a mean SDC of 94 ± 2.6% (mean ± standard deviation) for the first minutes, outperforming the persistence solar model with an SDC of 89 ± 3.8%. Thus, the proposed method may represent cloud shapes better than the persistence solar model. Finally, a Bonferroni correction was applied to the significance level of 0.05, and the difference between the SDC of the proposed method and that of the persistence solar model was statistically significant (p = 0.001).
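For reference, the Sørensen-Dice similarity coefficient used above compares two binary masks; a minimal sketch, with toy arrays rather than HSI-derived cloud shapes, is given below.

```python
import numpy as np

def dice_coefficient(pred, truth):
    # SDC = 2 |P ∩ T| / (|P| + |T|) for binary masks P (predicted) and T (true).
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    return 2.0 * intersection / (pred.sum() + truth.sum())

# Toy 4x4 cloud masks: 1 = cloud pixel, 0 = clear sky.
predicted = np.array([[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]])
observed  = np.array([[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]])
print(f"SDC = {dice_coefficient(predicted, observed):.2f}")  # 0.86
```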
The proposed methodologies may have broad application in the planning and management of PV power production, allowing more accurate forecasts of the GHI minutes ahead for targeting primary and secondary energy control reserves.
Machine learning approaches to genome-wide association studies
Genome-wide Association Studies (GWAS) are conducted to identify single nucleotide polymorphisms
(variants) associated with a phenotype within a specific population. Variants associated with diseases have a complex molecular aetiology through which they give rise to the disease phenotype. The genotyping data generated from study subjects is of high dimensionality, which is a challenge: the dataset has a large number of features and a relatively small sample size. Statistical testing is the standard approach applied to identify the variants that influence the phenotype of interest; however, the wide applicability and abilities of Machine Learning (ML) algorithms promise a better understanding of the effects of these variants. The aim of this work is to discuss the applications and future trends of ML algorithms in GWAS towards understanding the effects of population genetic variants. Algorithms such as classification, regression, ensemble methods, and neural networks have been applied to GWAS, and this work discusses them comprehensively, including their application areas. ML algorithms have been applied to the identification of significant single nucleotide polymorphisms (SNPs), disease risk assessment and prediction, and the detection of epistatic non-linear interactions, and have been integrated with other omics data sets. This comprehensive review highlights these areas of application and sheds light on the promise of integrating machine learning algorithms into the computational and statistical pipeline of genome-wide association studies. This will be beneficial for a better understanding of how variants affect disease biology and how the same variants can influence the risk of developing a particular phenotype.
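As an illustrative, hedged sketch of one application named above, disease risk prediction from genotypes, the example below trains a random forest on a simulated SNP matrix with the "large p, small n" shape typical of GWAS data; the genotype coding, causal SNPs, and effect sizes are all synthetic assumptions, not results from any study.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
n_samples, n_snps = 200, 5000                       # many features, few samples
X = rng.integers(0, 3, size=(n_samples, n_snps))    # genotypes coded 0/1/2

# Hypothetical causal SNPs with additive effects on the log-odds of disease.
causal_idx = [10, 250, 4000]
logit = X[:, causal_idx] @ np.array([0.8, -0.6, 0.5]) - 0.7
y = (rng.uniform(size=n_samples) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

clf = RandomForestClassifier(n_estimators=300, random_state=0)
auc = cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean()
print(f"cross-validated AUC: {auc:.2f}")
# clf.fit(X, y).feature_importances_ can then rank SNPs as candidate associations.
```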
Understanding Deep Learning Optimization via Benchmarking and Debugging
The central paradigm of machine learning (ML) is the idea that computers can learn the strategies needed to solve a task without being explicitly programmed to do so. The hope is that, given data, computers can recognize underlying patterns and figure out how to perform tasks without extensive human oversight. To achieve this, many machine learning problems are framed as minimizing a loss function, which makes optimization methods a core part of training ML models. Machine learning, and in particular deep learning, is often perceived as a cutting-edge technology; the underlying optimization algorithms, however, tend to resemble rather simplistic, even archaic methods. Crucially, they rely on extensive human intervention to successfully train modern neural networks. One reason for this tedious, finicky, and lengthy training process lies in our insufficient understanding of optimization methods in the challenging deep learning setting. As a result, training neural nets, to this day, has the reputation of being more of an art form than a science and requires a level of human assistance that runs counter to the core principle of ML.
Although hundreds of optimization algorithms for deep learning have been proposed, there is no widely agreed-upon protocol for evaluating their performance. Without a standardized and independent evaluation protocol, it is difficult to reliably demonstrate the usefulness of novel methods. In this thesis, we present strategies for quantitatively and reproducibly comparing deep learning optimizers in a meaningful way. This protocol considers the unique challenges of deep learning, such as the inherent stochasticity or the crucial distinction between learning and pure optimization. It is formalized and automated in the Python package DeepOBS and allows fairer, faster, and more convincing empirical comparisons of deep learning optimizers. Based on this benchmarking protocol, we compare fifteen popular deep learning optimizers to gain insight into the field's current state. To provide evidence-backed heuristics for choosing among the growing list of optimization methods, we extensively evaluate them with roughly 50,000 training runs. Our benchmark indicates that the comparably traditional Adam optimizer remains a strong but not dominating contender and that newer methods fail to consistently outperform it.
In addition to the optimizer, other causes can impede neural network training, such as inefficient model architectures or hyperparameters. Traditional performance metrics, such as training loss or validation accuracy, can show whether a model is learning or not, but not why. To provide this understanding and a glimpse into the black box of neural networks, we developed Cockpit, a debugging tool specifically for deep learning. It combines novel and proven observables into a live monitoring tool for practitioners. Among other findings, Cockpit reveals that well-tuned training runs consistently overshoot the local minimum, at least for significant portions of the training. The use of thorough benchmarking experiments and tailored debugging tools improves our understanding of neural network training. In the absence of theoretical insights, these empirical results and practical tools are essential for guiding practitioners. More importantly, our results show that there is a need and a clear path for fundamentally different optimization methods to make deep learning more accessible, robust, and resource-efficient.
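The following sketch is not the DeepOBS API; it is a minimal stand-in for the benchmarking idea the thesis formalizes: repeat each optimizer's training run over several random seeds and compare the mean and spread of the final loss, so that the inherent stochasticity of deep learning is accounted for. Model, task, and hyperparameters are placeholders.

```python
import statistics
import torch

def run(optimizer_name, seed, steps=500):
    # One training run of a tiny regression net; returns the final full-batch loss.
    torch.manual_seed(seed)                      # controls data, init, and batching
    X = torch.randn(256, 10)
    y = X @ torch.randn(10, 1) + 0.1 * torch.randn(256, 1)
    model = torch.nn.Sequential(
        torch.nn.Linear(10, 32), torch.nn.ReLU(), torch.nn.Linear(32, 1))
    opt = {"SGD": torch.optim.SGD(model.parameters(), lr=1e-2),
           "Adam": torch.optim.Adam(model.parameters(), lr=1e-3)}[optimizer_name]
    for _ in range(steps):
        idx = torch.randint(0, 256, (32,))       # mini-batch sampling = stochasticity
        loss = torch.nn.functional.mse_loss(model(X[idx]), y[idx])
        opt.zero_grad(); loss.backward(); opt.step()
    return torch.nn.functional.mse_loss(model(X), y).item()

for name in ["SGD", "Adam"]:
    finals = [run(name, seed) for seed in range(5)]   # repeat over seeds
    print(f"{name}: {statistics.mean(finals):.4f} +/- {statistics.stdev(finals):.4f}")
```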