151 research outputs found

    Flexible estimation of temporal point processes and graphs

    Handling complex data types with spatial structures, temporal dependencies, or discrete values is generally a challenge in statistics and machine learning. In recent years, there has been an increasing need for methodological and theoretical work to analyse non-standard data types, for instance, data collected on protein structures, gene interactions, social networks or physical sensors. In this thesis, I will propose a methodology and provide theoretical guarantees for analysing two general types of discrete data emerging from interactive phenomena, namely temporal point processes and graphs. On the one hand, temporal point processes are stochastic processes used to model event data, i.e., data that comes as discrete points in time or space where some phenomenon occurs. Some of the most successful applications of these discrete processes include online messages, financial transactions, earthquake strikes, and neuronal spikes. The popularity of these processes notably comes from their ability to model unobserved interactions and dependencies between temporally and spatially distant events. However, statistical methods for point processes generally rely on estimating a latent, unobserved, stochastic intensity process. In this context, designing flexible models and consistent estimation methods is often a challenging task. On the other hand, graphs are structures made of nodes (or agents) and edges (or links), where an edge represents an interaction or relationship between two nodes. Graphs are ubiquitous in modelling real-world social, transport, and mobility networks, where edges can correspond to virtual exchanges, physical connections between places, or migrations across geographical areas. Besides, graphs are used to represent correlations and lead-lag relationships between time series, and local dependence between random objects.
Graphs are typical examples of non-Euclidean data, where adequate distance measures, similarity functions, and generative models need to be formalised. In the deep learning community, graphs have become particularly popular within the field of geometric deep learning. Structure and dependence can both be modelled by temporal point processes and graphs, although predominantly, the former act on the temporal domain while the latter conceptualise spatial interactions. Nonetheless, some statistical models combine graphs and point processes in order to account for both spatial and temporal dependencies. For instance, temporal point processes have been used to model the birth times of edges and nodes in temporal graphs. Moreover, some multivariate point process models have a latent graph parameter governing the pairwise causal relationships between the components of the process. In this thesis, I will notably study such a model, called the Hawkes model, as well as graphs evolving in time. This thesis aims at designing inference methods that provide flexibility in the contexts of temporal point processes and graphs. This manuscript is presented in an integrated format, with four main chapters and two appendices. Chapters 2 and 3 are dedicated to the study of Bayesian nonparametric inference methods in the generalised Hawkes point process model. While Chapter 2 provides theoretical guarantees for existing methods, Chapter 3 also proposes, analyses, and evaluates a novel variational Bayes methodology. The other main chapters introduce and study model-free inference approaches for two estimation problems on graphs, namely spectral methods for the signed graph clustering problem in Chapter 4, and a deep learning algorithm for the network change point detection task on temporal graphs in Chapter 5. Additionally, Chapter 1 provides an introduction and background preliminaries on point processes and graphs.
Chapter 6 concludes this thesis with a summary and critical reflection on the work in this manuscript, and proposals for future research. Finally, the appendices contain two supplementary papers. The first one, in Appendix A, initiated after the COVID-19 outbreak in March 2020, is an application of a discrete-time Hawkes model to COVID-related death counts during the first wave of the pandemic. The second work, in Appendix B, was conducted during an internship at Amazon Research in 2021, and proposes an explainability method for anomaly detection models acting on multivariate time series.

    Integral estimation based on Markovian design

    Suppose that a mobile sensor describes a Markovian trajectory in the ambient space. At each time the sensor measures an attribute of interest, e.g., the temperature. Using only the location history of the sensor and the associated measurements, the aim is to estimate the average value of the attribute over the space. In contrast to classical probabilistic integration methods, e.g., Monte Carlo, the proposed approach does not require any knowledge of the distribution of the sensor trajectory. Probabilistic bounds on the convergence rates of the estimator are established. These rates are better than the traditional "root n"-rate, where n is the sample size, attached to other probabilistic integration methods. For finite sample sizes, the good behaviour of the procedure is demonstrated through simulations, and an application to the evaluation of the average temperature of oceans is considered. Comment: 45 pages

    Nonparametric estimation of the volatility under microstructure noise: wavelet adaptation

    We study nonparametric estimation of the volatility function of a diffusion process from discrete data, when the data are blurred by additional noise. This noise can be white or correlated, and serves as a model for microstructure effects in financial modeling, when the data are given on an intra-day scale. By developing pre-averaging techniques combined with wavelet thresholding, we construct adaptive estimators that achieve a nearly optimal rate within a large scale of smoothness constraints of Besov type. Since the underlying signal (the volatility) is genuinely random, we propose a new criterion to assess the quality of estimation; we retrieve the usual minimax theory when this approach is restricted to deterministic volatility. Keywords: adaptive estimation; diffusion processes; high-frequency data; microstructure noise; minimax estimation; semimartingales; wavelets.

    Novel immunological interactions as an overlooked aspect of global change: insights from the host range expansion of Lycaeides melissa

    There is accumulating evidence that insect populations globally are experiencing alarming declines. The combination of factors cumulatively exacerbating global insect declines has been referred to as “death by a thousand cuts”. One of these “cuts” is introduced species; however, recent recommendation articles have failed to address the issue of non-native plant species and their role in declining insect populations. The purpose of my dissertation was to investigate another potential indirect effect of non-native plant species on native insect fauna: immunological consequences. I chose this focus because although there is vast evidence that nutritional, phytochemical, and microbial variation can impact the insect immune system, these questions have been understudied in the context of novel host plant use. Given that non-native host plants often contain novel traits, such as secondary metabolites, these introductions effectively represent natural experiments which are excellent opportunities to test ecological immunology theory. To address these questions, I conducted four experimental projects on wild populations of the Melissa blue butterfly (Lycaeides melissa) in the Great Basin Desert. This plant-feeding butterfly is of interest from an evolutionary ecology standpoint because it has recently undergone a host expansion; it has incorporated a novel host plant into its diet. I have used this host expansion as a comparative framework to understand how ecological components can change the immune response; the novel host plant represents a novel nutritional, chemical, and microbial resource, all factors that can potentially impact the insect immune response. In my first chapter, I tested whether host plant use directly affected the insect immune response and whether host plant associated traits such as phytochemistry, microbes, or foliar protein had direct or indirect effects on insect immunity.
For my second chapter, I investigated the role of maternal microbes in mediating the larval immune response of L. melissa. For my third chapter, I investigated how novel host plant use affects the transcriptional regulation of L. melissa genes when infected with the lepidopteran virus, Junonia coenia densovirus (JcDV). Finally, for my fourth chapter I tested whether novel host use impacted the ability of L. melissa to resist a viral pathogen, Junonia coenia densovirus (JcDV). Taken together, the results from my dissertation work suggest that novel species interactions (between native insect fauna and non-native plants) have immunological consequences. While use of the native host plant A. canadensis did not always result in an immune response, my final chapter revealed strong evidence for this host plant increasing survivorship in the presence of a live viral pathogen. Further, there is accumulating evidence from the literature that nutritionally superior host plants frequently result in a stronger immune response in lepidopterans.

    Connes distance by examples: Homothetic spectral metric spaces

    We study metric properties stemming from the Connes spectral distance on three types of noncompact noncommutative spaces which have received attention recently from various viewpoints in the physics literature. These are the noncommutative Moyal plane, a family of harmonic Moyal spectral triples for which the Dirac operator squares to the harmonic oscillator Hamiltonian, and a family of spectral triples with Dirac operator related to the Landau operator. We show that these triples are homothetic spectral metric spaces, having an infinite number of distinct pathwise connected components. The homothetic factors linking the distances are related to determinants of effective Clifford metrics. We obtain as a by-product new examples of explicit spectral distance formulas. The results are discussed. Comment: 23 pages. Misprints corrected, references updated, one remark added at the end of section 3. To appear in Review in Mathematical Physics

    The Batyrev-Manin conjecture for DM stacks

    We define a new height function on rational points of a DM (Deligne-Mumford) stack over a number field. This generalizes a generalized discriminant of Ellenberg-Venkatesh, the height function recently introduced by Ellenberg-Satriano-Zureick-Brown (as far as DM stacks over number fields are concerned), and the quasi-toric height function on weighted projective stacks by Darda. Generalizing the Manin conjecture and the more general Batyrev-Manin conjecture, we formulate a few conjectures on the asymptotic behavior of the number of rational points of a DM stack with bounded height. To formulate the Batyrev-Manin conjecture for DM stacks, we introduce the orbifold versions of the so-called a- and b-invariants. When applied to the classifying stack of a finite group, these conjectures specialize to the Malle conjecture, except that we remove certain thin subsets from counting. More precisely, we remove breaking thin subsets, which have been studied in the case of varieties by people including Hassett, Tschinkel, Tanimoto, Lehmann and Sengupta, and can be generalized to DM stacks thanks to our generalization of a- and b-invariants. The breaking thin subset enables us to reinterpret Klüners' counterexample to the Malle conjecture. Comment: v2: minor changes, 50 pages