78 research outputs found

    Information Theoretic Structure Learning with Confidence

    Information-theoretic measures (e.g. the Kullback-Leibler divergence and Shannon mutual information) have been used for exploring possibly nonlinear multivariate dependencies in high dimensions. If these dependencies are assumed to follow a Markov factor graph model, this exploration process is called structure discovery. For discrete-valued samples, estimates of the information divergence over the parametric class of multinomial models lead to structure discovery methods whose mean squared error achieves parametric convergence rates as the sample size grows. However, a naive application of this method to continuous nonparametric multivariate models converges much more slowly. In this paper we introduce a new method for nonparametric structure discovery that uses weighted ensemble divergence estimators that achieve parametric convergence rates and obey an asymptotic central limit theorem that facilitates hypothesis testing and other types of statistical validation. Comment: 10 pages, 3 figures
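For the discrete-valued case mentioned in this abstract, a plug-in estimate of Shannon mutual information can be sketched as follows. This is a minimal illustration that assumes empirical frequencies as the multinomial estimate; it is not the paper's weighted ensemble estimator, and the function name `plugin_mutual_information` is hypothetical.

```python
import math
from collections import Counter


def plugin_mutual_information(xs, ys):
    """Plug-in estimate of Shannon mutual information (in nats).

    Probabilities are replaced by empirical frequencies, which is an
    assumption of this sketch, not a method taken from the paper.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    count_x = Counter(xs)         # marginal counts of x
    count_y = Counter(ys)         # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log(p(x,y) / (p(x) p(y))) written in terms of counts
        mi += (c / n) * math.log(c * n / (count_x[x] * count_y[y]))
    return mi
```

Two identical binary samples give an estimate of log 2, while two samples whose joint counts factorise exactly give zero.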

    Clustering with minimum spanning trees: How good can it be?

    Minimum spanning trees (MSTs) provide a convenient representation of datasets in numerous pattern recognition activities. Moreover, they are relatively fast to compute. In this paper, we quantify the extent to which they can be meaningful in data clustering tasks. By identifying the upper bounds for the agreement between the best (oracle) algorithm and the expert labels from a large battery of benchmark data, we discover that MST methods can overall be very competitive. Next, instead of proposing yet another algorithm that performs well on a limited set of examples, we review, study, extend, and generalise existing state-of-the-art MST-based partitioning schemes, which leads to a few new and interesting approaches. It turns out that the Genie method and the information-theoretic approaches often outperform non-MST algorithms such as k-means, Gaussian mixtures, spectral clustering, BIRCH, and classical hierarchical agglomerative procedures.
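The simplest MST-based partitioning scheme can be sketched as below: grow the tree with Kruskal's algorithm and stop once k components remain, which is equivalent to cutting the k-1 longest MST edges when edge weights are distinct. This is an illustrative baseline under that assumption, not the Genie method or any other scheme evaluated in the paper; the function name `mst_clusters` is hypothetical.

```python
import math
from itertools import combinations


def mst_clusters(points, k):
    """Partition points into k clusters by stopping Kruskal's MST early."""
    n = len(points)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of points")
    parent = list(range(n))  # union-find forest over point indices

    def find(i):
        # Find the component root, halving the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise Euclidean distances, shortest first.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    components = n
    for _, i, j in edges:
        if components == k:
            break
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge joins two components, so it is an MST edge
            parent[ri] = rj
            components -= 1

    # Relabel component roots as 0, 1, ... in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n)]
```

For example, `mst_clusters([(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)], 2)` separates the three points near the origin from the two distant ones.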

    Complexity and robustness of structures against extreme events

    Civil structures are designed to support the loads acting on them. At present, the common practitioner considers both ordinary actions (wind, snow, accidental loads) and extreme events (earthquake, fire) and combines them in such a way that, once the resistance of the elements is determined, the probability of failure is limited to a prescribed value. The set of events that may affect the structure is known and, therefore, the statistics of the actions are defined a priori. However, other events that cannot be forecast may also affect the construction. The sources of such events, called “Black Swans” after Taleb, are unknown, as is their magnitude. To ensure the integrity of the construction in such situations, which imply extensive damage, robust measures have to be taken (Chapter 3). Structural engineering is not the only domain in which unexpected events occur. Nature is the realm of contrasts. By means of evolution, living species have differentiated in order to survive and reproduce. Various strategies were implemented in order to guarantee biological robustness. Such mechanisms evoke one fundamental property of systems: the complexity and the connectivity between the components. The interaction between the parts makes the whole system more robust and tolerant to errors and damage (Chapters 1 and 2). Robustness in structures is implemented through classical strategies, which tend to limit the extent of damage through a design based on the consequences (Chapter 4). Inspired by natural strategies, the idea of complexity in structural engineering is explored. Many issues arise, since a proper definition of this term has not been stated yet (Chapters 5 and 6). The effects of element removal on frame structures, which represent an example of a highly connected structural scheme, are investigated. Simple simulations show that the trend observed in Nature, whereby complex systems are robust to random damage, also appears in the loaded structural schemes (Chapter 7).

    Disentangling ecological networks in marine microbes

    There is a myriad of microorganisms on Earth contributing to global biogeochemical cycles, and their interactions are considered pivotal for ecosystem function. Previous studies have already determined relationships between a limited number of microorganisms. Yet, we still need to understand a large number of interactions to increase our knowledge of complex microbiomes. This is challenging because of the vast number of possible interactions. Thus, microbial interactions remain barely known to date. Networks are a great tool to handle the vast number of microorganisms and their connections, explore potential microbial interactions, and elucidate patterns of microbial ecosystems. This thesis is located at the intersection of network inference and network analysis. The presented methodology aims to support and advance marine microbial investigations by reducing noise and elucidating patterns in inferred association networks for subsequent biological downstream analyses. This thesis’s main contribution to marine microbial interaction studies is the development of the program EnDED (Environmentally-Driven Edge Detection), a computational framework to identify environmentally-driven associations inside microbial association networks inferred from omics datasets. We applied the methodology to a model marine microbial ecosystem at the Blanes Bay Microbial Observatory (BBMO) in the North-Western Mediterranean Sea (ten years of monthly sampling). We also applied the methodology to a dataset compilation covering six global-ocean regions from the surface (3 m) to the deep ocean (down to 4539 m). Thus, our methodology provided a step towards understanding marine microbial temporal patterns and studying the marine microbial distribution in space via the horizontal (ocean regions) and vertical (water column) axes. To arrive at accurate interaction hypotheses, it is important to determine, quantify, and remove environmentally-driven associations in marine microbial association networks. Furthermore, our results underscored the need to study the dynamic nature of networks, in contrast to using single static networks aggregated over time or space. Our new methodologies can be used by a wide range of researchers investigating networks and interactions in diverse microbiomes. Postprint (published version)

    Resilience of power grids and other supply networks: structural stability, cascading failures and optimal topologies

    The consequences of the climate crisis are already present and can be expected to become more severe in the future. To mitigate long-term consequences, a major part of the world's countries has committed to limit the temperature rise via the Paris Agreement in the year 2015. To achieve this goal, the energy production needs to decarbonise, which results in fundamental changes in many societal aspects. In particular, the electrical power production is shifting from fossil fuels to renewable energy sources to limit greenhouse gas emissions. The electrical power transmission grid plays a crucial role in this transformation. Notably, the storage and long-distance transport of electrical power becomes increasingly important, since variable renewable energy sources (VRES) are subjected to external factors such as weather conditions and their power production is therefore regionally and temporally diverse. As a result, the transmission grid experiences higher loadings and bottlenecks appear. In a highly-loaded grid, a single transmission line or generator outage can trigger overloads on other components via flow rerouting. These may in turn trigger additional rerouting and overloads, until, finally, parts of the grid become disconnected. Such cascading failures can result in large-scale power blackouts, which bear enormous risks, as almost all infrastructures and economic activities depend on a reliable supply of electric power. Thus, it is essential to understand how networks react to local failures, how flow is rerouted after failures and how cascades emerge and spread in different power transmission grids to ensure a stable power grid operation. In this thesis, I examine how the network topology shapes the resilience of power grids and other supply networks. First, I analyse how flow is rerouted after the failure of a single or a few links and derive mathematically rigorous results on the decay of flow changes with different network-based distance measures. 
Furthermore, I demonstrate that the impact of single link failures follows universal statistics across different topologies and introduce a stochastic model for cascading failures that incorporates crucial aspects of flow redistribution. Based on this improved understanding of link failures, I propose network modifications that attenuate or completely suppress the impact of link failures in parts of the network and thereby significantly reduce the risk of cascading failures. Next, I compare the topological characteristics of different kinds of supply networks to analyse how the trade-off between efficiency and resilience determines the structure of optimal supply networks. Finally, I examine what shapes the risk of incurring large-scale cascading failures in a realistic power system model to assess the effects of the energy transition in Europe.
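The cascade mechanism described in this abstract (an outage reroutes flow, which overloads other components, which then fail in turn) can be illustrated with a toy equal-load-sharing model. This sketch assumes a set of parallel lines that split a fixed total flow equally; it ignores network topology entirely and is not the flow model or the stochastic cascade model of the thesis. The function name `cascade` and its parameters are hypothetical.

```python
def cascade(capacities, total_flow, initial_failure):
    """Return the sorted indices of lines that survive a cascade.

    Toy assumption: the surviving lines always share total_flow equally,
    and a line fails as soon as its share exceeds its capacity.
    """
    alive = set(range(len(capacities))) - {initial_failure}
    while alive:
        share = total_flow / len(alive)  # flow rerouted onto survivors
        overloaded = {i for i in alive if share > capacities[i]}
        if not overloaded:
            break                        # stable state reached
        alive -= overloaded              # overloads trigger more outages
    return sorted(alive)
```

With capacities `[2.5, 2.6, 4.5, 9]` and a total flow of 8, the outage of line 0 also takes out line 1 before the remaining two lines stabilise; with four lines of capacity 2.5 each, the same outage disconnects everything.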

    Structure-oriented prediction in complex networks

    Complex systems are extremely hard to predict due to their highly nonlinear interactions and rich emergent properties. Thanks to the rapid development of network science, our understanding of the structure of real complex systems and the dynamics on them has deepened remarkably, which in turn has stimulated the growth of effective prediction approaches for these systems. In this article, we aim to review different network-related prediction problems, summarize and classify relevant prediction methods, analyze their advantages and disadvantages, and point out the forefront as well as the critical challenges of the field.

    Data Science: Measuring Uncertainties

    With the increase in data processing and storage capacity, a large amount of data is available. Data without analysis does not have much value. Thus, the demand for data analysis is increasing daily, and the consequence is the appearance of a large number of jobs and published articles. Data science has emerged as a multidisciplinary field to support data-driven activities, integrating and developing ideas, methods, and processes to extract information from data. This includes methods built from different knowledge areas: Statistics, Computer Science, Mathematics, Physics, Information Science, and Engineering. This mixture of areas has given rise to what we call Data Science. New solutions to the new problems are reproducing rapidly to generate large volumes of data. Current and future challenges require greater care in creating new solutions that satisfy the rationality for each type of problem. Labels such as Big Data, Data Science, Machine Learning, Statistical Learning, and Artificial Intelligence demand more sophistication in their foundations and in how they are applied. This point highlights the importance of building the foundations of Data Science. This book is dedicated to solutions and discussions of measuring uncertainties in data analysis problems.

    Statistical physics approaches to the complex Earth system

    Global warming, extreme climate events, earthquakes and their accompanying socioeconomic disasters pose significant risks to humanity. Yet due to the nonlinear feedbacks, multiple interactions and complex structures of the Earth system, the understanding and, in particular, the prediction of such disruptive events represent formidable challenges to both scientific and policy communities. During the past years, the emergence and evolution of Earth system science has attracted much attention and produced new concepts and frameworks. In particular, novel statistical physics and complex networks-based techniques have been developed and implemented to substantially advance our knowledge of the Earth system, including climate extreme events, earthquakes and geological relief features, leading to substantially improved predictive performance. We present here a comprehensive review of recent scientific progress in developing and applying combined statistical physics and complex systems science approaches, such as critical phenomena, network theory, percolation, tipping point analysis, and entropy, to complex Earth systems. Notably, these integrated tools and approaches provide new insights and perspectives for understanding the dynamics of the Earth system. The overall aim of this review is to show readers how statistical physics concepts and theories can be useful in the field of Earth system science.