44 research outputs found

    Deep Tree Models for 'Big' Biological Data

    The identification of useful temporal dependence structure in discrete time series data is an important component of algorithms applied to many tasks in statistical inference and machine learning, and used in a wide variety of problems across the spectrum of biological studies. Most of the early statistical approaches were ineffective in practice, because the amount of data required for reliable modelling grew exponentially with memory length. On the other hand, many of the more modern methodological approaches that make use of more flexible and parsimonious models result in algorithms that do not scale well and are computationally ineffective for larger data sets. In this paper we describe a class of novel methodological tools for effective Bayesian inference for general discrete time series, motivated primarily by questions regarding data originating from studies in genetics and neuroscience. Our starting point is the development of a rich class of Bayesian hierarchical models for variable-memory Markov chains. The particular prior structure we adopt makes it possible to design effective, linear-time algorithms that can compute most of the important features of the relevant posterior and predictive distributions without resorting to Markov chain Monte Carlo simulation. The origin of some of these algorithms can be traced to the family of Context Tree Weighting (CTW) algorithms developed for data compression since the mid-1990s. We have used the resulting methodological tools in numerous application-specific tasks (including prediction, segmentation, classification, anomaly detection, entropy estimation, and causality testing) on data from different areas of application. The results obtained compare quite favourably with those obtained using earlier approaches, such as Probabilistic Suffix Trees (PST), Variable-Length Markov Chains (VLMC), and the class of Markov Transition Distributions (MTD).

    Large carnivore expansion in Europe is associated with human population density and land cover changes

    Aim: The recent recovery of large carnivores in Europe has been explained as resulting from a decrease in human persecution driven by widespread rural land abandonment, paralleled by forest cover increase and the consequent increase in availability of shelter and prey. We investigated whether land cover and human population density changes are related to the relative probability of occurrence of three European large carnivores: the grey wolf (Canis lupus), the Eurasian lynx (Lynx lynx) and the brown bear (Ursus arctos). Location: Europe, west of 64° longitude. Methods: We fitted multi-temporal species distribution models using >50,000 occurrence points with time series of land cover, landscape configuration, protected areas, hunting regulations and human population density covering a 24-year period (1992–2015). Within the temporal window considered, we then predicted changes in habitat suitability for large carnivores throughout Europe. Results: Between 1992 and 2015, the habitat suitability for the three species increased in Eastern Europe, the Balkans, North-West Iberian Peninsula and Northern Scandinavia, but showed mixed trends in Western and Southern Europe. These trends were primarily associated with increases in forest cover and decreases in human population density, and, additionally, with decreases in the cover of mosaics of cropland and natural vegetation. Main conclusions: Recent land cover and human population changes appear to have altered the habitat suitability pattern for large carnivores in Europe, whereas protection level did not play a role. While projected changes largely match the observed recovery of large carnivore populations, we found mismatches with the recent expansion of wolves in Central and Southern Europe, where factors not included in our models may have played a dominant role. 
    This suggests that large carnivores’ co-existence with humans in European landscapes is limited not by habitat availability but by other factors, such as favourable human tolerance and policy.

    Brown bear attacks on humans: a worldwide perspective

    The increasing trend of large carnivore attacks on humans not only raises human safety concerns but may also undermine large carnivore conservation efforts. Although rare, attacks by brown bears Ursus arctos are also on the rise and, although several studies have addressed this issue at local scales, information is lacking on a worldwide scale. Here, we investigated brown bear attacks (n = 664) on humans between 2000 and 2015 across most of the range inhabited by the species: North America (n = 183), Europe (n = 291), and East (n = 190). When the attacks occurred, half of the people were engaged in leisure activities and the main scenario was an encounter with a female with cubs. Attacks have increased significantly over time and were more frequent at high bear and low human population densities. There was no significant difference in the number of attacks between continents or between countries with different hunting practices. Understanding global patterns of bear attacks can help reduce dangerous encounters and, consequently, is crucial for informing wildlife managers and the public about appropriate measures to reduce this kind of conflict in bear country.

    Bayesian context trees: Modelling and exact inference for discrete time series

    We develop a new Bayesian modelling framework for the class of higher-order, variable-memory Markov chains, and introduce an associated collection of methodological tools for exact inference with discrete time series. We show that a version of the context tree weighting algorithm can compute the prior predictive likelihood exactly (averaged over both models and parameters), and two related algorithms are introduced, which identify the a posteriori most likely models and compute their exact posterior probabilities. All three algorithms are deterministic and have linear-time complexity. A family of variable-dimension Markov chain Monte Carlo samplers is also provided, facilitating further exploration of the posterior. The performance of the proposed methods in model selection, Markov order estimation and prediction is illustrated through simulation experiments and real-world applications with data from finance, genetics, neuroscience, and animal communication. The associated algorithms are implemented in the R package BCT.