
    Factored expectation propagation for input-output FHMM models in systems biology

    We consider the problem of joint modelling of metabolic signals and gene expression in systems biology applications. We propose an approach based on input-output factorial hidden Markov models, together with a structured variational inference approach to infer the structure and states of the model. We start from the classical free-form structured variational mean field approach and use expectation propagation to approximate the expectations needed in the variational loop. We show that this corresponds to a factored expectation constrained approximate inference. We validate our model through extensive simulations and demonstrate its applicability on a real-world bacterial data set.

    Structured count data regression

    Overdispersion in count data regression is often caused by neglect or inappropriate modelling of individual heterogeneity, temporal or spatial correlation, and nonlinear covariate effects. In this paper, we develop and study semiparametric count data models which can deal with these issues by incorporating corresponding components in structured additive form into the predictor. The models are fully Bayesian and inference is carried out by computationally efficient MCMC techniques. In a simulation study, we investigate how well the different components can be identified with the data at hand. The approach is applied to a large data set of claim frequencies from car insurance.

    Generalized structured additive regression based on Bayesian P-splines

    Generalized additive models (GAM) for modelling nonlinear effects of continuous covariates are now well established tools for the applied statistician. In this paper we develop Bayesian GAMs and extensions to generalized structured additive regression based on one- or two-dimensional P-splines as the main building block. The approach extends previous work by Lang and Brezger (2003) for Gaussian responses. Inference relies on Markov chain Monte Carlo (MCMC) simulation techniques, and is either based on iteratively weighted least squares (IWLS) proposals or on latent utility representations of (multi)categorical regression models. Our approach covers the most common univariate response distributions, e.g. the Binomial, Poisson or Gamma distribution, as well as multicategorical responses. For the first time, we present Bayesian semiparametric inference for the widely used multinomial logit models. As we will demonstrate through two applications on the forest health status of trees and a space-time analysis of health insurance data, the approach allows realistic modelling of complex problems. We consider the enormous flexibility and extensibility of our approach as a main advantage of Bayesian inference based on MCMC techniques compared to more traditional approaches. Software for the methodology presented in the paper is provided within the public domain package BayesX.

    Bootstrapping Information Extraction from Field Books

    We present two machine learning approaches to information extraction from semi-structured documents that can be used if no annotated training data are available, but there does exist a database filled with information derived from the type of documents to be processed. One approach employs standard supervised learning for information extraction by artificially constructing labelled training data from the contents of the database. The second approach combines unsupervised Hidden Markov modelling with language models. Empirical evaluation of both systems suggests that it is possible to bootstrap a field segmenter from a database alone. The combination of Hidden Markov and language modelling was found to perform best at this task.

    Socio-economic impacts of alternative GIN control practices. Project deliverable 11 (WP4)

    This report is a deliverable (WP4) from the EU-funded PrOPara project. The PrOPara project aspires to i) assess existing knowledge from research, development and benchmarking studies on alternatives to parasite control on organic ruminant farms, ii) collect novel data on disease prevalence, risk assessment analysis and parasite control measures, through monitoring (farm surveys and stakeholder participation studies), iii) perform cost-benefit analysis on alternative parasite control measures and iv) develop and deliver technical innovation to facilitate implementation of sustainable parasite control strategies. A combined approach of modelling and focus groups for feedback was employed to assess the economic impacts of alternative GIN control strategies in South West France and North East Scotland. This two-step method allowed results from the survey and farm modelling to be used during workshops, which also addressed social factors explaining the uptake and acceptance of GIN practices to control parasites. An existing Excel-based farm model was adapted in order to estimate the economic impacts of a range of alternative GIN practices. The model was adapted using data from a typical farm for an organic goat system in France (Occitanie and Auvergne-Rhône-Alpes Regions) and two organic sheep systems (lowland and upland) in Scotland. A structured workshop approach was utilised to address both the social and economic factors related to adoption of alternative GIN practices by farmers. For this purpose, we adapted the Structured Decision Making (SDM) approach commonly used for decision making (Gregory and Keeney 1994, Conroy, Barker et al. 2008, Ogden and Innes 2009, Gregory 2012, Johnson, Eaton et al. 2015, Fatorić and Seekamp 2017). Overall, the modelling and farmer feedback showed that control of GIN needs to be farm specific, to suit the individual characteristics of both the farm and the beliefs of the farmer.
    The extension of withdrawal periods combined with resistance issues in France has led to the adoption of TST by some farmers, but others are less convinced of its efficiency. The farmers in Scotland seem to have adopted multiple strategies such as use of arable land and mixed grazing to keep GIN levels from severely affecting their profits. However, the diversity of opinions, and the calls by the French farmers in particular for more trials, show that further work is still needed to understand this problem and develop more effective, sustainable solutions.

    SynopSys: Foundations for Multidimensional Graph Analytics

    The past few years have seen a tremendous increase in often irregularly structured data that can be represented most naturally and efficiently in the form of graphs. Making sense of incessantly growing graphs is not only a key requirement in applications like social media analysis or fraud detection but also a necessity in many traditional enterprise scenarios. Thus, a flexible approach for multidimensional analysis of graph data is needed. Whereas many existing technologies require up-front modelling of analytical scenarios and are difficult to adapt to changes, our approach allows for ad-hoc analytical queries of graph data. Extending our previous work on graph summarization, in this position paper we lay the foundation for large graph analytics to enable business intelligence on graph-structured data

    Is OpenSDE an alternative for dedicated medical research databases? An example in coronary surgery

    Background. When using a conventional relational database approach to collect and query data in the context of specific clinical studies, a study with a new data set usually requires the design of a new database and entry forms. OpenSDE (SDE = Structured Data Entry) is intended to provide a flexible and intuitive way to create databases and entry forms for the collection of data in a structured format. This study illustrates the use of OpenSDE as a potential alternative to a conventional approach with respect to data modelling, database creation, data entry, and data extraction. Methods. A database and entry forms are created using OpenSDE and MSAccess to support collection of coronary surgery data, based on the Adult Cardiac Surgery Data Set of the Society of Thoracic Surgeons. Data of 52 cases are entered and nine different queries are designed, and executed on