
1,702 research outputs found


Identification and separation of DNA mixtures using peak area information (Updated version of Statistical Research Paper No. 25)

Author: Cowell R.
Lauritzen S. L.
Mortera J.
Publication venue: Faculty of Actuarial Science & Insurance, City University London
Publication date: 01/01/2006

We introduce a new methodology, based upon probabilistic expert systems, for analysing forensic identification problems involving DNA mixture traces using quantitative peak area information. Peak area is modelled with conditional Gaussian distributions. The expert system can be used not only to ascertain whether individuals whose profiles have been measured have contributed to the mixture, but also to predict DNA profiles of unknown contributors by separating the mixture into its individual components. The potential of our probabilistic methodology is illustrated on case data examples and compared with alternative approaches. The advantages are that identification and separation issues can be handled in a unified way within a single probabilistic model and that the uncertainty associated with the analysis is quantified. Further work, required to bring the methodology to a point where it could be applied to the routine analysis of casework, is discussed.

City Research Online


Identification and separation of DNA mixtures using peak area information

Author: Cowell R.
Lauritzen S. L.
Mortera J.
Publication venue: Faculty of Actuarial Science & Insurance, City University London
Publication date: 01/01/2004

City Research Online

VBN

Probabilistic expert systems for handling artifacts in complex DNA mixtures

Author: Cowell R.
Lauritzen S. L.
Mortera J.
Publication venue: Faculty of Actuarial Science & Insurance, City University London
Publication date: 01/01/2009

This paper presents a coherent probabilistic framework for taking account of allelic dropout, stutter bands and silent alleles when interpreting STR DNA profiles from a mixture sample using peak size information arising from a PCR analysis. This information can be exploited for evaluating the evidential strength for a hypothesis that DNA from a particular person is present in the mixture. It extends an earlier Bayesian network approach that ignored such artifacts. We illustrate the use of the extended network on a published casework example.

CiteSeerX

City Research Online

Maximum Likelihood Estimation in Gaussian Chain Graph Models under the Alternative Markov Property

Author: Druzdzel M. J.
Edwards D. M.
Frydenberg M.
Hojsgaard S.
Kass R. E.
Lauritzen S. L.
Ostrowski A. M.
Speed T. P.
Wermuth N.
Whittaker J.
Zangwill W. I.
Publication venue: Wiley
Publication date: 30/09/2005

The AMP Markov property is a recently proposed alternative Markov property for chain graphs. In the case of continuous variables with a joint multivariate Gaussian distribution, it is the AMP rather than the earlier introduced LWF Markov property that is coherent with data-generation by natural block-recursive regressions. In this paper, we show that maximum likelihood estimates in Gaussian AMP chain graph models can be obtained by combining generalized least squares and iterative proportional fitting into an iterative algorithm. In an appendix, we give useful convergence results for iterative partial maximization algorithms that apply in particular to the described algorithm. Comment: 15 pages, article will appear in Scandinavian Journal of Statistics.

arXiv.org e-Print Archive

Maastricht University Research Portal

Crossref

Transfer Entropy as a Log-likelihood Ratio

Author: A. D. R. McQuarrie
J. Doob
J. Hamilton
Lionel Barnett
N. Wiener
P. Billingsley
S. L. Lauritzen
Terry Bossomaier
Publication venue: American Physical Society (APS)
Publication date: 27/07/2012

Transfer entropy, an information-theoretic measure of time-directed information transfer between joint processes, has steadily gained popularity in the analysis of complex stochastic dynamics in diverse fields, including the neurosciences, ecology, climatology and econometrics. We show that for a broad class of predictive models, the log-likelihood ratio test statistic for the null hypothesis of zero transfer entropy is a consistent estimator for the transfer entropy itself. For finite Markov chains, furthermore, no explicit model is required. In the general case, an asymptotic chi-squared distribution is established for the transfer entropy estimator. The result generalises the equivalence in the Gaussian case of transfer entropy and Granger causality, a statistical notion of causal influence based on prediction via vector autoregression, and establishes a fundamental connection between directed information transfer and causality in the Wiener-Granger sense.
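
The finite Markov chain case mentioned in this abstract can be illustrated with a small sketch. It is not taken from the paper: it assumes lag-1 histories, binary toy data, and a plug-in estimator built from empirical transition counts. The function names (`transfer_entropy`, `likelihood_ratio_statistic`, `simulate_coupled_chains`) and the toy coupling are hypothetical. Under those assumptions, twice the maximised log-likelihood difference between the model that conditions on the source and the one that does not equals 2n times the plug-in transfer entropy, where n is the number of observed transitions.

```python
import math
import random
from collections import Counter


def transfer_entropy(source, target):
    """Plug-in estimate (in nats) of lag-1 transfer entropy source -> target."""
    n = len(target) - 1  # number of observed transitions
    y_next, y_prev, x_prev = target[1:], target[:-1], source[:-1]
    # Empirical counts for the full model p(y_next | y_prev, x_prev) ...
    full = Counter(zip(y_next, y_prev, x_prev))
    context_full = Counter(zip(y_prev, x_prev))
    # ... and for the restricted (null) model p(y_next | y_prev).
    restricted = Counter(zip(y_next, y_prev))
    context_restricted = Counter(y_prev)
    te = 0.0
    for (yn, yp, xp), count in full.items():
        p_full = count / context_full[(yp, xp)]
        p_restricted = restricted[(yn, yp)] / context_restricted[yp]
        te += (count / n) * math.log(p_full / p_restricted)
    return te


def likelihood_ratio_statistic(source, target):
    """2 * (max log-likelihood, full model - restricted model) = 2 * n * TE."""
    return 2 * (len(target) - 1) * transfer_entropy(source, target)


def simulate_coupled_chains(n, copy_prob, seed):
    """Toy data (an assumption): x is i.i.d. fair bits, y copies the previous x
    with probability copy_prob and flips it otherwise."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]
    y = [0]
    for t in range(n - 1):
        y.append(x[t] if rng.random() < copy_prob else 1 - x[t])
    return x, y


if __name__ == "__main__":
    x, y = simulate_coupled_chains(5000, 0.9, seed=0)
    print("TE x -> y:", transfer_entropy(x, y))
    print("TE y -> x:", transfer_entropy(y, x))
    print("LR statistic x -> y:", likelihood_ratio_statistic(x, y))
```

On this toy data the estimate for x -> y should be clearly positive, while the estimate for y -> x should be close to zero, since x does not depend on y by construction.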

arXiv.org e-Print Archive

Crossref

Sussex Research Online

Network Inference via the Time-Varying Graphical Lasso

Author: Banerjee O.
Dahl J.
Hallac D.
Hsieh C.-J.
Lauritzen S. L.
Mohan K.
Myers S.
Scheinberg K.
Weinberger K.
Wytock M.
Yang S.
Publication date: 09/06/2017

Many important problems can be modeled as a system of interconnected entities, where each entity is recording time-dependent observations or measurements. In order to spot trends, detect anomalies, and interpret the temporal dynamics of such data, it is essential to understand the relationships between the different entities and how these relationships evolve over time. In this paper, we introduce the time-varying graphical lasso (TVGL), a method of inferring time-varying networks from raw time series data. We cast the problem in terms of estimating a sparse time-varying inverse covariance matrix, which reveals a dynamic network of interdependencies between the entities. Since dynamic network inference is a computationally expensive task, we derive a scalable message-passing algorithm based on the Alternating Direction Method of Multipliers (ADMM) to solve this problem in an efficient way. We also discuss several extensions, including a streaming algorithm to update the model and incorporate new observations in real time. Finally, we evaluate our TVGL algorithm on both real and synthetic datasets, obtaining interpretable results and outperforming state-of-the-art baselines in terms of both accuracy and scalability.
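
One way to write down the kind of optimisation problem this abstract describes is sketched below. The notation is an assumption and does not appear in the listing: $\Theta_t$ is the inverse covariance matrix at time step $t$, $S_t$ the empirical covariance of the $n_t$ observations at that step, $\|\cdot\|_{\mathrm{od},1}$ the $\ell_1$ norm of the off-diagonal entries, $\psi$ a penalty on change between consecutive steps, and $\lambda, \beta \ge 0$ tuning parameters.

```latex
\min_{\Theta_1,\dots,\Theta_T \succ 0}\;
\sum_{t=1}^{T} \Big[ -n_t \big( \log\det\Theta_t - \operatorname{tr}(S_t\Theta_t) \big)
  + \lambda \,\|\Theta_t\|_{\mathrm{od},1} \Big]
\;+\; \beta \sum_{t=2}^{T} \psi\big(\Theta_t - \Theta_{t-1}\big)
```

The first sum is a graphical-lasso term for each time step, which encourages a sparse network. The second sum couples neighbouring time steps so that the estimated network changes in a structured way over time.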

arXiv.org e-Print Archive

Crossref

Bayesian Networks for Max-linear Models

Author: AP Dawid
B Finkenstädt
FW Scholz
J Beirlant
J Kiefer
JMV Hoef
KA Bollen
L Haan de
M Frydenberg
N Gissibl
P Butkovič
P Embrechts
R Huser
RD Gill
S Johansen
S Lauritzen
S Shimizu
SI Resnick
SL Lauritzen
T Richardson
Y Wang
Publication date: 01/01/2019

We study Bayesian networks based on max-linear structural equations as introduced in Gissibl and Klüppelberg [16] and provide a summary of their independence properties. In particular, we emphasize that distributions for such networks are generally not faithful to the independence model determined by their associated directed acyclic graph. In addition, we consider some of the basic issues of estimation and discuss generalized maximum likelihood estimation of the coefficients, using the concept of a generalized likelihood ratio for non-dominated families as introduced by Kiefer and Wolfowitz [21]. Finally, we argue that the structure of a minimal network can asymptotically be identified completely from observational data. Comment: 18 pages.
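
A max-linear structural equation model of the kind this abstract refers to can be simulated with a few lines of code. The sketch below is illustrative only: the three-node graph, the edge weights, and the standard Fréchet innovations are assumptions, and the names `EDGE_WEIGHTS`, `TOPOLOGICAL_ORDER`, and `sample_max_linear` are hypothetical. Each node takes the maximum of its own innovation and the weighted values of its parents.

```python
import math
import random

# Hypothetical DAG 1 -> 2 -> 3 with an extra direct edge 1 -> 3.
# EDGE_WEIGHTS[(parent, child)] is an assumed positive coefficient.
EDGE_WEIGHTS = {(1, 2): 0.8, (2, 3): 0.5, (1, 3): 0.3}
TOPOLOGICAL_ORDER = [1, 2, 3]


def sample_max_linear(rng):
    """Draw one sample from X_i = max(max over parents k of c_ki * X_k, Z_i)."""
    x = {}
    for node in TOPOLOGICAL_ORDER:
        # Standard Frechet innovation (an assumption): Z = -1 / log(U), U in (0, 1).
        u = rng.random()
        while u == 0.0:
            u = rng.random()
        z = -1.0 / math.log(u)
        from_parents = [
            weight * x[parent]
            for (parent, child), weight in EDGE_WEIGHTS.items()
            if child == node
        ]
        x[node] = max(from_parents + [z])
    return x


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        print(sample_max_linear(rng))
```

In this toy graph the path 1 -> 2 -> 3 carries weight 0.8 * 0.5 = 0.4, which exceeds the direct weight 0.3, so the direct edge 1 -> 3 never determines the value of node 3. Dropping it leaves the distribution unchanged, which is one way to see why a distinction between a graph and a minimal network can arise.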

arXiv.org e-Print Archive

Crossref

Copenhagen University Research Information System

Analysis of forensic DNA mixtures with artefacts

Author: Cowell R.
Graversen T.
Lauritzen S. L.
Mortera J.
Publication venue: Wiley
Publication date: 01/01/2013

DNA is now routinely used in criminal investigations and court cases, although DNA samples taken at crime scenes are of varying quality and therefore present challenging problems for their interpretation. We present a statistical model for the quantitative peak information obtained from an electropherogram of a forensic DNA sample and illustrate its potential use for the analysis of criminal cases. In contrast with most previously used methods, we directly model the peak height information and incorporate important artefacts that are associated with the production of the electropherogram. Our model has a number of unknown parameters, and we show that these can be estimated by the method of maximum likelihood in the presence of multiple unknown individuals contributing to the sample, and their approximate standard errors calculated; the computations exploit a Bayesian network representation of the model. A case example from a UK trial, as reported in the literature, is used to illustrate the efficacy and use of the model, both in finding likelihood ratios to quantify the strength of evidence, and in the deconvolution of mixtures for finding likely profiles of the individuals contributing to the sample. Our model is readily extended to simultaneous analysis of more than one mixture as illustrated in a case example. We show that the combination of evidence from several samples may give an evidential strength which is close to that of a single-source trace and thus modelling of peak height information provides a potentially very efficient mixture analysis.

arXiv.org e-Print Archive

CiteSeerX

City Research Online

Crossref

OPUS - University of Technology Sydney

Copenhagen University Research Information System

Archivio della Ricerca - Università di Roma 3

Oxford University Research Archive

Archivio della ricerca- Università di Roma La Sapienza

Hierarchical Models for Independence Structures of Networks

Author: Bishop Y. M.
Erdös P.
Holland P.
Hunter D. R.
Lauritzen S. L.
Pearl J.
Snyman J. A.
West D. B.
Publication date: 25/11/2019

We introduce a new family of network models, called hierarchical network models, that allow us to represent in an explicit manner the stochastic dependence among the dyads (random ties) of the network. In particular, each member of this family can be associated with a graphical model defining conditional independence clauses among the dyads of the network, called the dependency graph. Every network model with dyadic independence assumption can be generalized to construct members of this new family. Using this new framework, we generalize the Erdős-Rényi and beta-models to create hierarchical Erdős-Rényi and beta-models. We describe various methods for parameter estimation as well as simulation studies for models with sparse dependency graphs. Comment: 19 pages, 7 figures.

arXiv.org e-Print Archive

Crossref

UCL Discovery

Toeplitz Inverse Covariance-Based Clustering of Multivariate Time Series Data

Author: Banerjee O.
Berndt D. J.
Boyd S.
Cover T. M.
Cuturi M.
Das G.
Gray R. M.
Hsieh C.-J.
Lauritzen S. L.
Mohan K.
Smyth P.
Wytock M.
Publication date: 14/05/2018

Subsequence clustering of multivariate time series is a useful tool for discovering repeated patterns in temporal data. Once these patterns have been discovered, seemingly complicated datasets can be interpreted as a temporal sequence of only a small number of states, or clusters. For example, raw sensor data from a fitness-tracking application can be expressed as a timeline of a select few actions (i.e., walking, sitting, running). However, discovering these patterns is challenging because it requires simultaneous segmentation and clustering of the time series. Furthermore, interpreting the resulting clusters is difficult, especially when the data is high-dimensional. Here we propose a new method of model-based clustering, which we call Toeplitz Inverse Covariance-based Clustering (TICC). Each cluster in the TICC method is defined by a correlation network, or Markov random field (MRF), characterizing the interdependencies between different observations in a typical subsequence of that cluster. Based on this graphical representation, TICC simultaneously segments and clusters the time series data. We solve the TICC problem through alternating minimization, using a variation of the expectation maximization (EM) algorithm. We derive closed-form solutions to efficiently solve the two resulting subproblems in a scalable way, through dynamic programming and the alternating direction method of multipliers (ADMM), respectively. We validate our approach by comparing TICC to several state-of-the-art baselines in a series of synthetic experiments, and we then demonstrate on an automobile sensor dataset how TICC can be used to learn interpretable clusters in real-world scenarios. Comment: This revised version fixes two small typos in the published version.

arXiv.org e-Print Archive

Crossref