Search CORE

7 research outputs found

Detection and explanation of statistical differences across a pair of groups

Author: Sverchkov Yuriy
Publication venue: 'University Library System, University of Pittsburgh'
Publication date: 01/01/2014

People often need to explain differences across groups, not only in research but also in less formal settings. Existing statistical tools designed specifically for discovering and understanding differences are limited. The methods developed in this dissertation provide such tools, help clarify what properties such tools need in order to succeed, and motivate further development of new approaches to discovering and understanding differences. This dissertation presents a novel approach to comparing groups of data points. The process of comparing groups of data is divided into multiple stages: learning maximum a posteriori models for the data in each group, identifying statistical differences between model parameters, constructing a single model that captures those differences, and finally explaining inferred differences in marginal distributions as an account of the clinically significant contributions of elemental model differences to the marginal difference. A general framework for the process, applicable to a broad range of model types, is presented. This dissertation focuses on applying this framework to Bayesian networks over multinomial variables. To evaluate model learning and the detection of parameter differences, an empirical evaluation of methods for identifying statistically significant and clinically significant differences is performed. To evaluate the generated explanations of how differences in the models account for the differences in probabilities computed from those models, case studies with real clinical data are presented, and the findings generated by the explanations are discussed. An interactive prototype that allows a user to navigate through such an explanation is presented, and ideas are discussed for further development of data analysis tools for comparing groups of data.

ProQuest OAI Repository

D-Scholarship@Pitt

A brief summary of reviewed methods.

Author: Mark Craven (51222)
Yuriy Sverchkov (4059499)

Icons arranged in the table represent individual methods. The columns represent the various experiment selection criteria, and the methods are divided vertically between de novo methods and methods that use prior knowledge. Visual elements in each icon indicate whether the method is deterministic (cog) or stochastic (die), whether it models continuous (circle) or discrete (diamond) variables, what is specified in a query for an experiment (G for genetic and E for environmental perturbations), and the dimensionality of the data used (dot array for multidimensional data and a ruler for one-dimensional data).

FigShare

Recombination Analysis of Herpes Simplex Virus 1 Reveals a Bias toward GC Content and the Inverted Repeat Regions

Author: Aaron W. Kolb
Curtis R. Brandt
Jacqueline A. Cuellar
Kyubin Lee
Mark Craven
R. M. Longnecker
Yuriy Sverchkov
Publication venue: 'American Society for Microbiology'

Crossref

Spatial cluster detection using dynamic programming

Author: D Heckerman
D Neill
DB Neill
ER DeLong
GF Cooper
GP Patil
Gregory F Cooper
L Duczmal
M Ester
M Kulldorff
T Pei
T Tango
WL Buntine
X Jiang
X Wang
Xia Jiang
Y Shen
Yuriy Sverchkov
Z Zhang
Publication venue: BMC
Publication date: 01/01/2012

Background: The task of spatial cluster detection involves finding spatial regions where some property deviates from the norm or the expected value. In a probabilistic setting this task can be expressed as finding a region where some event is significantly more likely than usual. Spatial cluster detection is of interest in fields such as biosurveillance, mining of astronomical data, military surveillance, and analysis of fMRI images. In almost all such applications we are interested both in whether a cluster exists in the data and, if it does, in finding the most accurate characterization of the cluster.
Methods: We present a general dynamic programming algorithm for grid-based spatial cluster detection. The algorithm can be used both for Bayesian maximum a posteriori (MAP) estimation of the most likely spatial distribution of clusters and for Bayesian model averaging over a large space of spatial cluster distributions to compute the posterior probability of an unusual spatial clustering. The algorithm is explained and evaluated in the context of a biosurveillance application, specifically the detection and identification of influenza outbreaks based on emergency department visits. A relatively simple underlying model is constructed for the purpose of evaluating the algorithm, and the algorithm is evaluated using the model and semi-synthetic test data.
Results: Tests against baseline methods indicate that the new algorithm can improve MAP estimates under certain conditions: the greedy algorithm we compared against was more sensitive to smaller outbreaks, but as outbreak size increases, in terms of both area affected and proportion of individuals affected, our method overtakes the greedy algorithm in spatial precision and recall. The new algorithm performs on par with baseline methods in the task of Bayesian model averaging.
Conclusions: We conclude that the dynamic programming algorithm performs on par with other available methods for spatial cluster detection and point to its low computational cost and extensibility as advantages in favor of further research and use of the algorithm.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

D-Scholarship@Pitt

Noninvasive Detection of Colorectal Carcinomas Using Serum Protein Biomarkers

Author: Ahnen
Allison
Amos-Landgraf
Ansar
Bibbins-Domingo
Bryant W. Megna
Edge
Emerging Risk Factors
Gregory D. Kennedy
Hanash
Hong
Inadomi
Ivancic
Kijima
Knudsen
Lieberman
MacLean
Mark Craven
Mark Reichelderfer
Melanie M. Ivancic
Michael R. Sussman
Mitchell
Moser
Nozoe
Pepe
Pereira
Perry J. Pickhardt
Pickhardt
Regula
Ribeiro
Sharara
Siegel
Simon
Song
Sovich
Tan
Tevis
Tilbury
Tsilidis
Tuck
Warren
Werner
Xu
Yang
Yuriy Sverchkov
Publication venue: 'Elsevier BV'

Crossref

A review of active learning approaches to experimental design for uncovering biological networks

Author: A Rau
A Ryll
B Mélykúti
CE Shannon
CH Bryant
CH Yeang
CM Deane
E Szczurek
F Steinke
FR Kschischang
G von Dassow
Haiyan Huang
I Pournara
I Xenarios
J Pearl
J Tegnér
KP Murphy
Mark Craven
MH Maathuis
N Atias
R Daly
R Dehghannasiri
R King
R Pal
R Samaga
R Sharan
RD King
RD Leclerc
S Kullback
SM Ud-Dean
TE Ideker
TI Lee
YB He
Yuriy Sverchkov
Publication venue: 'Public Library of Science (PLoS)'

Crossref