Bayesian mixture modeling and outliers in inter-laboratory studies
Taking measurements and using measuring devices are ubiquitous in commerce and scientific activities today. Inter-laboratory studies (especially so-called Key-Comparisons) are conducted to ensure measurement capability for commerce, evaluate national and international equivalence of measurements, and validate measurement devices, measurement methods, and standard materials. A common protocol in many inter-laboratory studies is for a pilot lab to prepare materials or objects to be measured and deliver them to participating labs. The labs take measurements and report the results to the pilot lab, which performs a statistical analysis. An overarching goal of many inter-laboratory studies is to establish a reference value for some measurand (the underlying quantity subject to measurement).
In these studies, it is not unusual for one or more labs to report measurements that are unlike the majority. There is no consensus on how to handle these unusual measurements in a statistical analysis. Most methods, in one way or another, attempt to determine whether each laboratory should be classified as an "outlier" and discard measurements from those labs that are so classified. The practice of excluding particular measurement results without substantive reasons is discouraged. In Key-Comparison studies, the concept of outlying laboratories must be treated even more delicately; for these, discarding outlying measurements is often politically untenable.
There is a need to develop methodologies for the analysis of inter-laboratory studies that model the potential existence of laboratory outliers in a way that doesn't let them dominate the estimation of a measurand. The development of such methodologies is the general theme of this dissertation.
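As a toy illustration of the general idea (not the dissertation's actual model), the sketch below fits a two-component Bayesian mixture in which each lab's measurement is either "typical" or drawn from a variance-inflated outlier component, so an outlying lab is down-weighted rather than discarded. The data, the variance-inflation factor K, and the prior outlier probability are all made-up assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical measurements from five labs (arbitrary units); the last
# lab reports a value unlike the majority.
y = np.array([10.1, 10.0, 9.9, 10.2, 13.5])
s = np.full(5, 0.2)                  # reported standard uncertainties

K, P_OUT = 25.0, 0.1                 # outlier variance inflation, prior prob.
mu = y.mean()
draws = []
for it in range(5000):
    # Sample outlier indicators z_i given the current consensus value mu.
    d_in = np.exp(-0.5 * (y - mu) ** 2 / s**2) / s
    d_out = np.exp(-0.5 * (y - mu) ** 2 / (K * s**2)) / (np.sqrt(K) * s)
    p_out = P_OUT * d_out / (P_OUT * d_out + (1 - P_OUT) * d_in)
    z = rng.random(y.size) < p_out
    # Sample mu given z under a flat prior: precision-weighted normal.
    prec = 1.0 / np.where(z, K * s**2, s**2)
    mu = rng.normal((prec * y).sum() / prec.sum(), prec.sum() ** -0.5)
    draws.append(mu)

mu_post = float(np.mean(draws[1000:]))   # posterior mean after burn-in
print(mu_post)  # pulled toward the four mutually consistent labs
```

The outlying lab's measurement still contributes, but through the inflated-variance component it receives far less weight than a naive average would give it.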
Informed Bayesian Finite Mixture Models via Asymmetric Dirichlet Priors
Finite mixture models are flexible methods that are commonly used for
model-based clustering. A recent focus in the model-based clustering literature
is to highlight the difference between the number of components in a mixture
model and the number of clusters. The number of clusters is more relevant from
a practical standpoint, but to date, the focus of prior distribution
formulation has been on the number of components. In light of this, we develop
a finite mixture methodology that permits eliciting prior information directly
on the number of clusters in an intuitive way. This is done by employing an
asymmetric Dirichlet distribution as a prior on the weights of a finite
mixture. Further, a penalized-complexity-motivated prior is employed for the
Dirichlet shape parameter. We illustrate the ease with which prior information
can be elicited via our construction and the flexibility of the resulting
induced prior on the number of clusters. We also demonstrate the utility of our
approach using numerical experiments and two real-world data sets.
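A rough Monte Carlo sketch of the construction described above, under assumed settings (the shape values, component count, and sample size are illustrative, not taken from the paper): drawing weights from an asymmetric Dirichlet and counting occupied components traces out the induced prior on the number of clusters, here compared against a symmetric prior.

```python
import numpy as np

rng = np.random.default_rng(1)

n, K = 100, 10                       # sample size, number of mixture components
# Asymmetric shape vector: large values on the first three components
# encode a prior belief of roughly three clusters (values are illustrative).
alpha_asym = np.array([5.0] * 3 + [0.05] * 7)

def n_clusters(alpha, n, reps=4000):
    """Monte Carlo draws from the induced prior on the number of clusters
    (occupied components) under a Dirichlet(alpha) prior on the weights."""
    counts = np.zeros(reps, dtype=int)
    for r in range(reps):
        w = rng.dirichlet(alpha)                      # mixture weights
        labels = rng.choice(len(alpha), size=n, p=w)  # component memberships
        counts[r] = np.unique(labels).size            # occupied components
    return counts

asym = n_clusters(alpha_asym, n)
sym = n_clusters(np.ones(K), n)      # symmetric Dirichlet(1, ..., 1) baseline
print(asym.mean(), sym.mean())       # asymmetric prior concentrates on fewer clusters
```

The symmetric prior spreads mass over many occupied components, while the asymmetric shape concentrates the induced prior near the elicited number of clusters.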
Regression with Variable Dimension Covariates
Regression is one of the most fundamental statistical inference problems. A
broad definition of regression problems is as estimation of the distribution of
an outcome using a family of probability models indexed by covariates. Despite
the ubiquitous nature of regression problems and the abundance of related
methods and results, there is a surprising gap in the literature: there are no
well-established methods for regression with varying-dimension covariate
vectors, despite the common occurrence of such problems. In this paper we
review some recent related papers proposing varying-dimension regression by way
of random partitions.
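To make the problem concrete, here is a hypothetical data set with varying-dimension covariate vectors and a deliberately naive baseline that partitions subjects by covariate dimension and estimates the outcome within each block. The random-partition methods the abstract refers to would instead infer the partition from the data; this sketch only illustrates the data structure.

```python
from collections import defaultdict
import numpy as np

# Hypothetical data: each subject has a covariate vector whose dimension
# varies across subjects (e.g., differing numbers of follow-up visits).
# All names and values are made up for illustration.
data = [
    ([1.0], 2.1),
    ([1.2], 2.3),
    ([0.9, 1.1], 3.0),
    ([1.0, 1.3], 3.4),
    ([1.1, 0.8, 1.2], 4.1),
]

# Naive fixed-partition baseline: group subjects by covariate dimension
# and estimate the outcome mean separately within each block.
blocks = defaultdict(list)
for x, y in data:
    blocks[len(x)].append(y)

block_means = {d: float(np.mean(ys)) for d, ys in blocks.items()}
print(block_means)  # one outcome estimate per covariate dimension
```

Fixing the partition by dimension discards any information shared across blocks; treating the partition as random lets subjects with different covariate dimensions borrow strength from one another.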
On the Geometry of Bayesian Inference
We provide a geometric interpretation to Bayesian inference that allows us to
introduce a natural measure of the level of agreement between priors,
likelihoods, and posteriors. The starting point for the construction of our
geometry is the simple observation that the marginal likelihood can be regarded
as an inner product between the prior and the likelihood. A key concept in our
geometry is that of compatibility, a measure which is based on the same
construction principles as Pearson correlation, but which can be used to assess
how much the prior agrees with the likelihood, to gauge the sensitivity of the
posterior to the prior, and to quantify the coherency of the opinions of two
experts. We discuss estimators for all the quantities involved in our geometric
setup, all of which can be computed directly from posterior simulation output.
Some examples are used to illustrate our methods, including data related to
on-the-job drug usage, midge wing length, and prostate cancer.
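Reading the abstract's inner-product construction literally, a grid-based sketch of a prior–likelihood compatibility measure might look as follows: treat the prior density and the likelihood function as vectors, take the marginal likelihood as their inner product, and normalize as in a Pearson-correlation (cosine-similarity) construction. The specific distributions and the exact normalization are our assumptions, not necessarily the authors' definitions.

```python
import numpy as np

# Discretize the parameter space so densities become vectors.
theta = np.linspace(-10, 10, 4001)
dt = theta[1] - theta[0]

def norm_pdf(x, m, s):
    return np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))

prior = norm_pdf(theta, 0.0, 1.0)

def compatibility(p, lik):
    """Normalized inner product of prior and likelihood on the grid."""
    inner = np.sum(p * lik) * dt        # marginal likelihood (grid approx.)
    return inner / np.sqrt(np.sum(p**2) * dt * np.sum(lik**2) * dt)

lik_close = norm_pdf(theta, 0.5, 1.0)   # data centered near the prior
lik_far = norm_pdf(theta, 6.0, 1.0)     # data in conflict with the prior

c_close = compatibility(prior, lik_close)
c_far = compatibility(prior, lik_far)
print(c_close, c_far)  # high agreement vs. near-zero agreement
```

Like a correlation, the measure is scale-free: it is near one when the prior and likelihood concentrate in the same region and near zero when they barely overlap.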