591 research outputs found

Building and using semiparametric tolerance regions for parametric multinomial models

Author: Lindsay Bruce G.
Liu Jiawei
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 03/09/2009

We introduce a semiparametric "tubular neighborhood" of a parametric model in the multinomial setting. It consists of all multinomial distributions lying in a distance-based neighborhood of the parametric model of interest. Fitting such a tubular model allows one to use a parametric model while treating it as an approximation to the true distribution. In this paper, the Kullback-Leibler distance is used to build the tubular region. Based on this idea one can define the distance between the true multinomial distribution and the parametric model to be the index of fit. The paper develops a likelihood ratio test procedure for testing the magnitude of the index. A semiparametric bootstrap method is implemented to better approximate the distribution of the LRT statistic. The approximation permits more accurate construction of a lower confidence limit for the model fitting index.

Comment: Published at http://dx.doi.org/10.1214/08-AOS603 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org
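
As a rough illustration of the tubular-neighborhood idea in the abstract above, the sketch below computes a Kullback-Leibler index of fit between a multinomial distribution and a parametric family, then checks whether the distribution falls inside a tube of a given radius. Everything specific here is an assumption made for illustration and is not taken from the paper: the binomial family as the parametric model, the direction of the divergence (true distribution first), the function names, and the radius `c`. For this family the minimizing parameter has a closed form (the mean divided by the number of trials), so no numerical optimizer is needed.

```python
import math

def binomial_pmf(k, theta):
    """Cell probabilities of a Binomial(k, theta) on the cells 0..k."""
    return [math.comb(k, x) * theta**x * (1 - theta)**(k - x) for x in range(k + 1)]

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q); cells with p = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def index_of_fit(p):
    """Smallest KL(p || Binomial(k, theta)) over theta, with the minimizer.

    Assumption: the parametric model is binomial, for which the minimizing
    theta is the mean of p divided by k.
    """
    k = len(p) - 1
    theta = sum(x * px for x, px in enumerate(p)) / k
    return theta, kl(p, binomial_pmf(k, theta))

def in_tube(p, c):
    """True if p lies within KL distance c of the (assumed) binomial model."""
    return index_of_fit(p)[1] <= c

if __name__ == "__main__":
    bimodal = [0.4, 0.1, 0.0, 0.1, 0.4]   # hypothetical cell probabilities
    print(index_of_fit(bimodal))
    print(in_tube(bimodal, c=0.01))        # c is an arbitrary illustrative radius
```

A distribution that is exactly binomial has an index of zero, so it sits inside every tube; the further a distribution is from the family, the larger the radius needed to include it.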

arXiv.org e-Print Archive

Crossref

The topography of multivariate normal mixtures

Author: Lindsay Bruce G.
Ray Surajit
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2005

Multivariate normal mixtures provide a flexible method of fitting high-dimensional data. It is shown that their topography, in the sense of their key features as a density, can be analyzed rigorously in lower dimensions by use of a ridgeline manifold that contains all critical points, as well as the ridges of the density. A plot of the elevations on the ridgeline shows the key features of the mixed density. In addition, by use of the ridgeline, we uncover a function that determines the number of modes of the mixed density when there are two components being mixed. A followup analysis then gives a curvature function that can be used to prove a set of modality theorems.

Comment: Published at http://dx.doi.org/10.1214/009053605000000417 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

CiteSeerX

Carolina Digital Repository

Enlighten

Estimating the number of classes

Author: Lindsay Bruce G.
Mao Chang Xuan
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2007

Estimating the unknown number of classes in a population has numerous important applications. In a Poisson mixture model, the problem is reduced to estimating the odds that a class is undetected in a sample. The discontinuity of the odds prevents the existence of locally unbiased and informative estimators and restricts confidence intervals to be one-sided. Confidence intervals for the number of classes are also necessarily one-sided. A sequence of lower bounds to the odds is developed and used to define pseudo maximum likelihood estimators for the number of classes.

Comment: Published at http://dx.doi.org/10.1214/009053606000001280 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

CiteSeerX

Crossref

Improving cross-validated bandwidth selection using subsampling-extrapolation techniques

Author: Lindsay Bruce G.
Wang Qing
Publication venue: The Authors. Published by Elsevier B.V.
Publication date: 30/09/2015

Cross-validation methodologies have been widely used as a means of selecting tuning parameters in nonparametric statistical problems. In this paper we focus on a new method for improving the reliability of cross-validation. We implement this method in the context of the kernel density estimator, where one needs to select the bandwidth parameter so as to minimize L2 risk. This method is a two-stage subsampling-extrapolation bandwidth selection procedure, which is realized by first evaluating the risk at a fictional sample size m (m ≤ n, where n is the sample size) and then extrapolating the optimal bandwidth from m to n. This two-stage method can dramatically reduce the variability of the conventional unbiased cross-validation bandwidth selector. This simple first-order extrapolation estimator is equivalent to the rescaled "bagging-CV" bandwidth selector in Hall and Robinson (2009) if one sets the bootstrap size equal to the fictional sample size. However, our simplified expression for the risk estimator enables us to compute the aggregated risk without any bootstrapping. Furthermore, we developed a second-order extrapolation technique as an extension designed to improve the approximation of the true optimal bandwidth. To select the optimal choice of the fictional size m given a sample of size n, we propose a nested cross-validation methodology. Based on a simulation study, the proposed new methods show promising performance across a wide selection of distributions. In addition, we also investigated the asymptotic properties of the proposed bandwidth selectors.
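
The two-stage idea in the abstract above (select a bandwidth at a smaller size m, then rescale it to n) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' estimator: it uses a Gaussian kernel, averages the unbiased cross-validation criterion over random subsamples instead of using the paper's bootstrap-free risk expression, and rescales with the standard n^(-1/5) rate for second-order kernels. All function names, the grid, and the simulated data are hypothetical.

```python
import numpy as np

def ucv_score(x, h):
    """Unbiased (least-squares) cross-validation criterion, Gaussian kernel."""
    n = len(x)
    d = (x[:, None] - x[None, :]) / h
    conv = np.exp(-d**2 / 4) / (2 * np.sqrt(np.pi))   # kernel convolved with itself
    kern = np.exp(-d**2 / 2) / np.sqrt(2 * np.pi)     # Gaussian kernel
    off_diag = kern.sum() - np.trace(kern)            # leave-one-out terms (i != j)
    return conv.sum() / (n**2 * h) - 2 * off_diag / (n * (n - 1) * h)

def subsample_bandwidth(x, m, grid, n_sub=50, seed=0):
    """Stage 1: minimize the criterion averaged over subsamples of size m.

    Assumption: random subsampling without replacement stands in for the
    paper's closed-form aggregated risk.
    """
    rng = np.random.default_rng(seed)
    scores = np.zeros(len(grid))
    for _ in range(n_sub):
        sub = rng.choice(x, size=m, replace=False)
        scores += np.array([ucv_score(sub, h) for h in grid])
    return grid[np.argmin(scores)]

def extrapolate_bandwidth(h_m, m, n):
    """Stage 2: first-order rescaling from size m to size n (h ~ n^(-1/5))."""
    return h_m * (m / n) ** 0.2

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.standard_normal(400)              # simulated data, for illustration only
    grid = np.linspace(0.05, 1.5, 60)
    h_m = subsample_bandwidth(x, m=100, grid=grid)
    print(h_m, extrapolate_bandwidth(h_m, m=100, n=len(x)))
```

Averaging the criterion over many subsamples before minimizing is what smooths out the variability of a single cross-validation curve; the rescaling step then corrects for the fact that a smaller sample calls for a wider bandwidth.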

Elsevier - Publisher Connector

Quadratic distances on probabilities: A unified foundation

Author: Chen Shu-Chuan
Lindsay Bruce G.
Markatou Marianthi
Ray Surajit
Yang Ke
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2007

This work builds a unified framework for the study of quadratic form distance measures as they are used in assessing the goodness of fit of models. Many important procedures have this structure, but the theory for these methods is dispersed and incomplete. Central to the statistical analysis of these distances is the spectral decomposition of the kernel that generates the distance. We show how this determines the limiting distribution of natural goodness-of-fit tests. Additionally, we develop a new notion, the spectral degrees of freedom of the test, based on this decomposition. The degrees of freedom are easy to compute and estimate, and can be used as a guide in the construction of useful procedures in this class.

Comment: Published at http://dx.doi.org/10.1214/009053607000000956 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

CiteSeerX

Crossref

Enlighten

Recommended from our members

14C-Cobalamin Absorption from Endogenously Labeled Chicken Eggs Assessed in Humans Using Accelerator Mass Spectrometry.

Author: Allen Lindsay H
Buchholz Bruce A
Calvert Christopher C
Garrod Marjorie G
Green Ralph
Miller Joshua W
Rossow Heidi A
Publication venue: eScholarship, University of California
Publication date: 01/09/2019

Traditionally, the bioavailability of vitamin B-12 (B12) from in vivo labeled foods was determined by labeling the vitamin with radiocobalt (57Co, 58Co or 60Co). This required use of penetrating radioactivity and sometimes used higher doses of B12 than the physiological limit of B12 absorption. The aim of this study was to determine the bioavailability and absorbed B12 from chicken eggs endogenously labeled with 14C-B12 using accelerator mass spectrometry (AMS). 14C-B12 was injected intramuscularly into hens to produce eggs enriched in vivo with the 14C labeled vitamin. The eggs, which provided 1.4 to 2.6 μg of B12 (~1.1 kBq) per serving, were scrambled, cooked and fed to 10 human volunteers. Baseline and post-ingestion blood, urine and stool samples were collected over a one-week period and assessed for 14C-B12 content using AMS. Bioavailability ranged from 13.2 to 57.7% (mean 30.2 ± 16.4%). Difference among subjects was explained by dose of B12, with percent bioavailability from 2.6 μg only half that from 1.4 μg. The total amount of B12 absorbed was limited to 0.5-0.8 μg (mean 0.55 ± 0.19 μg B12) and was relatively unaffected by the amount consumed. The use of 14C-B12 offers the only currently available method for quantifying B12 absorption in humans, including food cobalamin absorption. An egg is confirmed as a good source of B12, supplying approximately 20% of the average adult daily requirement (RDA for adults = 2.4 μg/day)

eScholarship - University of California

Detecting West Nile Virus in Owls and Raptors by an Antigen-capture Assay

Author: Ady Y. Gancz
Bruce Hunter
Douglas G. Campbell
Gancz
Ian K. Barker
Komar
Lanciotti
Lindsay
Nasci
Robbin Lindsay
Ryan
Publication venue: Centers for Disease Control and Prevention
Publication date: 01/12/2004

We evaluated a rapid antigen-capture assay (VecTest) for detection of West Nile virus in oropharyngeal and cloacal swabs, collected at necropsy from owls (N = 93) and raptors (N = 27). Sensitivity was 93.5%–95.2% for northern owl species but <42.9% for all other species. Specificity was 100% for owls and 85.7% for raptors.

Crossref

Directory of Open Access Journals

PubMed Central

Partial Purification and Characterization of Two Peptide Hydrolases from Pea Seeds

Author: J. Bruce Caldwell
Lindsay G. Sparrow
Publication venue: 'American Society of Plant Biologists (ASPB)'

Crossref

Gene capture prediction and overlap estimation in EST sequencing from one or multiple libraries

Author: Cui Liying
dePamphilis Claude W
Lindsay Bruce G
Marion Josh
Wall P Kerr
Wang Ji-Ping Z
Zhang Jiaxuan
Publication venue: BioMed Central
Publication date: 01/01/2005

BACKGROUND: In expressed sequence tag (EST) sequencing, we are often interested in how many genes we can capture in an EST sample of a targeted size. This information provides insights into sequencing efficiency in experimental design, as well as clues to the diversity of expressed genes in the tissue from which the library was constructed. RESULTS: We propose a compound Poisson process model that can accurately predict the gene capture in a future EST sample based on an initial EST sample. It also allows estimation of the number of expressed genes in one cDNA library or co-expressed in two cDNA libraries. The superior performance of the new prediction method over an existing approach is established by a simulation study. Our analysis of four Arabidopsis thaliana EST sets suggests that the number of expressed genes present in four different cDNA libraries of Arabidopsis thaliana varies from 9155 (root) to 12005 (silique). An observed fraction of co-expressed genes in two different EST sets as low as 25% can correspond to an actual overlap fraction greater than 65%. CONCLUSION: The proposed method provides a convenient tool for gene capture prediction and cDNA library property diagnosis in EST sequencing.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central