75 research outputs found

    Comparison of Predicted and Observed Dioxin Levels in Fish: Implications for Risk Assessment

    After comparing sampled and modelled dioxin levels in the tissue of fish near pulp and paper mill discharges, the authors argue that, until an improved bioaccumulation model is incorporated into EPA's risk assessment process, determination of human health risks associated with consuming dioxin-contaminated fish should be based on sampling.

    Smooth Quantile Ratio Estimation

    In a study of health care expenditures attributable to smoking, we seek to compare the distribution of medical costs for persons with lung cancer or chronic obstructive pulmonary disease (cases) to those without (controls) using a national survey which includes hundreds of cases and thousands of controls. The distribution of costs is highly skewed toward larger values, making estimates of the mean from the smaller sample dependent on a small fraction of the biggest values. One approach to deal with the smaller sample is to rely on a simple parametric model such as the log-normal, but this makes the undesirable assumption that the distribution of the log-expenditures is symmetric. We propose a novel approach to estimate the mean difference of two highly skewed distributions (Delta), which we call Smooth Quantile Ratio Estimation (SQUARE). SQUARE is obtained by smoothing, over percentiles, the ratio of the cost quantiles of the cases and controls. SQUARE defines a large class of estimators of Delta including: 1) the sample mean difference, 2) the maximum likelihood estimate under log-normal samples, and 3) L-estimates. We detail asymptotic properties of SQUARE such as consistency and asymptotic normality, and also provide a closed-form expression for the asymptotic variance. Through a simulation study, we show that SQUARE has lower mean squared error than several competitors, including the sample mean difference and log-normal parametric estimates, in several realistic situations. We apply SQUARE to the 1987 National Medicare Expenditure Survey to estimate the difference in medical expenditures between persons suffering from the smoking-attributable diseases, lung cancer and chronic obstructive pulmonary disease, and persons without these diseases. Software in R (Ihaka and Gentleman, 1996) for the implementation of SQUARE and of all its special cases, and the cost data used in this paper are available at http://biostat.jhsph.edu/~fdominic/square.html
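The quantile-ratio idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' R software: the function name, the use of a low-degree polynomial fit to the log quantile ratio as the smoother, and the plotting positions are all assumptions made for illustration, and the sketch requires strictly positive costs.

```python
import numpy as np

def smooth_quantile_ratio_mean_diff(cases, controls, degree=2):
    """Illustrative estimate of mean(cases) - mean(controls).

    Assumption: the log quantile ratio is smoothed with a polynomial of the
    given degree; the abstract does not specify the smoother.
    """
    cases = np.sort(np.asarray(cases, dtype=float))
    controls = np.asarray(controls, dtype=float)
    n = cases.size
    # Percentiles attached to the ordered case values (hypothetical choice).
    p = (np.arange(1, n + 1) - 0.5) / n
    # Control quantiles evaluated at the same percentiles.
    control_q = np.quantile(controls, p)
    # Raw log ratio of case quantiles to control quantiles.
    log_ratio = np.log(cases) - np.log(control_q)
    # Smooth the log ratio over percentiles.
    coefs = np.polyfit(p, log_ratio, degree)
    smooth_ratio = np.exp(np.polyval(coefs, p))
    # Mean difference implied by the smoothed ratio.
    return float(np.mean((smooth_ratio - 1.0) * control_q))
```

In this sketch, a polynomial flexible enough to interpolate the raw ratios reproduces the difference between the case mean and the mean of the matched control quantiles, while a constant fit pools information across all percentiles; the degree controls where the estimate sits between those extremes.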

    Spectral graph clustering via the Expectation-Solution algorithm

    The stochastic blockmodel (SBM) models the connectivity within and between disjoint subsets of nodes in networks. Prior work demonstrated that the rows of an SBM's adjacency spectral embedding (ASE) and Laplacian spectral embedding (LSE) both converge in law to Gaussian mixtures where the components are curved exponential families. Maximum likelihood estimation via the Expectation-Maximization (EM) algorithm for a full Gaussian mixture model (GMM) can then perform the task of clustering graph nodes, albeit without appealing to the components' curvature. Noting that EM is a special case of the Expectation-Solution (ES) algorithm, we propose two ES algorithms that allow us to take full advantage of these curved structures. After presenting the ES algorithm for the general curved-Gaussian mixture, we develop those corresponding to the ASE and LSE limiting distributions. Simulating from artificial SBMs and a brain connectome SBM reveals that clustering graph nodes via our ES algorithms can improve upon that of EM for a full GMM for a wide range of settings. Comment: 45 pages, version accepted by Electronic Journal of Statistics
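The adjacency spectral embedding mentioned in this abstract can be sketched in a few lines of Python. This shows only the embedding step that precedes clustering, not the ES algorithms the paper proposes; the function name and the choice to keep the eigenvalues of largest magnitude are assumptions for illustration.

```python
import numpy as np

def adjacency_spectral_embedding(adjacency, dim):
    """Embed each node as a row of U_d |S_d|^(1/2) (illustrative sketch).

    Assumes a symmetric adjacency (or edge-probability) matrix.
    """
    a = np.asarray(adjacency, dtype=float)
    # Eigendecomposition of the symmetric matrix.
    eigvals, eigvecs = np.linalg.eigh(a)
    # Indices of the `dim` eigenvalues with largest absolute value.
    top = np.argsort(np.abs(eigvals))[::-1][:dim]
    # Scale each kept eigenvector by the square root of its |eigenvalue|.
    return eigvecs[:, top] * np.sqrt(np.abs(eigvals[top]))
```

The rows of the returned matrix are the points that a mixture-model fit, such as EM for a full GMM, would then cluster into blocks.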