
    Histospline Method in Nonparametric Regression Models with Application to Clustered/Longitudinal Data

    Kernel and smoothing methods for nonparametric function and curve estimation have been particularly successful in standard settings, where function values are observed subject to independent errors. However, when aspects of the function are known parametrically, or when the sampling scheme has significant structure, it can be quite difficult to adapt standard methods so that they retain good statistical performance and remain easy to compute with good numerical properties. In particular, with local linear modeling it is often awkward both to respect the sampling scheme and to produce an estimator with good variance properties without resorting to iterative methods; longitudinal and clustered data are a good case in point. In this paper we suggest a simple approach to overcoming these problems. Using a histospline technique, we convert a problem in the continuum to one that is governed by only a finite number of parameters and that is often explicitly solvable. The simple expedient of running a local linear smoother through the histospline produces a function estimator that achieves optimal nonparametric properties, and the raw histospline-based estimator of the semiparametric component itself attains optimal semiparametric performance. The function estimator can be used in its own right or as the starting value for an iterative scheme based on a different approach to inference.
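
The two-stage idea in this abstract (reduce the curve to a finite set of per-bin parameters, then run a local linear smoother through them) can be illustrated with a rough sketch. This is not the paper's estimator: it assumes independent errors, so that the per-bin fit collapses to a plain bin mean, and the function names, the equal-width binning, and the Gaussian kernel are all choices made here for illustration. A clustered or longitudinal design would presumably need a fit that accounts for within-cluster correlation, which is not shown.

```python
import numpy as np

def bin_estimates(x, y, n_bins):
    """Stage 1 (simplified): one parameter per bin, here just the bin mean.

    Assumes independent errors; returns bin centers and per-bin means,
    skipping bins that contain no observations.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    edges = np.linspace(x.min(), x.max(), n_bins + 1)
    # Interior edges only, so indices run from 0 to n_bins - 1.
    idx = np.digitize(x, edges[1:-1])
    centers = 0.5 * (edges[:-1] + edges[1:])
    sums = np.bincount(idx, weights=y, minlength=n_bins)
    counts = np.bincount(idx, minlength=n_bins)
    keep = counts > 0
    return centers[keep], sums[keep] / counts[keep]

def local_linear(x0, xs, ys, h):
    """Stage 2: local linear smoother (Gaussian kernel, bandwidth h) at points x0."""
    x0 = np.atleast_1d(x0).astype(float)
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    out = np.empty_like(x0)
    for k, t in enumerate(x0):
        d = xs - t
        w = np.exp(-0.5 * (d / h) ** 2)
        s0, s1, s2 = w.sum(), (w * d).sum(), (w * d * d).sum()
        t0, t1 = (w * ys).sum(), (w * d * ys).sum()
        # Intercept of the kernel-weighted least-squares line through (d, ys).
        out[k] = (s2 * t0 - s1 * t1) / (s0 * s2 - s1 ** 2)
    return out
```

A typical call would be `centers, means = bin_estimates(x, y, n_bins=20)` followed by `local_linear(grid, centers, means, h=0.1)`; the bin count and bandwidth are tuning choices that this sketch leaves to the user.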

    The Lung Image Database Consortium (LIDC): A comparison of different size metrics for pulmonary nodule measurements

    RATIONALE AND OBJECTIVES: To investigate the effects of choosing between different metrics in estimating the size of pulmonary nodules, as a factor both in nodule characterization and in the performance of computer-aided detection systems, since the latter are always qualified with respect to a given size range of nodules. MATERIALS AND METHODS: This study used 265 whole-lung CT scans documented by the Lung Image Database Consortium using their protocol for nodule evaluation. Each inspected lesion was reviewed independently by four experienced radiologists, who provided boundary markings for nodules larger than 3 mm. Four size metrics, based on the boundary markings, were considered: a uni-dimensional and two bi-dimensional measures on a single image slice, and a volumetric measurement based on all the image slices. The radiologist boundaries were processed; nodules with four markings were analyzed to characterize the inter-radiologist variation, while those with at least one marking were used to examine the difference between the metrics. RESULTS: The processing of the annotations found 127 nodules marked by all four radiologists and an extended set of 518 nodules each having at least one observation, with three-dimensional sizes ranging from 2.03 to 29.4 mm (average 7.05 mm, median 5.71 mm). A very high inter-observer variation was observed for all these metrics: 95% of estimated standard deviations were in the ranges [0.49, 1.25], [0.67, 2.55], [0.78, 2.11], and [0.96, 2.69] for the three-dimensional, the uni-dimensional, and the two bi-dimensional size metrics, respectively (in mm). A very large difference among the metrics was also observed: the 0.95 probability-coverage region widths for the volume estimation conditional on uni-dimensional and the two bi-dimensional size measurements of 10 mm were 7.32, 7.72, and 6.29 mm, respectively. CONCLUSIONS: The selection of data subsets for performance evaluation is highly impacted by the choice of size metric. The LIDC plans to include a single size measure for each nodule in its database. This metric is not intended as a gold standard for nodule size; rather, it is intended to facilitate the selection of unique, repeatable, size-limited nodule subsets.

    On estimation in Binary autologistic spatial models

    There is a large and increasing literature on methods of estimation for spatial data with binary responses. The goal of this article is to describe some of these methods for the autologistic spatial model and to discuss the computational issues associated with them. We do this mainly by illustration, using a spatial epidemiology data set involving liver cancer. We first demonstrate why Maximum Likelihood is not currently feasible as a method of estimation in the spatial setting with binary data using the autologistic model. We then discuss alternative methods, including Pseudo Likelihood, Generalized Pseudo Likelihood, and Monte Carlo Maximum Likelihood estimators. We describe their asymptotic efficiencies and the computational effort required to compute them. These three methods are applied to the data set and compared in a simulation experiment.
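
To make the Pseudo Likelihood idea concrete, here is a minimal sketch of the standard maximum pseudo-likelihood recipe for an autologistic model, not the article's own implementation. It assumes 0/1 responses on a regular grid, a four-nearest-neighbour structure with zero padding at the border, and no covariates; none of these details come from the abstract. Under those assumptions each site's conditional probability is logistic in the sum of its neighbours, so maximizing the product of conditionals amounts to a logistic regression, fitted below by Newton-Raphson. The function names are hypothetical.

```python
import numpy as np

def neighbour_sums(y):
    """Sum of the four nearest neighbours of each grid cell (zeros beyond the border)."""
    p = np.pad(y, 1)
    return p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]

def fit_pseudo_likelihood(y, n_iter=50, tol=1e-8):
    """Maximize the pseudo-likelihood over (alpha, beta).

    Assumed model: logit P(y_i = 1 | neighbours) = alpha + beta * (neighbour sum).
    """
    X = np.column_stack([np.ones(y.size), neighbour_sums(y).ravel()])
    t = y.ravel().astype(float)
    theta = np.zeros(2)
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-X @ theta))
        grad = X.T @ (t - prob)                        # score of the log pseudo-likelihood
        hess = X.T @ (X * (prob * (1 - prob))[:, None])  # negative Hessian
        step = np.linalg.solve(hess, grad)
        theta += step
        if np.max(np.abs(step)) < tol:
            break
    return theta
```

The appeal of this estimator is that it avoids the intractable normalizing constant of the full likelihood entirely; the price is a possible loss of efficiency, which is the trade-off the abstract refers to.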