
    Binscatter Regressions

    We introduce the \texttt{Stata} (and \texttt{R}) package \textsf{Binsreg}, which implements the binscatter methods developed in \citet*{Cattaneo-Crump-Farrell-Feng_2019_Binscatter}. The package includes the commands \texttt{binsreg}, \texttt{binsregtest}, and \texttt{binsregselect}. The first command (\texttt{binsreg}) implements binscatter for the regression function and its derivatives, offering several point estimation, confidence interval, and confidence band procedures, with a particular focus on constructing binned scatter plots. The second command (\texttt{binsregtest}) implements hypothesis testing procedures for parametric specifications and for nonparametric shape restrictions of the unknown regression function. Finally, the third command (\texttt{binsregselect}) implements data-driven selectors for the number of bins in binscatter implementations, using either quantile-spaced or evenly-spaced binning/partitioning. All the commands allow for covariate adjustment, smoothness restrictions, weighting, and clustering, among other features. A companion \texttt{R} package with the same capabilities is also available.
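    The core binscatter construction can be sketched in a few lines — a minimal illustration of the idea only (quantile-spaced binning and within-bin means), not the \texttt{binsreg} implementation; the simulated data and bin count are assumptions:

```python
import numpy as np

def binscatter(x, y, n_bins=20, spacing="quantile"):
    # Partition the support of x into bins and compute within-bin
    # means of x and y; plotting ym against xm gives the binned
    # scatter plot. (Sketch of the idea only, not binsreg.)
    x, y = np.asarray(x, float), np.asarray(y, float)
    if spacing == "quantile":                 # quantile-spaced bins
        edges = np.quantile(x, np.linspace(0, 1, n_bins + 1))
    else:                                     # evenly-spaced bins
        edges = np.linspace(x.min(), x.max(), n_bins + 1)
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_bins - 1)
    xm = np.array([x[idx == b].mean() for b in range(n_bins)])
    ym = np.array([y[idx == b].mean() for b in range(n_bins)])
    return xm, ym

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 2000)
y = x**2 + rng.normal(0, 0.1, 2000)          # noisy quadratic signal
xm, ym = binscatter(x, y, n_bins=10)
```

    With quantile spacing each bin holds roughly the same number of observations, which is why it is the common default for binned scatter plots.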

    Reference values: a review

    Reference values are used to describe the dispersion of variables in healthy individuals. They are usually reported as population-based reference intervals (RIs) comprising 95% of the healthy population. International recommendations state that the preferred method is a priori nonparametric determination from at least 120 reference individuals, but acceptable alternative methods include transference or validation of previously established RIs. The most critical steps in the determination of reference values are the selection of reference individuals, based on extensively documented inclusion and exclusion criteria, and the use of quality-controlled analytical procedures. When only small numbers of values are available, RIs can be estimated by new methods, but the reference limits thus obtained may be highly imprecise. These recommendations pose a challenge in veterinary clinical pathology, especially when only small numbers of reference individuals are available.
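    The a priori nonparametric determination amounts to reading off the empirical 2.5th and 97.5th percentiles of the sorted reference values. A minimal sketch, with simulated data standing in for a healthy reference population:

```python
import numpy as np

def reference_interval(values, coverage=0.95):
    # Nonparametric central reference interval: sort the reference
    # values and take the (1-coverage)/2 and 1-(1-coverage)/2
    # empirical quantiles. At least 120 individuals are required by
    # the recommendation discussed in the text.
    v = np.sort(np.asarray(values, float))
    if v.size < 120:
        raise ValueError("need at least 120 reference individuals")
    alpha = (1 - coverage) / 2
    return np.quantile(v, alpha), np.quantile(v, 1 - alpha)

rng = np.random.default_rng(1)
sample = rng.normal(100, 10, 200)   # simulated healthy-population analyte
lo, hi = reference_interval(sample)
```

    With fewer than ~120 individuals the sample quantiles become highly variable, which is exactly the imprecision the abstract warns about.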

    Revisiting the contribution of transpiration to global terrestrial evapotranspiration

    Even though knowing the contributions of transpiration (T), soil and open water evaporation (E), and interception (I) to terrestrial evapotranspiration (ET = T + E + I) is crucial for understanding the hydrological cycle and its connection to ecological processes, the fraction of T is unattainable by traditional measurement techniques over large scales. Previously reported global mean T/(E+T+I) from multiple independent sources, including satellite-based estimations, reanalyses, land surface models, and isotopic measurements, varies substantially, from 24% to 90%. Here we develop a new ET partitioning algorithm, which combines global evapotranspiration estimates with relationships between leaf area index (LAI) and T/(E+T) for different vegetation types, to upscale a wide range of published site-scale measurements. We show that transpiration accounts for about 57.2% (with a standard deviation of 6.8%) of global terrestrial ET. Our approach bridges the scale gap between site measurements and global model simulations, and can be readily implemented in current global climate models to improve biological CO2 flux simulations.
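    The partitioning idea can be illustrated schematically: a vegetation-type-specific LAI-to-T/(E+T) relationship scales each cell's ET estimate into a transpiration component, and the global fraction is the ET-weighted average. The saturating functional form, coefficients, and data below are all hypothetical placeholders, not the paper's fitted relationships:

```python
import numpy as np

def t_fraction(lai, a, b):
    # Hypothetical saturating LAI -> T/(E+T) relationship; the form
    # and the coefficients below are placeholders, not fitted values.
    return a * lai / (lai + b)

# Toy per-cell inputs: vegetation type, leaf area index, annual ET (mm).
veg = np.array(["forest", "grass", "forest", "crop"])
lai = np.array([5.0, 1.5, 4.0, 2.5])
et  = np.array([900.0, 400.0, 800.0, 600.0])
coef = {"forest": (0.85, 1.0), "grass": (0.75, 1.2), "crop": (0.80, 1.1)}

frac = np.array([t_fraction(l, *coef[v]) for v, l in zip(veg, lai)])
t = frac * et                            # transpiration per cell
global_t_over_et = t.sum() / et.sum()    # ET-weighted global T fraction
```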

    Improved model identification for non-linear systems using a random subsampling and multifold modelling (RSMM) approach

    In non-linear system identification, the available observed data are conventionally partitioned into two parts: the training data, used for model identification, and the test data, used for model performance testing. This sort of 'hold-out' or 'split-sample' data partitioning method is convenient, and the associated model identification procedure is in general easy to implement. The resultant model obtained from such a once-partitioned single training dataset, however, may occasionally lack robustness and generalisation to future unseen data, because the performance of the identified model may depend strongly on how the data partition is made. To overcome this drawback of the hold-out data partitioning method, this study presents a new random subsampling and multifold modelling (RSMM) approach to produce less biased, or preferably unbiased, models. The basic idea and the associated procedure are as follows. First, generate K training datasets (and also K validation datasets) using a K-fold random subsampling method. Second, detect significant model terms and identify a common model structure that fits all K datasets, using a newly proposed common model selection approach called the multiple orthogonal search algorithm. Finally, estimate and refine the model parameters of the identified common-structured model using a multifold parameter estimation method. The proposed method can produce robust models with better generalisation performance.
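    The first step — generating K training/validation datasets by random subsampling — can be sketched as follows; the split fraction is an assumption, not taken from the paper:

```python
import numpy as np

def random_subsample_folds(n, k, train_frac=0.8, seed=0):
    # Generate k random train/validation partitions of n observations
    # by repeated random subsampling; unlike classic cross-validation,
    # the validation sets of different folds may overlap. The split
    # fraction is an assumed choice.
    rng = np.random.default_rng(seed)
    n_train = int(round(train_frac * n))
    folds = []
    for _ in range(k):
        perm = rng.permutation(n)
        folds.append((perm[:n_train], perm[n_train:]))
    return folds

folds = random_subsample_folds(n=100, k=5)
```

    The later steps (common term selection across all K folds and multifold parameter estimation) would then operate on these index sets.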

    Maximum likelihood and pseudo score approaches for parametric time-to-event analysis with informative entry times

    We develop a maximum likelihood estimation approach for time-to-event Weibull regression models with outcome-dependent sampling, where sampling of subjects depends on the residual fraction of the time left to developing the event of interest. Additionally, we propose a two-stage approach which proceeds by iteratively estimating, through a pseudo score, the Weibull parameters of interest (i.e., the regression parameters) conditional on the inverse probability of sampling weights, and then re-estimating these weights (given the updated Weibull parameter estimates) through the profiled full likelihood. With these two new methods, both the sampling mechanism parameters and the Weibull parameters are consistently estimated under correct specification of the conditional referral distribution. Standard errors for the regression parameters are obtained directly by inverting the observed information matrix in the full likelihood specification, and by calculating either bootstrap or robust standard errors for the hybrid pseudo score/profiled likelihood approach. Loss of efficiency with the latter approach is considered. Robustness of the proposed methods to misspecification of the referral mechanism and the time-to-event distribution is also briefly examined. Further, we show how to extend our methods to the family of parametric time-to-event distributions characterized by the generalized gamma distribution. The motivation for these two approaches came from data on time to cirrhosis from hepatitis C viral infection in patients referred to the Edinburgh liver clinic. We analyze these data here. Comment: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/14-AOAS725.
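    As a baseline for the full-likelihood step, a Weibull regression fit by maximum likelihood with right-censoring can be sketched as below. This omits the paper's outcome-dependent sampling correction (the pseudo-score stage would reweight each subject's likelihood contribution by its inverse sampling probability); the simulated design and starting values are assumptions:

```python
import numpy as np
from scipy.optimize import minimize

def neg_loglik(params, t, d, X):
    # Right-censored Weibull log-likelihood with scale exp(x'beta):
    # events contribute log-hazard plus log-survival, censored
    # observations contribute log-survival only.
    log_k, beta = params[0], params[1:]
    k = np.exp(log_k)                        # shape > 0
    lam = np.exp(X @ beta)                   # scale = exp(x'beta)
    z = (t / lam) ** k                       # -log survival
    return -np.sum(d * (log_k - k * np.log(lam) + (k - 1) * np.log(t)) - z)

rng = np.random.default_rng(2)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true, k_true = np.array([1.0, 0.5]), 1.5
t = rng.weibull(k_true, n) * np.exp(X @ beta_true)   # event times
c = rng.exponential(10.0, n)                         # censoring times
d = (t <= c).astype(float)                           # event indicator
t = np.minimum(t, c)                                 # observed times

res = minimize(neg_loglik, x0=np.zeros(3), args=(t, d, X),
               method="Nelder-Mead")
k_hat, beta_hat = np.exp(res.x[0]), res.x[1:]
```

    In the full-likelihood approach, standard errors would come from inverting the observed information matrix at the maximum.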

    Multi-View Face Recognition From Single RGBD Models of the Faces

    This work takes important steps towards solving the following problem of current interest: assuming that each individual in a population can be modeled by a single frontal RGBD face image, is it possible to carry out face recognition for such a population using multiple 2D images captured from arbitrary viewpoints? Although the general problem as stated is extremely challenging, it encompasses subproblems that can be addressed today. The subproblems addressed in this work relate to: (1) generating a large set of viewpoint-dependent face images from a single RGBD frontal image of each individual; (2) using hierarchical approaches based on view-partitioned subspaces to represent the training data; and (3) based on these hierarchical approaches, using a weighted voting algorithm to integrate the evidence collected from multiple images of the same face recorded from different viewpoints. We evaluate our methods on three datasets: a dataset of 10 people that we created and two publicly available datasets comprising a total of 48 people. In addition to providing important insights into the nature of this problem, our results show that we are able to recognize faces with accuracies of 95% or higher, outperforming existing state-of-the-art face recognition approaches based on deep convolutional neural networks.
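    The evidence-integration step (3) can be illustrated with a toy weighted vote over per-view similarity scores; the scores, weights, and array shapes are hypothetical, not taken from the paper:

```python
import numpy as np

def weighted_vote(scores, weights):
    # scores: (n_views, n_identities) per-view similarity scores;
    # weights: (n_views,) per-view confidences. The winning identity
    # maximizes the confidence-weighted sum of scores.
    scores = np.asarray(scores, float)
    w = np.asarray(weights, float)
    combined = (w[:, None] * scores).sum(axis=0)
    return int(np.argmax(combined)), combined

scores = [[0.9, 0.1, 0.3],   # near-frontal view: strong match to identity 0
          [0.4, 0.5, 0.2],   # profile view: weak and ambiguous
          [0.7, 0.2, 0.4]]   # oblique view
weights = [1.0, 0.3, 0.8]    # hypothetical per-view confidences
winner, combined = weighted_vote(scores, weights)
```

    Down-weighting unreliable views lets a weak, ambiguous match (here the profile view) contribute without overturning stronger evidence.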