16 research outputs found

    Honest inference for discrete outcomes

    We investigate the consequences of discreteness in the assignment variable in regression-discontinuity designs for cases where the outcome variable is itself discrete. We find that constructing confidence intervals that have the correct level of coverage in these cases is sensitive to the assumed distribution of unobserved heterogeneity. Since local linear estimators are improperly centered, a smaller variance for unobserved heterogeneity in discrete outcomes actually requires larger confidence intervals: standard confidence intervals become narrower around a biased estimator, leading to a higher-than-nominal false positive rate. We provide a method for mapping structural assumptions regarding the distribution and variance of unobserved heterogeneity to the construction of "honest" confidence intervals that have the correct level of coverage. An application to retirement behavior reveals that the spike in retirement at age 62 in the United States can be reconciled with a wider range of values for the variance of unobserved heterogeneity (due to reservation wages or offers) than the spike at age 65.
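The coverage argument in this abstract can be illustrated with a small calculation. The sketch below is not the authors' procedure: it assumes a normally distributed estimator with a fixed bias and a known standard error, and the function names `coverage` and `honest_critical_value` are hypothetical. It shows that the usual interval loses coverage as the standard error shrinks around a biased estimate, and how a generic bias-aware critical value restores the nominal level under an assumed bound on the bias.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def coverage(bias, se, level=0.95):
    """Coverage of the usual interval (estimate +/- z * se) when the
    estimator is N(theta + bias, se^2). Assumption: bias and se are known."""
    z = STD_NORMAL.inv_cdf(0.5 + level / 2)
    t = abs(bias) / se  # bias measured in standard-error units
    return STD_NORMAL.cdf(z - t) - STD_NORMAL.cdf(-z - t)


def honest_critical_value(max_bias, se, level=0.95, tol=1e-10):
    """Smallest c such that estimate +/- c * se covers theta with
    probability >= level whenever |bias| <= max_bias (found by bisection)."""
    t = abs(max_bias) / se
    lo, hi = 0.0, STD_NORMAL.inv_cdf(0.5 + level / 2) + t
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if STD_NORMAL.cdf(mid - t) - STD_NORMAL.cdf(-mid - t) < level:
            lo = mid
        else:
            hi = mid
    return hi


# Same bias, shrinking standard error: the usual interval covers less often.
for se in (1.0, 0.5, 0.25):
    print(se, round(coverage(0.5, se), 3), round(honest_critical_value(0.5, se), 3))
```

With the bias held fixed, a smaller standard error makes the bias larger in standard-error units, so the critical value needed for correct coverage grows.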

    Non-Negative Matrix Factorization for Learning Alignment-Specific Models of Protein Evolution

    Models of protein evolution currently come in two flavors: generalist and specialist. Generalist models (e.g. PAM, JTT, WAG) adopt a one-size-fits-all approach, where a single model is estimated from a number of different protein alignments. Specialist models (e.g. mtREV, rtREV, HIVbetween) can be estimated when a large quantity of data are available for a single organism or gene, and are intended for use on that organism or gene only. Unsurprisingly, specialist models outperform generalist models, but in most instances there simply are not enough data available to estimate them. We propose a method for estimating alignment-specific models of protein evolution in which the complexity of the model is adapted to suit the richness of the data. Our method uses non-negative matrix factorization (NNMF) to learn a set of basis matrices from a general dataset containing a large number of alignments of different proteins, thus capturing the dimensions of important variation. It then learns a set of weights that are specific to the organism or gene of interest and for which only a smaller dataset is available. Thus the alignment-specific model is obtained as a weighted sum of the basis matrices. Having been constrained to vary along only as many dimensions as the data justify, the model has far fewer parameters than would be required to estimate a specialist model. We show that our NNMF procedure produces models that outperform existing methods on all but one of 50 test alignments. The basis matrices we obtain confirm the expectation that amino acid properties tend to be conserved, and allow us to quantify, on specific alignments, how the strength of conservation varies across different properties. We also apply our new models to phylogeny inference and show that the resulting phylogenies are different from, and have improved likelihood over, those inferred under standard models.
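The two-stage idea in this abstract (learn shared basis matrices, then fit a few alignment-specific weights) can be sketched as follows. This is a toy illustration only, not the authors' implementation: it assumes each alignment is summarized as a flattened non-negative vector, uses the standard Lee-Seung multiplicative updates with a squared-error objective (the paper may optimize a different criterion), and all names and the toy data are hypothetical.

```python
import random


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def sq_error(V, W, H):
    """Sum of squared differences between V and the product W H."""
    WH = matmul(W, H)
    return sum((v - x) ** 2 for vr, xr in zip(V, WH) for v, x in zip(vr, xr))


def nmf(V, rank, iters=500, seed=0, eps=1e-9):
    """Factor non-negative V (n x m) as W (n x rank) times H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(hr, nr, dr)]
             for hr, nr, dr in zip(H, num, den)]
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(wr, nr, dr)]
             for wr, nr, dr in zip(W, num, den)]
    return W, H


def fit_weights(v, H, iters=500, eps=1e-9):
    """Stage two: keep the basis H fixed and fit non-negative weights for v."""
    Ht = transpose(H)
    num = matmul([v], Ht)[0]
    HHt = matmul(H, Ht)
    w = [1.0] * len(H)
    for _ in range(iters):
        den = matmul([w], HHt)[0]
        w = [wi * a / (b + eps) for wi, a, b in zip(w, num, den)]
    return w


# Toy "general dataset": four alignments built from two hidden bases.
TRUE_BASES = [[1, 0, 2, 0, 1, 3], [0, 2, 0, 1, 1, 0]]
TRUE_WEIGHTS = [[1.0, 0.5], [0.2, 2.0], [1.5, 1.5], [0.8, 0.1]]
V = matmul(TRUE_WEIGHTS, TRUE_BASES)

W, H = nmf(V, rank=2)                      # stage one: shared basis
new_alignment = matmul([[0.6, 1.2]], TRUE_BASES)[0]
w_new = fit_weights(new_alignment, H)      # stage two: two weights only
model = matmul([w_new], H)[0]              # weighted sum of the bases
```

In this sketch the new alignment needs only `rank` parameters rather than a full vector, which mirrors the abstract's point that the alignment-specific model has far fewer parameters than a specialist model.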

    Multiple novel prostate cancer susceptibility signals identified by fine-mapping of known risk loci among Europeans

    Genome-wide association studies (GWAS) have identified numerous common prostate cancer (PrCa) susceptibility loci. We have fine-mapped 64 GWAS regions known at the conclusion of the iCOGS study using large-scale genotyping and imputation in 25 723 PrCa cases and 26 274 controls of European ancestry. We detected evidence for multiple independent signals at 16 regions, 12 of which contained additional newly identified significant associations. A single signal comprising a spectrum of correlated variation was observed at 39 regions, 35 of which are now described by a novel, more significantly associated lead SNP, while the originally reported variant remained as the lead SNP in only 4 regions. We also confirmed two association signals in Europeans that had been previously reported only in East-Asian GWAS. Based on statistical evidence and linkage disequilibrium (LD) structure, we have curated and narrowed down the list of the most likely candidate causal variants for each region. Functional annotation using data from ENCODE filtered for PrCa cell lines and eQTL analysis demonstrated significant enrichment for overlap with bio-features within this set. By incorporating the novel risk variants identified here alongside the refined data for existing association signals, we estimate that these loci now explain ∼38.9% of the familial relative risk of PrCa, an 8.9% improvement over the previously reported GWAS tag SNPs. This suggests that a significant fraction of the heritability of PrCa may have been hidden during the discovery phase of GWAS, in particular due to the presence of multiple independent signals within the same region.

    Using administrative data to examine telemedicine usage among Medicaid beneficiaries during the Coronavirus Disease 2019 pandemic

    Background: The coronavirus disease 2019 (COVID-19) pandemic necessitated the replacement of in-person physician consultations with telemedicine. During the pandemic, Medicaid covered the cost of telemedicine visits. Objectives: The aim was to measure the adoption of telemedicine during the pandemic. We focus on key patient subgroups including those with chronic conditions, those living in urban versus rural areas, and different age groups. Methods: This study examined the universe of claims made by Florida Medicaid beneficiaries (n=2.4 million) between January 2019 and July 2020. Outpatient visits were identified as in-person or telemedicine. Telemedicine visits were classified into audio-visual or audio-only visits. Results: We find that telemedicine offsets much of the decline in in-person outpatient visits among Florida’s Medicaid enrollees; however, uptake differs by enrollee type. High utilizers of care and beneficiaries with chronic conditions were significantly more likely to use telemedicine, while enrollees living in rural areas and health professional shortage areas were moderately less likely to use telemedicine. Elderly Medicaid recipients (dual-eligibles) used audio-only telemedicine visits at higher rates than other age groups, and the demand for these consultations is more persistent. Conclusions: Telemedicine offset the decline in health care utilization among Florida’s Medicaid-enrolled population during the novel coronavirus pandemic, with particularly high uptake among those with prior histories of high utilization. Audio-only visits are a potentially important method of delivery for the oldest Medicaid beneficiaries.

    Self assessment of strengths, weaknesses and self confidence of primary care physicians taking care of rheumatic diseases

    Background: Rheumatologic diseases are common and frequently managed by primary care physicians. Aim: To assess the strengths, weaknesses and self-confidence of primary care physicians in the management of rheumatic diseases. Material and methods: An anonymous self-assessment questionnaire was mailed to primary care physicians in two Chilean regions. Using a 10-point Likert scale, they were asked about personal interest, undergraduate training, continuing medical education, availability of medical literature, complementary laboratory tests and consultation with a rheumatologist. Medical skills, knowledge, therapeutic approach and performance of rheumatologic procedures were evaluated under the item confidence. Results: Three hundred forty-seven of 763 physicians (45%) answered the questionnaire. Their ages ranged from 25 to 75 years, 59% were male, 58% were Chilean and 74% worked in the Metropolitan region. The worst evaluated parameters were availability of literature wit

    for all models with gamma rate variation (4 categories).

    Each table entry is the number of datasets with in that range. For any dataset, the best model has . A model with has essentially no support.

    NNMF basis matrices.

    The set of NNMF basis matrices obtained for ranks ranging from 1 to 5. Amino acids are ordered according to their Stanfel classification [25]. Rates are indicated in grayscale, with pure white being a rate of zero and pure black being the maximum rate in the matrix.

    scores for all models.

    Each table entry is the number of datasets with in that range. For any dataset, the best model has . A model with has essentially no support.