167 research outputs found

    Health Disparities in Colorectal Cancer Screening in United States: Race/ethnicity or Shifting Paradigms?

    Background: Colorectal cancer (CRC) remains the third leading cause of cancer death in the United States. Incidence, mortality, and screening rates vary by race/ethnicity, with African Americans and Hispanics being disproportionately represented. Early detection through screening prolongs survival and decreases mortality. CRC screening (CRCS) varies by race/ethnicity, with lower prevalence observed among minorities, but the factors associated with such disparities remain to be fully understood. The current study aimed to examine ethnic/racial disparities in the prevalence of CRCS, and the explanatory factors therein, in a large sample of U.S. residents, using the 2003 National Health Interview Survey. Materials and Methods: A cross-sectional, epidemiologic design was used, with a chi-square test to assess the prevalence of CRCS and a survey logistic regression model to assess the odds of being screened. Results: There was significant variability in CRCS, with minorities demonstrating lower prevalence relative to Caucasians, χ2(3) = 264.4, p < 0.0001. After controlling for the covariates, racial/ethnic disparities in CRCS persisted. Compared with Caucasians, African Americans/Blacks were 28% less likely (adjusted prevalence odds ratio [APOR] = 0.72; 99% CI, 0.60-0.80), Hispanics 33% less likely (APOR = 0.67; 99% CI, 0.53-0.84), and Asians 37% less likely (APOR = 0.63; 99% CI, 0.43-0.95) to be screened for CRC. Conclusion: Among older Americans, racial/ethnic disparities in CRCS exist that are not explained by racial/ethnic variance in the covariates associated with CRCS. These findings call for further studies to improve understanding of the confounders and mediators of disparities in CRCS and to apply these factors, including the health belief model, to improving CRCS among ethnic/racial minorities.
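
    As an illustration of the analysis described above, the sketch below runs a chi-square test of screening prevalence by race/ethnicity and a weight-adjusted logistic regression yielding odds ratios with 99% confidence intervals. The file name, column names, and use of statsmodels are assumptions for illustration; a full design-based NHIS analysis would also account for strata and primary sampling units, which this simplified version omits.

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    from scipy.stats import chi2_contingency

    # Hypothetical respondent-level extract of NHIS 2003 (all names are illustrative)
    df = pd.read_csv("nhis_2003_extract.csv")

    # 1) Chi-square test: CRC screening prevalence by race/ethnicity
    table = pd.crosstab(df["race_ethnicity"], df["crc_screened"])
    chi2, p, dof, _ = chi2_contingency(table)
    print(f"chi2({dof}) = {chi2:.1f}, p = {p:.4g}")

    # 2) Weight-adjusted logistic regression for adjusted prevalence odds ratios
    covariates = df[["race_ethnicity", "age", "sex", "education", "insurance"]]
    X = sm.add_constant(pd.get_dummies(covariates, drop_first=True).astype(float))
    fit = sm.GLM(df["crc_screened"], X,
                 family=sm.families.Binomial(),
                 freq_weights=df["sample_weight"]).fit()

    # Odds ratios with 99% confidence intervals (alpha = 0.01)
    summary = np.exp(pd.concat([fit.params.rename("OR"), fit.conf_int(alpha=0.01)], axis=1))
    print(summary)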

    A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference

    This paper introduces the Multi-Genre Natural Language Inference (MultiNLI) corpus, a dataset designed for use in the development and evaluation of machine learning models for sentence understanding. In addition to being one of the largest corpora available for the task of NLI, at 433k examples, this corpus improves upon available resources in its coverage: it offers data from ten distinct genres of written and spoken English, making it possible to evaluate systems on nearly the full complexity of the language, and it offers an explicit setting for the evaluation of cross-genre domain adaptation. Comment: 10 pages, 1 figure, 5 tables. v2 corrects a misreported accuracy number for the CBOW model in the 'matched' setting. v3 adds a discussion of the difficulty of the corpus to the analysis section. v4 is the version that was accepted to NAACL 2018.
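
    As a minimal sketch of the cross-genre evaluation setting mentioned above, the snippet below loads the copy of MultiNLI hosted on the Hugging Face Hub (the dataset id "multi_nli" refers to that mirror, not to the original distribution) and lists the genres covered by each split.

    from datasets import load_dataset

    mnli = load_dataset("multi_nli")
    print(mnli["train"].num_rows)  # roughly 393k training pairs (433k examples overall)

    # The training set covers five genres; "validation_matched" draws from the
    # same genres, while "validation_mismatched" holds out five unseen genres,
    # giving an explicit cross-genre domain-adaptation test.
    for split in ("train", "validation_matched", "validation_mismatched"):
        print(split, sorted(set(mnli[split]["genre"])))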

    The Validity of Evaluation Results: Assessing Concurrence Across Compositionality Benchmarks

    NLP models have progressed drastically in recent years, according to numerous datasets proposed to evaluate performance. Questions remain, however, about how particular dataset design choices may impact the conclusions we draw about model capabilities. In this work, we investigate this question in the domain of compositional generalization. We examine the performance of six modeling approaches across 4 datasets, split according to 8 compositional splitting strategies, ranking models by 18 compositional generalization splits in total. Our results show that: i) the datasets, although all designed to evaluate compositional generalization, rank modeling approaches differently; ii) datasets generated by humans align better with each other than they do with synthetic datasets, or than synthetic datasets do among themselves; iii) generally, whether datasets are sampled from the same source is more predictive of the resulting model ranking than whether they maintain the same interpretation of compositionality; and iv) which lexical items are used in the data can strongly impact conclusions. Overall, our results demonstrate that much work remains to be done when it comes to assessing whether popular evaluation datasets measure what they intend to measure, and suggest that elucidating more rigorous standards for establishing the validity of evaluation sets could benefit the field. Comment: CoNLL 2023
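
    One way to picture the concurrence question studied here: rank the same modeling approaches by accuracy on two benchmarks and measure how similar the rankings are. The sketch below uses Kendall's tau from SciPy on made-up accuracy numbers; the metric and the scores are illustrative assumptions, not the paper's exact procedure or results.

    from scipy.stats import kendalltau

    models = ["seq2seq", "transformer", "tree_model", "pretrained_lm", "grammar_based", "retrieval"]
    acc_benchmark_a = [0.41, 0.55, 0.48, 0.73, 0.62, 0.39]  # hypothetical accuracies
    acc_benchmark_b = [0.50, 0.44, 0.47, 0.66, 0.58, 0.35]  # hypothetical accuracies

    tau, p_value = kendalltau(acc_benchmark_a, acc_benchmark_b)
    print(f"Kendall's tau = {tau:.2f} (p = {p_value:.3f})")
    # tau near 1: the two benchmarks rank the approaches alike; low or negative
    # tau: the benchmarks disagree about which approaches generalize better.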

    Are Natural Language Inference Models IMPPRESsive? Learning IMPlicature and PRESupposition

    Natural language inference (NLI) is an increasingly important task for natural language understanding, which requires one to infer whether a sentence entails another. However, the ability of NLI models to make pragmatic inferences remains understudied. We create an IMPlicature and PRESupposition diagnostic dataset (IMPPRES), consisting of >25k semi-automatically generated sentence pairs illustrating well-studied pragmatic inference types. We use IMPPRES to evaluate whether BERT, InferSent, and BOW NLI models trained on MultiNLI (Williams et al., 2018) learn to make pragmatic inferences. Although MultiNLI appears to contain very few pairs illustrating these inference types, we find that BERT learns to draw pragmatic inferences. It reliably treats scalar implicatures triggered by "some" as entailments. For some presupposition triggers like "only", BERT reliably recognizes the presupposition as an entailment, even when the trigger is embedded under an entailment-canceling operator like negation. BOW and InferSent show weaker evidence of pragmatic reasoning. We conclude that NLI training encourages models to learn some, but not all, pragmatic inferences. Comment: to appear in Proceedings of ACL 2020
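
    The kind of probe IMPPRES enables can be sketched as follows: give a MultiNLI-trained NLI model a scalar-implicature pair and check whether the "some -> not all" inference is labeled as entailment. The example sentences are ours, and "roberta-large-mnli" is one publicly available MultiNLI-trained checkpoint, not necessarily a model evaluated in the paper.

    from transformers import pipeline

    nli = pipeline("text-classification", model="roberta-large-mnli")

    premise = "Some of the students passed the exam."
    hypothesis = "Not all of the students passed the exam."  # scalar implicature of "some"

    scores = nli({"text": premise, "text_pair": hypothesis}, top_k=None)
    print(scores)  # probabilities for CONTRADICTION / NEUTRAL / ENTAILMENT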