153 research outputs found

    Health Disparities in Colorectal Cancer Screening in United States: Race/ethnicity or Shifting Paradigms?

    Background: Colorectal cancer (CRC) remains the third leading cause of cancer death in the United States. Incidence, mortality, and screening vary by race/ethnicity, with African Americans and Hispanics disproportionately affected. Early detection through screening prolongs survival and decreases mortality. CRC screening (CRCS) varies by race/ethnicity, with lower prevalence observed among minorities, but the factors associated with these disparities remain to be fully understood. The current study aimed to examine racial/ethnic disparities in the prevalence of CRCS, and the explanatory factors therein, in a large sample of U.S. residents, using the National Health Interview Survey, 2003. Materials and Methods: A cross-sectional, epidemiologic design was used, with a chi-square test to assess the prevalence of CRCS and a survey logistic regression model to assess the odds of being screened. Results: There was significant variability in CRCS, with minorities demonstrating lower prevalence relative to Caucasians, χ2(3) = 264.4, p < 0.0001. After controlling for the covariates, racial/ethnic disparities in CRCS persisted. Compared to Caucasians, African Americans/Blacks were 28% less likely (adjusted prevalence odds ratio [APOR] = 0.72, 99% CI 0.60-0.80), Hispanics 33% less likely (APOR = 0.67, 99% CI 0.53-0.84), and Asians 37% less likely (APOR = 0.63, 99% CI 0.43-0.95) to be screened for CRC. Conclusion: Among older Americans, racial/ethnic disparities in CRCS exist that were not explained by racial/ethnic variation in the covariates associated with CRCS. These findings call for further studies to improve understanding of the confounders and mediators of disparities in CRCS, and for the application of these factors, including the health belief model, to improve CRCS among racial/ethnic minorities.
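    A minimal sketch of the kind of analysis this abstract describes: a chi-square test of screening prevalence by race/ethnicity, followed by a logistic regression whose exponentiated coefficients give prevalence odds ratios with 99% confidence intervals. The file name and column names (race, screened) are hypothetical placeholders, not NHIS variables, and the sketch omits the complex survey design (weights, strata, PSUs) the study's survey logistic regression would use.

```python
# Illustrative sketch only: chi-square test of screening prevalence by race/ethnicity,
# plus a logistic regression yielding odds ratios with 99% CIs.
# File and column names are hypothetical; a real analysis would use the NHIS survey design.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy.stats import chi2_contingency

df = pd.read_csv("nhis_2003_extract.csv")  # hypothetical analysis file

# Chi-square test on the race-by-screened contingency table
table = pd.crosstab(df["race"], df["screened"])
chi2, p, dof, _ = chi2_contingency(table)
print(f"chi2({dof}) = {chi2:.1f}, p = {p:.4g}")

# Logistic regression with Caucasians/Whites as the reference group;
# exponentiated coefficients approximate the reported prevalence odds ratios.
model = smf.logit("screened ~ C(race, Treatment(reference='White'))", data=df).fit(disp=False)
odds_ratios = np.exp(model.params)
ci_99 = np.exp(model.conf_int(alpha=0.01))  # 99% confidence intervals
print(pd.concat([odds_ratios, ci_99], axis=1))
```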

    A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference

    This paper introduces the Multi-Genre Natural Language Inference (MultiNLI) corpus, a dataset designed for use in the development and evaluation of machine learning models for sentence understanding. In addition to being one of the largest corpora available for the task of NLI, at 433k examples, this corpus improves upon available resources in its coverage: it offers data from ten distinct genres of written and spoken English, making it possible to evaluate systems on nearly the full complexity of the language, and it offers an explicit setting for the evaluation of cross-genre domain adaptation. Comment: 10 pages, 1 figure, 5 tables. v2 corrects a misreported accuracy number for the CBOW model in the 'matched' setting. v3 adds a discussion of the difficulty of the corpus to the analysis section. v4 is the version that was accepted to NAACL 2018.
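    As an illustration of the cross-genre evaluation setting the abstract mentions, here is a minimal sketch assuming the Hugging Face datasets copy of the corpus (dataset id multi_nli); it groups validation examples by their genre tag so that matched and mismatched accuracy can be reported per genre. The grouping helper is hypothetical glue code, not part of the corpus release.

```python
# Sketch: load MultiNLI and bucket validation examples by genre for per-genre evaluation.
# Assumes the Hugging Face "multi_nli" dataset (premise, hypothesis, label, genre fields).
from collections import defaultdict
from datasets import load_dataset

mnli = load_dataset("multi_nli")

def by_genre(split):
    """Group examples by genre tag so accuracy can be computed per genre."""
    buckets = defaultdict(list)
    for ex in split:
        buckets[ex["genre"]].append((ex["premise"], ex["hypothesis"], ex["label"]))
    return buckets

matched = by_genre(mnli["validation_matched"])        # genres also seen in training
mismatched = by_genre(mnli["validation_mismatched"])  # held-out genres

for genre, examples in matched.items():
    print(f"{genre}: {len(examples)} matched validation pairs")
```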

    The Validity of Evaluation Results: Assessing Concurrence Across Compositionality Benchmarks

    NLP models have progressed drastically in recent years, according to the numerous datasets proposed to evaluate their performance. Questions remain, however, about how particular dataset design choices may impact the conclusions we draw about model capabilities. In this work, we investigate this question in the domain of compositional generalization. We examine the performance of six modeling approaches across four datasets, split according to eight compositional splitting strategies, ranking models by 18 compositional generalization splits in total. Our results show that: i) the datasets, although all designed to evaluate compositional generalization, rank modeling approaches differently; ii) datasets generated by humans align better with each other than they do with synthetic datasets, or than synthetic datasets do among themselves; iii) generally, whether datasets are sampled from the same source is more predictive of the resulting model ranking than whether they maintain the same interpretation of compositionality; and iv) which lexical items are used in the data can strongly impact conclusions. Overall, our results demonstrate that much work remains to be done when it comes to assessing whether popular evaluation datasets measure what they intend to measure, and suggest that elucidating more rigorous standards for establishing the validity of evaluation sets could benefit the field. Comment: CoNLL 2023
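    The concurrence question the abstract raises, whether two benchmarks rank the same set of models the same way, is commonly quantified with a rank correlation; the sketch below uses Kendall's tau over two per-benchmark accuracy tables. The model names and scores are invented placeholders for illustration and do not come from the paper.

```python
# Sketch: quantify how well two benchmarks agree on a model ranking
# using Kendall's tau rank correlation. All scores below are invented placeholders.
from scipy.stats import kendalltau

models = ["transformer", "lstm_seq2seq", "tree_model", "pretrained_lm"]
accuracy_on_benchmark_a = [0.62, 0.41, 0.55, 0.78]  # hypothetical
accuracy_on_benchmark_b = [0.30, 0.22, 0.35, 0.51]  # hypothetical

tau, p_value = kendalltau(accuracy_on_benchmark_a, accuracy_on_benchmark_b)
print(f"Kendall tau = {tau:.2f} (p = {p_value:.3f}); "
      "tau near 1 means the two benchmarks rank the models similarly.")
```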

    Are Natural Language Inference Models IMPPRESsive? Learning IMPlicature and PRESupposition

    Natural language inference (NLI) is an increasingly important task for natural language understanding, which requires one to infer whether a sentence entails another. However, the ability of NLI models to make pragmatic inferences remains understudied. We create an IMPlicature and PRESupposition diagnostic dataset (IMPPRES), consisting of >25k semi-automatically generated sentence pairs illustrating well-studied pragmatic inference types. We use IMPPRES to evaluate whether BERT, InferSent, and BOW NLI models trained on MultiNLI (Williams et al., 2018) learn to make pragmatic inferences. Although MultiNLI appears to contain very few pairs illustrating these inference types, we find that BERT learns to draw pragmatic inferences. It reliably treats scalar implicatures triggered by "some" as entailments. For some presupposition triggers like "only", BERT reliably recognizes the presupposition as an entailment, even when the trigger is embedded under an entailment-canceling operator like negation. BOW and InferSent show weaker evidence of pragmatic reasoning. We conclude that NLI training encourages models to learn some, but not all, pragmatic inferences. Comment: to appear in Proceedings of ACL 2020
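    To make this kind of probe concrete, here is a minimal sketch that feeds one hand-written scalar implicature pair (a "some" premise and a "not all" hypothesis) to an off-the-shelf MultiNLI-trained classifier. It uses the publicly available roberta-large-mnli checkpoint rather than the exact models evaluated in the paper, and the example pair is invented, not drawn from IMPPRES.

```python
# Sketch: probe an MNLI-trained model on a hand-written scalar implicature pair.
# Uses the public roberta-large-mnli checkpoint, not the models from the paper.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "roberta-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

premise = "Some of the students passed the exam."        # invented example
hypothesis = "Not all of the students passed the exam."  # the scalar implicature

inputs = tokenizer(premise, hypothesis, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# If the model treats the implicature as an inference, the label will be ENTAILMENT.
predicted = model.config.id2label[int(logits.argmax(dim=-1))]
print(predicted)
```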