Check-COVID: Fact-Checking COVID-19 News Claims with Scientific Evidence
We present a new fact-checking benchmark, Check-COVID, that requires systems
to verify claims about COVID-19 from news using evidence from scientific
articles. This approach to fact-checking is particularly challenging as it
requires checking internet text written in everyday language against evidence
from journal articles written in formal academic language. Check-COVID contains
1,504 expert-annotated news claims about the coronavirus paired with
sentence-level evidence from scientific journal articles and veracity labels.
It includes both extracted (journalist-written) and composed
(annotator-written) claims. Experiments using both a fact-checking specific
system and GPT-3.5, which respectively achieve F1 scores of 76.99 and 69.90 on
this task, reveal the difficulty of automatically fact-checking both claim
types and the importance of in-domain data for good performance. Our data and
models are released publicly at https://github.com/posuer/Check-COVID.
Comment: Accepted as ACL 2023 Findings
Assessment of exposure to air pollution in children: Determining whether wearing a personal monitor affects physical activity.
Personal air pollution monitoring in research studies should not interfere with usual patterns of behavior or bias results. In an urban pediatric cohort study, we tested whether wearing an air monitor affected activity time, based on continuous watch-based accelerometry. The majority of participants (71%) reported that activity while wearing the monitor mimicked normal activity. Correspondingly, variation in activity while wearing versus not wearing the monitor did not differ greatly from baseline variation in activity (P = 0.84).
State-of-the-art methods for exposure-health studies: Results from the exposome data challenge event
The exposome recognizes that individuals are exposed simultaneously to a multitude of different environmental factors and takes a holistic approach to the discovery of etiological factors for disease. However, challenges arise when trying to quantify the health effects of complex exposure mixtures. Analytical challenges include dealing with high dimensionality, studying the combined effects of these exposures and their interactions, integrating causal pathways, and integrating high-throughput omics layers. To tackle these challenges, the Barcelona Institute for Global Health (ISGlobal) held a data challenge event open to researchers from all over the world and from all areas of expertise. Analysts had a chance to compete and apply state-of-the-art methods on a common partially simulated exposome dataset (based on real case data from the HELIX project) with multiple correlated exposure variables (P > 100 exposure variables) arising from general and personal environments at different time points, biological molecular data (multi-omics: DNA methylation, gene expression, proteins, metabolomics) and multiple clinical phenotypes in 1301 mother–child pairs. Most of the methods presented included feature selection or feature reduction to deal with the high dimensionality of the exposome dataset. Several approaches explicitly searched for combined effects of exposures and/or their interactions using linear index models or response surface methods, including Bayesian methods. Other methods dealt with the multi-omics dataset in mediation analyses using multiple-step approaches. Here we discuss features of the statistical models used and provide the data and code used, so that analysts have examples of implementation and can learn how to use these methods.
Overall, the exposome data challenge presented a unique opportunity for researchers from different disciplines to create and share state-of-the-art analytical methods, setting a new standard for open science in the exposome and environmental health field.
Additional file 1: Figure S1. of Effect of personal exposure to black carbon on changes in allergic asthma gene methylation measured 5 days later in urban children: importance of allergic sensitization
Conserved promoter regions. Black lines mark loci that are conserved between human and mouse in the promoter region of IL4, IFNγ, and ARG2. White areas are not conserved. Conserved regions were identified using Standard Nucleotide BLAST (blastn for more dissimilar regions; https://blast.ncbi.nlm.nih.gov/Blast.cgi) for the 400 nucleotides upstream of the transcriptional start site (TSS) in the human sequence. The NOS2A promoter region under investigation is not conserved between mouse and human. Figure S2: Schematic demonstration of collected measures. Numbers in the box represent the number of participants. N:n = number of repeat subjects: number of observations. The grey dotted box indicates that two measures (Time 1 and Time 2, 6 months apart) were available; the white box indicates that only one measure (Time 1) was available. N = 10 participants were dropped due to invalid personal or residential air pollution measures. N = 17 participants were further excluded from the analysis due to missing total IgE (N = 16) or invalid DNA methylation caused by technical failures in the laboratory (N = 1), resulting in a final sample size of N = 136. Figure S3: Correlations between day 1 and day 6 buccal cell DNA methylation of (a) IL4 (CpG−326, CpG−48), (b) IFNγ (CpG−186, CpG−54), (c) NOS2A (CpG+5099, CpG+5106), and (d) ARG2 (average methylation of CpG−32, CpG−30, and CpG−26). Spearman correlation coefficients are presented. (DOCX 466 kb)