Syntax-Aware Graph-to-Graph Transformer for Semantic Role Labelling
Recent models have shown that incorporating syntactic knowledge into the
semantic role labelling (SRL) task leads to significant improvements. In this
paper, we propose the Syntax-aware Graph-to-Graph Transformer (SynG2G-Tr)
model, which encodes syntactic structure with a novel method for inputting
graph relations as embeddings directly into the Transformer's self-attention
mechanism. This approach adds a soft bias towards attention patterns that
follow the syntactic structure, while still allowing the model to use this
information to learn alternative patterns. We evaluate our model on both
span-based and dependency-based SRL datasets, and outperform previous
alternative methods in both in-domain and out-of-domain settings on the CoNLL
2005 and CoNLL 2009 datasets.
Comment: Accepted to Rep4NLP at ACL 202
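As a rough illustration of the mechanism the abstract describes, the sketch below adds a bias term computed from graph-relation embeddings to standard dot-product attention scores. All shapes and names here are hypothetical; this is a minimal single-head sketch, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def graph_aware_attention(Q, K, V, rel_emb):
    """Self-attention with an additive bias from graph-relation embeddings.

    Q, K, V: (n, d) query/key/value matrices.
    rel_emb: (n, n, d) embedding of the syntactic relation between tokens
             i and j (a zero vector where no dependency edge exists).
    """
    n, d = Q.shape
    # Standard dot-product scores.
    scores = Q @ K.T
    # Soft bias: each query also attends to the relation embedding, nudging
    # attention toward syntactic neighbours without forcing it to follow them.
    scores += np.einsum("id,ijd->ij", Q, rel_emb)
    weights = softmax(scores / np.sqrt(d))
    return weights @ V
```

With an all-zero `rel_emb` this reduces exactly to vanilla scaled dot-product attention, which is what makes the bias "soft": the model can ignore the syntax entirely.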
Mitigating Hallucinations and Off-target Machine Translation with Source-Contrastive and Language-Contrastive Decoding
Hallucinations and off-target translation remain unsolved problems in machine
translation, especially for low-resource languages and massively multilingual
models. In this paper, we introduce methods to mitigate both failure cases with
a modified decoding objective, without requiring retraining or external models.
In source-contrastive decoding, we search for a translation that is probable
given the correct input, but improbable given a random input segment,
hypothesising that hallucinations will be similarly probable given either. In
language-contrastive decoding, we search for a translation that is probable,
but improbable given the wrong language indicator token. In experiments on
M2M-100 (418M) and SMaLL-100, we find that these methods effectively suppress
hallucinations and off-target translations, improving chrF2 by 1.7 and 1.4
points on average across 57 tested translation directions. In a proof of
concept on English--German, we also show that we can suppress off-target
translations with the Llama 2 chat models, demonstrating the applicability of
the method to machine translation with LLMs. We release our source code at
https://github.com/ZurichNLP/ContraDecode
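A minimal sketch of the contrastive objective described above, assuming access to next-token log-probabilities under both the correct input and a contrastive input (a random source segment, or the wrong language token). The weighting parameter `lam` is a hypothetical name for the interpolation weight:

```python
import numpy as np

def contrastive_scores(logp_correct, logp_contrast, lam=0.7):
    """Combine token log-probabilities for contrastive decoding.

    logp_correct:  (vocab,) log-probs of the next token given the true source.
    logp_contrast: (vocab,) log-probs given a random source segment (or, for
                   language-contrastive decoding, the wrong language token).
    Hallucinated continuations are similarly probable under both inputs, so
    subtracting the contrastive term penalises them.
    """
    return logp_correct - lam * logp_contrast
```

A token that scores highly only given the correct source keeps a high combined score, while a token that is probable regardless of the input is pushed down.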
Investigating Multi-Pivot Ensembling with Massively Multilingual Machine Translation Models
Massively multilingual machine translation models allow for the translation
of a large number of languages with a single model, but have limited
performance on low- and very-low-resource translation directions. Pivoting via
high-resource languages remains a strong strategy for low-resource directions,
and in this paper we revisit ways of pivoting through multiple languages.
Previous work has used a simple averaging of probability distributions from
multiple paths, but we find that this performs worse than using a single pivot,
and exacerbates the hallucination problem because the same hallucinations can
be probable across different paths. As an alternative, we propose MaxEns, a
combination strategy that is biased towards the most confident predictions,
hypothesising that confident predictions are less prone to be hallucinations.
We evaluate different strategies on the FLORES benchmark for 20 low-resource
language directions, demonstrating that MaxEns improves translation quality for
low-resource languages while reducing hallucination in translations, compared
to both direct translation and an averaging approach. On average, multi-pivot
strategies still lag behind using English as a single pivot language, raising
the question of how to identify the best pivoting strategy for a given
translation direction.
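The abstract does not give the exact MaxEns formula, so the following is a hypothetical sketch of one confidence-biased combination consistent with its description: each pivot path's next-token distribution is weighted by its own peak probability before averaging, so flat, hallucination-prone distributions are down-weighted.

```python
import numpy as np

def maxens_combine(dists):
    """Confidence-biased combination of next-token distributions.

    dists: (paths, vocab) array with one probability distribution per
    pivot path. Each path is weighted by its own peak probability, so
    confident (low-entropy) paths dominate the ensemble. This is a
    sketch of a confidence-biased ensemble, not necessarily the exact
    MaxEns rule from the paper.
    """
    conf = dists.max(axis=1)      # confidence of each path
    w = conf / conf.sum()         # normalise confidences into weights
    combined = w @ dists          # confidence-weighted average
    return combined / combined.sum()
```

By contrast, the simple averaging the paper argues against would use uniform weights, letting a hallucination that is moderately probable on every path accumulate mass.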
Transformers as Graph-to-Graph Models
We argue that Transformers are essentially graph-to-graph models, with
sequences just being a special case. Attention weights are functionally
equivalent to graph edges. Our Graph-to-Graph Transformer architecture makes
this ability explicit, by inputting graph edges into the attention weight
computations and predicting graph edges with attention-like functions, thereby
integrating explicit graphs into the latent graphs learned by pretrained
Transformers. Adding iterative graph refinement provides a joint embedding of
input, output, and latent graphs, allowing non-autoregressive graph prediction
to optimise the complete graph without any bespoke pipeline or decoding
strategy. Empirical results show that this architecture achieves
state-of-the-art accuracies for modelling a variety of linguistic structures,
integrating very effectively with the latent linguistic representations learned
by pretraining.
Comment: Accepted to Big Picture workshop at EMNLP 202
SMaLL-100: Introducing Shallow Multilingual Machine Translation Model for Low-Resource Languages
In recent years, multilingual machine translation models have achieved
promising performance on low-resource language pairs by sharing information
between similar languages, thus enabling zero-shot translation. To overcome the
"curse of multilinguality", these models often opt for scaling up the number of
parameters, which makes their use in resource-constrained environments
challenging. We introduce SMaLL-100, a distilled version of the M2M-100 (12B)
model, a massively multilingual machine translation model covering 100
languages. We train SMaLL-100 with uniform sampling across all language pairs
and therefore focus on preserving the performance of low-resource languages. We
evaluate SMaLL-100 on different low-resource benchmarks: FLORES-101, Tatoeba,
and TICO-19 and demonstrate that it outperforms previous massively multilingual
models of comparable sizes (200-600M) while improving inference latency and
memory usage. Additionally, our model achieves comparable results to M2M-100
(1.2B), while being 3.6x smaller and 4.3x faster at inference. Code and
pre-trained models: https://github.com/alirezamshi/small100
Comment: Accepted to EMNLP 202
The DCU-EPFL enhanced dependency parser at the IWPT 2021 shared task
We describe the DCU-EPFL submission to the IWPT 2021 Parsing Shared Task: From Raw Text to Enhanced Universal Dependencies. The task involves parsing Enhanced UD graphs, which are an extension of the basic dependency trees designed to be better suited to representing semantic structure. Evaluation is carried out on 29 treebanks in 17 languages and participants are required to parse the data from each language starting from raw strings. Our approach uses the Stanza pipeline to preprocess the text files, XLM-RoBERTa to obtain contextualized token representations, and an edge-scoring and labeling model to predict the enhanced graph. Finally, we run a postprocessing script to ensure all of our outputs are valid Enhanced UD graphs. Our system places 6th out of 9 participants with a coarse Enhanced Labeled Attachment Score (ELAS) of 83.57. We carry out additional post-deadline experiments which include using Trankit for pre-processing, XLM-RoBERTa LARGE, treebank concatenation, and multitask learning between a basic and an enhanced dependency parser. All of these modifications improve our initial score and our final system has a coarse ELAS of 88.04.
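The abstract names an edge-scoring model without giving its form. A biaffine scorer over contextualised token vectors, with per-edge thresholding rather than tree decoding, is one common design for graph-shaped enhanced-dependency parsing; the sketch below uses that design with hypothetical parameter names and omits the label classifier.

```python
import numpy as np

def biaffine_edge_scores(H, U):
    """Score every (dependent, head) token pair with a biaffine product.

    H: (n, d) contextualised token representations (e.g. from an encoder
       such as XLM-RoBERTa).
    U: (d, d) learned biaffine weight matrix (hypothetical parameter).
    Returns an (n, n) matrix where entry [i, j] scores token j as a head
    of token i.
    """
    return H @ U @ H.T

def predict_enhanced_graph(H, U, threshold=0.0):
    """Keep every edge whose score clears the threshold. Unlike a tree
    parser, a token may end up with multiple heads, which enhanced UD
    graphs permit."""
    return biaffine_edge_scores(H, U) > threshold
```

A real system would add distinct head/dependent projections of `H`, a label scorer, and a validity repair pass like the postprocessing script the abstract mentions.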
A Scoping Review of Components of Physician-induced Demand for Designing a Conceptual Framework
Objectives
The current study presents a new conceptual framework for physician-induced demand that comprises several influential components and their interactions.
Methods
This framework was developed on the basis of the conceptual model proposed by Labelle. To identify the components that influenced induced demand and their interactions, a scoping review was conducted (from January 1980 to January 2017). Additionally, an expert panel was formed to formulate and expand the framework.
Results
The developed framework comprises 2 main sets of components. First, the supply side includes 9 components: physicians’ incentive for pecuniary profit or meeting their target income, physicians’ current income, the physician/population ratio, service price (tariff), payment method, consultation time, type of employment of physicians, observable characteristics of the physician, and type and size of the hospital. Second, the demand side includes 3 components: patients’ observable characteristics, patients’ non-clinical characteristics, and insurance coverage.
Conclusions
A conceptual framework that can clearly describe interactions between the components that influence induced demand is a critical step in providing a scientific basis for understanding physicians’ behavior, particularly in the field of health economics.
Global investments in pandemic preparedness and COVID-19: development assistance and domestic spending on health between 1990 and 2026
Background
The COVID-19 pandemic highlighted gaps in health surveillance systems, disease prevention, and treatment globally. Among the many factors that might have led to these gaps is the issue of the financing of national health systems, especially in low-income and middle-income countries (LMICs), as well as a robust global system for pandemic preparedness. We aimed to provide a comparative assessment of global health spending at the onset of the pandemic; characterise the amount of development assistance for pandemic preparedness and response disbursed in the first 2 years of the COVID-19 pandemic; and examine expectations for future health spending and put into context the expected need for investment in pandemic preparedness.
Methods
In this analysis of global health spending between 1990 and 2021, and prediction from 2021 to 2026, we estimated four sources of health spending: development assistance for health (DAH), government spending, out-of-pocket spending, and prepaid private spending across 204 countries and territories. We used the Organisation for Economic Co-operation and Development (OECD)'s Creditor Reporting System (CRS) and the WHO Global Health Expenditure Database (GHED) to estimate spending. We estimated development assistance for general health, COVID-19 response, and pandemic preparedness and response using a keyword search. Health spending estimates were combined with estimates of resources needed for pandemic prevention and preparedness to analyse future health spending patterns, relative to need.
Findings
In 2019, at the onset of the COVID-19 pandemic, US$7·3 trillion (95% UI 7·2–7·4) was spent on health worldwide, 293·7 times the $43·1 billion in development assistance provided to maintain or improve health. The pandemic led to an unprecedented increase in development assistance targeted towards health; in 2020 and 2021, $37·8 billion was provided for the health-related COVID-19 response. Although the support for pandemic preparedness is 12·2% of the target recommended by the High-Level Independent Panel (HLIP), the support provided for the health-related COVID-19 response is 252·2% of the recommended target. Additionally, projected spending estimates suggest that between 2022 and 2026, governments in 17 (95% UI 11–21) of the 137 LMICs will observe an increase in national government health spending equivalent to an addition of 1% of GDP, as recommended by the HLIP.
Interpretation
There was an unprecedented scale-up in DAH in 2020 and 2021. We have a unique opportunity at this time to sustain funding for crucial global health functions, including pandemic preparedness. However, historical patterns of underfunding of pandemic preparedness suggest that deliberate effort must be made to ensure funding is maintained
Global, regional, and national burden of disorders affecting the nervous system, 1990–2021: a systematic analysis for the Global Burden of Disease Study 2021
Background
Disorders affecting the nervous system are diverse and include neurodevelopmental disorders, late-life neurodegeneration, and newly emergent conditions, such as cognitive impairment following COVID-19. Previous publications from the Global Burden of Disease, Injuries, and Risk Factor Study estimated the burden of 15 neurological conditions in 2015 and 2016, but these analyses did not include neurodevelopmental disorders, as defined by the International Classification of Diseases (ICD)-11, or a subset of cases of congenital, neonatal, and infectious conditions that cause neurological damage. Here, we estimate nervous system health loss caused by 37 unique conditions and their associated risk factors globally, regionally, and nationally from 1990 to 2021.
Methods
We estimated mortality, prevalence, years lived with disability (YLDs), years of life lost (YLLs), and disability-adjusted life-years (DALYs), with corresponding 95% uncertainty intervals (UIs), by age and sex in 204 countries and territories, from 1990 to 2021. We included morbidity and deaths due to neurological conditions, for which health loss is directly due to damage to the CNS or peripheral nervous system. We also isolated neurological health loss from conditions for which nervous system morbidity is a consequence, but not the primary feature, including a subset of congenital conditions (ie, chromosomal anomalies and congenital birth defects), neonatal conditions (ie, jaundice, preterm birth, and sepsis), infectious diseases (ie, COVID-19, cystic echinococcosis, malaria, syphilis, and Zika virus disease), and diabetic neuropathy. By conducting a sequela-level analysis of the health outcomes for these conditions, only cases where nervous system damage occurred were included, and YLDs were recalculated to isolate the non-fatal burden directly attributable to nervous system health loss. A comorbidity correction was used to calculate total prevalence of all conditions that affect the nervous system combined.
Findings
Globally, the 37 conditions affecting the nervous system were collectively ranked as the leading group cause of DALYs in 2021 (443 million, 95% UI 378–521), affecting 3·40 billion (3·20–3·62) individuals (43·1%, 40·5–45·9 of the global population); global DALY counts attributed to these conditions increased by 18·2% (8·7–26·7) between 1990 and 2021. Age-standardised rates of deaths per 100 000 people attributed to these conditions decreased from 1990 to 2021 by 33·6% (27·6–38·8), and age-standardised rates of DALYs attributed to these conditions decreased by 27·0% (21·5–32·4). Age-standardised prevalence was almost stable, with a change of 1·5% (0·7–2·4). The ten conditions with the highest age-standardised DALYs in 2021 were stroke, neonatal encephalopathy, migraine, Alzheimer's disease and other dementias, diabetic neuropathy, meningitis, epilepsy, neurological complications due to preterm birth, autism spectrum disorder, and nervous system cancer.
Interpretation
As the leading cause of overall disease burden in the world, with increasing global DALY counts, effective prevention, treatment, and rehabilitation strategies for disorders affecting the nervous system are needed.