Transfer Learning for Structured Pruning under Limited Task Data
Large, pre-trained models are problematic to use in resource-constrained
applications. Fortunately, task-aware structured pruning methods offer a
solution. These approaches reduce model size by dropping structural units like
layers and attention heads in a manner that takes into account the end-task.
However, these pruning algorithms require more task-specific data than is
typically available. We propose a framework which combines structured pruning
with transfer learning to reduce the need for task-specific data. Our empirical
results answer questions such as: How should the two tasks be coupled? What
parameters should be transferred? And, when during training should transfer
learning be introduced? Leveraging these insights, we demonstrate that our
framework results in pruned models with improved generalization over strong
baselines. Comment: 8 pages, 7 figures and 3 tables
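As a rough illustration of what dropping structural units such as attention heads can look like, the sketch below keeps only the highest-scoring heads and masks the rest. The function name `head_mask` and the per-head importance scores are assumptions made for this example; the abstract does not say how units are scored or selected, so this is not the authors' method.

```python
def head_mask(importance, keep):
    """Return a 0/1 mask that keeps the `keep` highest-scoring heads.

    `importance` is a hypothetical list of per-head scores; how such
    scores would be computed is not specified in the abstract.
    """
    ranked = sorted(range(len(importance)), key=lambda i: importance[i], reverse=True)
    kept = set(ranked[:keep])
    return [1 if i in kept else 0 for i in range(len(importance))]


# Usage: keep the two most important of four heads.
mask = head_mask([0.1, 0.9, 0.5, 0.3], keep=2)
```

A mask like this would typically be multiplied into each head's output, so that pruned heads contribute nothing and can later be removed from the model.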
Ghana’s Health Policy: Human Resources and Health Outcomes Inequality in Northern and Southern Ghana
Despite evidence of widening geographical inequalities in maternal and child health (MCH) coverage and outcomes between the Upper West region (UWR) in the north and the Ashanti region (AR) and Greater Accra region (GAR) in southern Ghana, the relative importance of the underlying social determinants remains unexplored. Policy to reduce MCH inequalities is therefore missing important checks on likely effectiveness. One possibility explored in this thesis, based on evidence from national MCH surveys and qualitative studies, is that differential access to skilled MCH Providers is an important explanation and a matter for policy attention. Using a convergent mixed-methods research design, this study assessed whether, in Ghana's context specifically, increased geographical access to life-course high-impact MCH interventions delivered by skilled MCH Providers at the primary health care level might contribute more significantly and more immediately to reducing maternal and neonatal mortality inequalities. Policies to improve, for example, education, income and occupation, which are seen as appropriate measures in other national contexts, would thus contribute less. Studies elsewhere support this thesis: maternal and neonatal mortalities responded best to increases in the availability of trained service providers. The study throws light on how informed investment in innovative, local-context human resources for health (HRH) policy interventions in MCH resource-poor and rural locations could reduce Ghana's geographical health inequalities. The findings suggest that neonatal and institutional maternal mortality inequalities narrow more in response to increased geographical accessibility, utilization and coverage of skilled MCH Provider services in UWR than in response to mothers' education, income and occupation. UWR's own recent interventions to attract and retain skilled MCH Providers, together with the national policy of decentralized integrated midwifery/nursing training, narrowed the perennial doctor and midwife density gaps between UWR and AR and GAR.
Thus, with evidence-based accelerated state investment in properly decentralized HRH functions and budget, infrastructure and social amenities in UWR (and sister unattractive regions), universal health coverage and sustainable MCH inequalities reduction appear attainable in Ghana.
DeMuX: Data-efficient Multilingual Learning
We consider the task of optimally fine-tuning pre-trained multilingual
models, given small amounts of unlabelled target data and an annotation budget.
In this paper, we introduce DeMuX, a framework that prescribes the exact
data-points to label from vast amounts of unlabelled multilingual data, having
unknown degrees of overlap with the target set. Unlike most prior works, our
end-to-end framework is language-agnostic, accounts for model representations,
and supports multilingual target configurations. Our active learning strategies
rely upon distance and uncertainty measures to select task-specific neighbors
that are most informative to label, given a model. DeMuX outperforms strong
baselines in 84% of the test cases, in the zero-shot setting of disjoint source
and target language sets (including multilingual target pools), across three
models and four tasks. Notably, in low-budget settings (5-100 examples), we
observe gains of up to 8-11 F1 points for token-level tasks, and 2-5 F1 for
complex tasks. Our code is released here:
https://github.com/simran-khanuja/demux
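To make the idea of combining distance and uncertainty measures concrete, here is a minimal sketch that ranks unlabelled candidates by predictive entropy minus their distance to the nearest target example. The scoring rule, the `alpha` trade-off and the `(embedding, predicted_probs)` input format are assumptions made for illustration; they are not taken from the paper or its released code.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def min_distance(x, targets):
    """Euclidean distance from x to its nearest target embedding."""
    return min(math.dist(x, t) for t in targets)


def select_to_label(candidates, targets, budget, alpha=1.0):
    """Return indices of the `budget` best candidates.

    candidates: list of (embedding, predicted_probs) pairs (hypothetical format).
    Score = entropy - alpha * distance, so uncertain points that lie
    close to the target set are ranked first.
    """
    scores = [
        entropy(probs) - alpha * min_distance(emb, targets)
        for emb, probs in candidates
    ]
    ranked = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]


# Usage: three candidates, one target point, budget of two labels.
pool = [([0.0, 0.0], [0.5, 0.5]), ([10.0, 10.0], [0.5, 0.5]), ([0.0, 0.0], [1.0, 0.0])]
chosen = select_to_label(pool, targets=[[0.0, 0.0]], budget=2)
```

In practice the embeddings and probabilities would come from the model being fine-tuned, which is what makes a selection of this kind depend on model representations.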
Weakly supervised classification in high energy physics
As machine learning algorithms become increasingly sophisticated to exploit subtle features of the data, they often become more dependent on simulations. This paper presents a new approach called weakly supervised classification in which class proportions are the only input into the machine learning algorithm. Using one of the most challenging binary classification tasks in high energy physics, quark versus gluon tagging, we show that weakly supervised classification can match the performance of fully supervised algorithms. Furthermore, by design, the new algorithm is insensitive to any mis-modeling of discriminating features in the data by the simulation. Weakly supervised classification is a general procedure that can be applied to a wide variety of learning problems to boost performance and robustness when detailed simulations are not reliable or not available.
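One simple way to train from class proportions alone is to penalize the gap between a batch's mean predicted signal probability and its known signal fraction. The sketch below assumes a logistic model and a squared-error penalty; both are illustrative choices, and the paper's actual loss and classifier may differ.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def proportion_loss(weights, bias, batch, signal_fraction):
    """Squared gap between mean predicted probability and the known fraction.

    No per-example labels are used: `signal_fraction` is the only
    supervision for the whole batch (hypothetical setup).
    """
    preds = [
        sigmoid(sum(w * x for w, x in zip(weights, feats)) + bias)
        for feats in batch
    ]
    mean_pred = sum(preds) / len(preds)
    return (mean_pred - signal_fraction) ** 2


# Usage: an untrained model predicts 0.5 everywhere, so a batch known
# to be 80% signal incurs a loss of (0.5 - 0.8) ** 2.
loss = proportion_loss([0.0], 0.0, [[1.0], [2.0]], signal_fraction=0.8)
```

Training on several batches with different known fractions is what lets a loss of this kind separate the classes, since a single batch fixes only the average output.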
AANG: Automating Auxiliary Learning
When faced with data-starved or highly complex end-tasks, it is commonplace
for machine learning practitioners to introduce auxiliary objectives as
supplementary learning signals. Whilst much work has been done to formulate
useful auxiliary objectives, their construction is still an art which proceeds
by slow and tedious hand-design. Intuitions about how and when these objectives
improve end-task performance have also had limited theoretical backing. In this
work, we present an approach for automatically generating a suite of auxiliary
objectives. We achieve this by deconstructing existing objectives within a
novel unified taxonomy, identifying connections between them, and generating
new ones based on the uncovered structure. Next, we theoretically formalize
widely-held intuitions about how auxiliary learning improves generalization of
the end-task. This leads us to a principled and efficient algorithm for
searching the space of generated objectives to find those most useful to a
specified end-task. With natural language processing (NLP) as our domain of
study, we empirically verify that our automated auxiliary learning pipeline
leads to strong improvements over competitive baselines across continued
training experiments on a pre-trained model on 5 NLP end-tasks. Comment: 18 pages, 9 tables and 4 figures