    Transfer Learning for Structured Pruning under Limited Task Data

    Large, pre-trained models are problematic to use in resource-constrained applications. Fortunately, task-aware structured pruning methods offer a solution. These approaches reduce model size by dropping structural units like layers and attention heads in a manner that takes the end-task into account. However, these pruning algorithms require more task-specific data than is typically available. We propose a framework that combines structured pruning with transfer learning to reduce the need for task-specific data. Our empirical results answer questions such as: How should the two tasks be coupled? What parameters should be transferred? And when during training should transfer learning be introduced? Leveraging these insights, we demonstrate that our framework results in pruned models with improved generalization over strong baselines. (Comment: 8 pages, 7 figures and 3 tables)
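
    The dropping of structural units described above can be made concrete with a small sketch. The snippet below prunes the least important attention heads of a Transformer encoder using the Hugging Face transformers API; the importance scores are placeholders, and this is only an illustration of structured head pruning, not the paper's transfer-learning framework.

```python
# Minimal sketch of structured pruning of attention heads (illustrative only).
# Assumes the Hugging Face `transformers` library; the importance scores are a
# stand-in heuristic, not the paper's task-aware criterion.
import torch
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

num_layers = model.config.num_hidden_layers
num_heads = model.config.num_attention_heads

# Hypothetical per-(layer, head) importance, e.g. accumulated from gradients
# or attention statistics on a small task-specific set.
importance = torch.rand(num_layers, num_heads)  # placeholder scores

# Drop the least important ~30% of heads in each layer.
k = int(0.3 * num_heads)
heads_to_prune = {
    layer: importance[layer].topk(k, largest=False).indices.tolist()
    for layer in range(num_layers)
}
model.prune_heads(heads_to_prune)  # structural units are removed, shrinking the model
```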

    Ghana’s Health Policy: Human Resources and Health Outcomes Inequality in Northern and Southern Ghana

    Despite evidence of widening geographical inequalities in maternal and child health (MCH) coverage and outcomes between the Upper West region (UWR) in the north and the Ashanti region (AR) and Greater Accra region (GAR) in southern Ghana, the relative importance of the underlying social determinants remains unexplored. Policy to reduce MCH inequalities therefore lacks important checks on its likely effectiveness. One possibility explored in this thesis, based on evidence from national MCH surveys and qualitative studies, is that differential access to skilled MCH Providers is an important explanation and a matter for policy attention. Using a convergent mixed-methods research design, this study assessed whether, in Ghana's context specifically, increased geographical access to life-course, high-impact MCH interventions delivered by skilled MCH Providers at the primary health care level might contribute more significantly and more immediately to reducing maternal and neonatal mortality inequalities, such that policies to improve, for example, education, income and occupation, seen as appropriate measures in other national contexts, contribute less. Studies elsewhere support this thesis: maternal and neonatal mortality responded best to increases in the availability of trained service providers. The study throws light on how informed investment in innovative, local-context HRH policy interventions in MCH resource-poor and rural locations could reduce Ghana's geographical health inequalities. The findings suggest that neonatal and institutional maternal mortality inequalities narrowed more in response to increased geographical accessibility, utilization and coverage of skilled MCH Provider services in the UWR than in response to mothers' education, income and occupation. The UWR's own recent interventions to attract and retain skilled MCH Providers, together with the national policy of decentralized, integrated midwifery/nursing training, narrowed the perennial doctor and midwife density gaps between the UWR and the AR and GAR. Thus, with evidence-based, accelerated state investment in properly decentralized HRH functions and budgets, infrastructure and social amenities in the UWR (and similarly under-resourced regions), universal health coverage and sustainable reduction of MCH inequalities appear attainable in Ghana.

    DeMuX: Data-efficient Multilingual Learning

    We consider the task of optimally fine-tuning pre-trained multilingual models, given small amounts of unlabelled target data and an annotation budget. In this paper, we introduce DeMuX, a framework that prescribes the exact data points to label from vast amounts of unlabelled multilingual data, which may have unknown degrees of overlap with the target set. Unlike most prior work, our end-to-end framework is language-agnostic, accounts for model representations, and supports multilingual target configurations. Our active learning strategies rely on distance and uncertainty measures to select the task-specific neighbors that are most informative to label, given a model. DeMuX outperforms strong baselines in 84% of test cases, in the zero-shot setting of disjoint source and target language sets (including multilingual target pools), across three models and four tasks. Notably, in low-budget settings (5-100 examples), we observe gains of up to 8-11 F1 points for token-level tasks, and 2-5 F1 points for complex tasks. Our code is released here: https://github.com/simran-khanuja/demux
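
    To make the distance- and uncertainty-based selection concrete, the sketch below ranks unlabelled candidates by their closeness to the target set and their predictive entropy, then returns a budget-sized batch to annotate. The scoring rule and all names are illustrative assumptions, not the exact DeMuX strategy.

```python
# Rough sketch of distance- and uncertainty-based data selection for labelling
# (illustrative; not the exact DeMuX procedure).
import numpy as np

def select_to_label(pool_emb, pool_probs, target_emb, budget):
    """pool_emb: (N, d) embeddings of unlabelled candidates
    pool_probs: (N, C) model class probabilities for each candidate
    target_emb: (M, d) embeddings of the small unlabelled target set
    budget: number of points to send for annotation"""
    # Distance term: closeness of each candidate to its nearest target example.
    dists = np.linalg.norm(pool_emb[:, None, :] - target_emb[None, :, :], axis=-1)
    nearest = dists.min(axis=1)
    # Uncertainty term: predictive entropy under the current model.
    entropy = -(pool_probs * np.log(pool_probs + 1e-12)).sum(axis=1)
    # Prefer candidates that are close to the target set and uncertain.
    score = entropy - nearest
    return np.argsort(-score)[:budget]
```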

    Weakly supervised classification in high energy physics

    As machine learning algorithms become increasingly sophisticated to exploit subtle features of the data, they often become more dependent on simulations. This paper presents a new approach called weakly supervised classification, in which class proportions are the only input into the machine learning algorithm. Using one of the most challenging binary classification tasks in high energy physics, quark versus gluon tagging, we show that weakly supervised classification can match the performance of fully supervised algorithms. Furthermore, by design, the new algorithm is insensitive to any mis-modeling of discriminating features in the data by the simulation. Weakly supervised classification is a general procedure that can be applied to a wide variety of learning problems to boost performance and robustness when detailed simulations are not reliable or not available.
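
    The idea of training from class proportions alone can be illustrated with a toy PyTorch loop: the loss compares the batch-level mean prediction on a mixed sample against its known signal fraction, with no per-event labels. The network, feature dimensions and fractions below are placeholders, not the authors' setup.

```python
# Toy sketch of weakly supervised (label-proportion) training; illustrative only.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(5, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

def proportion_loss(features, batch_fraction):
    """features: per-event inputs from one mixed sample whose signal (e.g. quark)
    fraction `batch_fraction` is known; individual event labels are not."""
    pred_fraction = model(features).mean()
    return (pred_fraction - batch_fraction) ** 2

# One toy update per mixed sample, each with a different known signal fraction.
for fraction in (0.3, 0.8):
    x = torch.randn(256, 5)  # stand-in for jet-level features
    loss = proportion_loss(x, torch.tensor(fraction))
    opt.zero_grad()
    loss.backward()
    opt.step()
```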

    AANG: Automating Auxiliary Learning

    When faced with data-starved or highly complex end-tasks, it is commonplace for machine learning practitioners to introduce auxiliary objectives as supplementary learning signals. Whilst much work has been done to formulate useful auxiliary objectives, their construction is still an art that proceeds by slow and tedious hand-design. Intuitions about how and when these objectives improve end-task performance have also had limited theoretical backing. In this work, we present an approach for automatically generating a suite of auxiliary objectives. We achieve this by deconstructing existing objectives within a novel unified taxonomy, identifying connections between them, and generating new ones based on the uncovered structure. Next, we theoretically formalize widely held intuitions about how auxiliary learning improves generalization on the end-task. This leads us to a principled and efficient algorithm for searching the space of generated objectives to find those most useful to a specified end-task. With natural language processing (NLP) as our domain of study, we empirically verify that our automated auxiliary learning pipeline leads to strong improvements over competitive baselines across continued training experiments on a pre-trained model on 5 NLP end-tasks. (Comment: 18 pages, 9 tables and 4 figures)
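
    As a rough illustration of generating a space of auxiliary objectives by recombining design choices along a few axes, the toy sketch below enumerates candidate objective specifications. The axes and options are assumptions chosen for illustration, not AANG's actual taxonomy or search algorithm.

```python
# Toy sketch: enumerate candidate auxiliary objectives from a small design space.
from itertools import product

data_sources = ["task_data", "in_domain_corpus"]
input_corruptions = ["token_masking", "span_masking", "no_corruption"]
output_targets = ["denoise_input", "next_token"]

candidate_objectives = [
    {"data": d, "corruption": c, "target": t}
    for d, c, t in product(data_sources, input_corruptions, output_targets)
]
print(len(candidate_objectives), "generated auxiliary objectives")

# A search procedure (e.g. weighting objectives by their estimated benefit to
# the end-task) would then pick the most useful subset for continued training.
```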