Effective and Efficient Transfer Learning in the Era of Large Language Models
Substantial progress has been made in the field of natural language processing (NLP) due to the advent of large language models (LLMs)—deep neural networks with millions or billions of parameters pre-trained on large amounts of unlabeled data. However, these models have common weaknesses, including degenerate performance in data-scarce scenarios, and substantial computational resource requirements. This thesis aims to develop methods to address these limitations for improved applicability and performance of LLMs in resource-constrained settings with limited data and/or computational resources.
To address the need for labeled data in data-scarce scenarios, I present two methods, in Chapter 2 and Chapter 3, respectively. The first method leverages beneficial relationships between NLP tasks for transfer learning, while the second method combines data augmentation and self-training to boost few-shot learning performance—the ability to perform novel tasks from only a few labeled examples. Additionally, in Chapter 4, I introduce a novel parameter-efficient transfer learning approach that reuses a single frozen model for all tasks while learning only minimal task-specific parameters (soft/continuous prompts) to represent tasks and transfer knowledge. This method can match or outperform fine-tuning task-specific models (training the whole model on each task). In Chapter 5, I demonstrate the benefits of parameter-efficient transfer learning in a cross-lingual transfer setting. Finally, I conclude the thesis in Chapter 6 by outlining potential avenues for future research that aim to advance NLP through large-scale multi-task learning using multilingual and multimodal data.
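The core idea of soft-prompt tuning described above—keeping the pre-trained model frozen and learning only a small set of continuous prompt parameters per task—can be illustrated with a minimal sketch. The toy "frozen model" below (a fixed linear scorer over the concatenation of prompt and input embeddings) and all dimensions are hypothetical, chosen only to show that gradients flow into the prompt while the shared weights stay untouched:

```python
import numpy as np

# Frozen "pre-trained" model: a fixed linear scorer over the
# concatenation of prompt embeddings and input embeddings.
d, prompt_len, seq_len = 4, 2, 3
W = np.linspace(0.1, 2.0, d * (prompt_len + seq_len))  # frozen weights

def score(prompt, x):
    """Model output: dot product of frozen weights with [prompt; x]."""
    feats = np.concatenate([prompt.ravel(), x.ravel()])
    return float(W @ feats)

# Only the soft prompt is trainable (prompt_len * d parameters);
# the frozen model would be shared across all tasks.
prompt = np.zeros((prompt_len, d))
x, target = np.ones((seq_len, d)), 1.0

lr = 0.1
for _ in range(200):
    err = score(prompt, x) - target  # squared-error loss
    # Gradient of the loss w.r.t. the prompt only (W stays frozen).
    grad = 2 * err * W[: prompt_len * d].reshape(prompt_len, d)
    prompt -= lr * grad

print(round(score(prompt, x), 3))  # converges to the target
```

In a real LLM the frozen scorer is the full transformer and the prompt is prepended to the token embeddings, but the parameter accounting is the same: per task, only `prompt_len * d` values are stored.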
Curriculum semi-supervised segmentation
This study investigates a curriculum-style strategy for semi-supervised CNN
segmentation, which devises a regression network to learn image-level
information such as the size of a target region. These regressions are used to
effectively regularize the segmentation network, constraining softmax
predictions of the unlabeled images to match the inferred label distributions.
Our framework is based on inequality constraints that tolerate uncertainties
with inferred knowledge, e.g., regressed region size, and can be employed for a
large variety of region attributes. We evaluated our proposed strategy for left
ventricle segmentation in magnetic resonance images (MRI), and compared it to
standard proposal-based semi-supervision strategies. Our strategy leverages
unlabeled data more efficiently, and achieves very competitive results,
approaching the performance of full supervision.
Comment: Accepted as a paper at MICCAI 2019