Effective and Efficient Transfer Learning in the Era of Large Language Models
Substantial progress has been made in the field of natural language processing (NLP) due to the advent of large language models (LLMs)—deep neural networks with millions or billions of parameters pre-trained on large amounts of unlabeled data. However, these models have common weaknesses, including degenerate performance in data-scarce scenarios, and substantial computational resource requirements. This thesis aims to develop methods to address these limitations for improved applicability and performance of LLMs in resource-constrained settings with limited data and/or computational resources.
To address the need for labeled data in data-scarce scenarios, I present two methods, in Chapter 2 and Chapter 3, respectively. The first method leverages beneficial relationships between NLP tasks for transfer learning, while the second method combines data augmentation and self-training to boost few-shot learning performance—the ability to perform novel tasks from only a few labeled examples. Additionally, in Chapter 4, I introduce a novel parameter-efficient transfer learning approach that reuses a single frozen model for all tasks while learning only minimal task-specific parameters (soft/continuous prompts) to represent tasks and transfer knowledge. This method can match or outperform fine-tuning task-specific models (training the whole model on each task). In Chapter 5, I demonstrate the benefits of parameter-efficient transfer learning in a cross-lingual transfer setting. Finally, I conclude the thesis in Chapter 6 by outlining potential avenues for future research that aim to advance NLP through large-scale multi-task learning using multilingual and multimodal data.
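The core idea of soft-prompt tuning described above—keeping the pre-trained model frozen and learning only a small set of continuous prompt parameters per task—can be illustrated with a minimal sketch. The toy "frozen model" below (a fixed linear scorer over the concatenation of prompt and input embeddings) and all dimensions are hypothetical, chosen only to show that gradients flow into the prompt while the shared weights stay untouched:

```python
import numpy as np

# Frozen "pre-trained" model: a fixed linear scorer over the
# concatenation of prompt embeddings and input embeddings.
d, prompt_len, seq_len = 4, 2, 3
W = np.linspace(0.1, 2.0, d * (prompt_len + seq_len))  # frozen weights

def score(prompt, x):
    """Model output: dot product of frozen weights with [prompt; x]."""
    feats = np.concatenate([prompt.ravel(), x.ravel()])
    return float(W @ feats)

# Only the soft prompt is trainable (prompt_len * d parameters);
# the frozen model would be shared across all tasks.
prompt = np.zeros((prompt_len, d))
x, target = np.ones((seq_len, d)), 1.0

lr = 0.1
for _ in range(200):
    err = score(prompt, x) - target  # squared-error loss
    # Gradient of the loss w.r.t. the prompt only (W stays frozen).
    grad = 2 * err * W[: prompt_len * d].reshape(prompt_len, d)
    prompt -= lr * grad

print(round(score(prompt, x), 3))  # converges to the target
```

In a real LLM the frozen scorer is the full transformer and the prompt is prepended to the token embeddings, but the parameter accounting is the same: per task, only `prompt_len * d` values are stored.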
Curriculum semi-supervised segmentation
This study investigates a curriculum-style strategy for semi-supervised CNN
segmentation, which devises a regression network to learn image-level
information such as the size of a target region. These regressions are used to
effectively regularize the segmentation network, constraining softmax
predictions of the unlabeled images to match the inferred label distributions.
Our framework is based on inequality constraints that tolerate uncertainties
with inferred knowledge, e.g., regressed region size, and can be employed for a
large variety of region attributes. We evaluated our proposed strategy for left
ventricle segmentation in magnetic resonance images (MRI), and compared it to
standard proposal-based semi-supervision strategies. Our strategy leverages
unlabeled data more efficiently, and achieves very competitive results,
approaching the performance of full supervision.
Comment: Accepted as a paper at MICCAI 2019