824 research outputs found
Budget-Aware Adapters for Multi-Domain Learning
Multi-Domain Learning (MDL) refers to the problem of learning a set of models
derived from a common deep architecture, each one specialized to perform a task
in a certain domain (e.g., photos, sketches, paintings). This paper tackles MDL
with a particular interest in obtaining domain-specific models with an
adjustable budget in terms of the number of network parameters and
computational complexity. Our intuition is that, as in real applications the
number of domains and tasks can be very large, an effective MDL approach should
not only focus on accuracy but also on having as few parameters as possible. To
implement this idea we derive specialized deep models for each domain by
adapting a pre-trained architecture but, differently from other methods, we
propose a novel strategy to automatically adjust the computational complexity
of the network. To this aim, we introduce Budget-Aware Adapters that select the
most relevant feature channels to better handle data from a novel domain. Some
constraints on the number of active switches are imposed in order to obtain a
network respecting the desired complexity budget. Experimentally, we show that
our approach leads to recognition accuracy competitive with state-of-the-art
approaches but with much lighter networks both in terms of storage and
computation. Comment: ICCV 201
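The channel-selection idea in the abstract above can be illustrated with a minimal sketch. It assumes each feature channel has a scalar relevance score and that the budget is the maximum fraction of channels allowed to stay active. The function names `select_channels` and `apply_switches` and the top-k rule are hypothetical stand-ins for illustration; the abstract does not specify how the switches are learned or how the constraint is enforced.

```python
def select_channels(scores, budget):
    """Return one 0/1 switch per channel, keeping at most budget * len(scores) channels.

    Assumption: a plain top-k rule stands in for the learned switches and
    budget constraint that the abstract mentions but does not detail.
    """
    if not 0.0 <= budget <= 1.0:
        raise ValueError("budget must be a fraction in [0, 1]")
    keep = int(budget * len(scores))  # number of channels allowed to stay active
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    active = set(ranked[:keep])
    return [1 if i in active else 0 for i in range(len(scores))]


def apply_switches(features, switches):
    """Zero out the channels whose switch is off."""
    return [f * s for f, s in zip(features, switches)]


# Hypothetical relevance scores for four channels, with a 50% budget.
scores = [0.9, -0.2, 0.4, 0.1]
switches = select_channels(scores, budget=0.5)
masked = apply_switches([1.0, 2.0, 3.0, 4.0], switches)
```

With these toy scores, the two highest-scoring channels stay active and the other two are zeroed, so both storage and computation for the masked channels could in principle be skipped.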
Domain Adaptation for Novel Imaging Modalities with Application to Prostate MRI
The need for training data can impede the adoption of novel imaging modalities for deep learning-based medical image analysis. Domain adaptation can mitigate this problem by exploiting training samples from an existing, densely-annotated source domain within a novel, sparsely-annotated target domain, by bridging the differences between the two domains. In this thesis we present methods for adapting between diffusion-weighted (DW)-MRI data from multiparametric (mp)-MRI acquisitions and VERDICT (Vascular, Extracellular and Restricted Diffusion for Cytometry in Tumors) MRI, a richer DW-MRI technique involving an optimized acquisition protocol for cancer characterization. We also show that the proposed methods are general and their applicability extends beyond medical imaging.
First, we propose a semi-supervised domain adaptation method for prostate lesion segmentation on VERDICT MRI. Our approach relies on stochastic generative modelling to translate across two heterogeneous domains in pixel space and exploits the inherent uncertainty in the cross-domain mapping to generate multiple outputs conditioned on a single input. We further extend this approach to the unsupervised scenario where there is no labeled data for the target domain. We rely on stochastic generative modelling to translate across the two domains in pixel space and introduce two loss functions that promote semantic consistency.
Finally, we demonstrate that the proposed approaches extend beyond medical image analysis and focus on unsupervised domain adaptation for semantic segmentation of urban scenes. We show that relying on stochastic generative modelling allows us to train more accurate target networks and achieve state-of-the-art performance on two challenging semantic segmentation benchmarks.
Domain Specialization as the Key to Make Large Language Models Disruptive: A Comprehensive Survey
Large language models (LLMs) have significantly advanced the field of natural
language processing (NLP), providing a highly useful, task-agnostic foundation
for a wide range of applications. However, directly applying LLMs to solve
sophisticated problems in specific domains meets many hurdles, caused by the
heterogeneity of domain data, the sophistication of domain knowledge, the
uniqueness of domain objectives, and the diversity of the constraints (e.g.,
various social norms, cultural conformity, religious beliefs, and ethical
standards in the domain applications). Domain specification techniques are key
to making large language models disruptive in many applications. Specifically, to
solve these hurdles, there has been a notable increase in research and
practices conducted in recent years on the domain specialization of LLMs. This
emerging field of study, with its substantial potential for impact,
necessitates a comprehensive and systematic review to better summarize and
guide ongoing work in this area. In this article, we present a comprehensive
survey on domain specification techniques for large language models, an
emerging direction critical for large language model applications. First, we
propose a systematic taxonomy that categorizes the LLM domain-specialization
techniques based on the accessibility to LLMs and summarizes the framework for
all the subcategories as well as their relations and differences to each other.
Second, we present an extensive taxonomy of critical application domains that
can benefit dramatically from specialized LLMs, discussing their practical
significance and open challenges. Last, we offer our insights into the current
research status and future trends in this area.
How Will It Drape Like? Capturing Fabric Mechanics from Depth Images
We propose a method to estimate the mechanical parameters of fabrics using a
casual capture setup with a depth camera. Our approach enables the creation of
mechanically correct digital representations of real-world textile materials,
which is a fundamental step for many interactive design and engineering
applications. As opposed to existing capture methods, which typically require
expensive setups, video sequences, or manual intervention, our solution can
capture at scale, is agnostic to the optical appearance of the textile, and
facilitates fabric arrangement by non-expert operators. To this end, we propose
a sim-to-real strategy to train a learning-based framework that can take as
input one or multiple images and outputs a full set of mechanical parameters.
Thanks to carefully designed data augmentation and transfer learning protocols,
our solution generalizes to real images despite being trained only on synthetic
data, hence successfully closing the sim-to-real loop. Key in our work is to
demonstrate that evaluating the regression accuracy based on the similarity in
parameter space leads to inaccurate distances that do not match human
perception. To overcome this, we propose a novel metric for fabric drape
similarity that operates on the image domain instead of the parameter space,
allowing us to evaluate our estimation within the context of a similarity rank.
We show that our metric correlates with human judgments about the perception of
drape similarity, and that our model predictions produce perceptually accurate
results compared to the ground truth parameters. Comment: 12 pages, 12 figures. Accepted to EUROGRAPHICS 2023. Project website:
https://carlosrodriguezpardo.es/projects/MechFromDepth
Low-rank Adaptation of Large Language Model Rescoring for Parameter-Efficient Speech Recognition
We propose a neural language modeling system based on low-rank adaptation
(LoRA) for speech recognition output rescoring. Although pretrained language
models (LMs) like BERT have shown superior performance in second-pass
rescoring, the high computational cost of scaling up the pretraining stage and
adapting the pretrained models to specific domains limits their practical use in
rescoring. Here we present a method based on low-rank decomposition to train a
rescoring BERT model and adapt it to new domains using only a fraction (0.08%)
of the pretrained parameters. These inserted matrices are optimized through a
discriminative training objective along with a correlation-based regularization
loss. The proposed low-rank adaptation Rescore-BERT (LoRB) architecture is
evaluated on LibriSpeech and internal datasets with decreased training times by
factors between 5.4 and 3.6. Comment: Accepted to IEEE ASRU 2023. Internal Review Approved. Revised 2nd
version with Andreas and Huck. The first version is in Sep 29th. 8 page
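The low-rank decomposition mentioned in the abstract above can be sketched in a few lines. This is a minimal illustration of the general LoRA idea, assuming a frozen weight matrix W and two small trainable matrices A and B whose product is added to W. The helper names, the toy dimensions, and the zero initialization of B are assumptions for illustration; the abstract gives no implementation details, and the 0.08% figure it reports refers to the full rescoring model, not to these toy sizes.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_weight(w, a, b, scale=1.0):
    """Effective weight W + scale * (B @ A); W is frozen, only A and B would be trained."""
    delta = matmul(b, a)
    return [[x + scale * d for x, d in zip(rw, rd)] for rw, rd in zip(w, delta)]


def trainable_fraction(d_out, d_in, rank):
    """Ratio of adapter parameters, rank * (d_in + d_out), to the d_out * d_in frozen ones."""
    return rank * (d_in + d_out) / (d_out * d_in)


random.seed(0)
d_out, d_in, rank = 6, 8, 2  # toy sizes, chosen only for the example
w = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
a = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]
b = [[0.0] * rank for _ in range(d_out)]  # assumption: B starts at zero

w_eff = lora_weight(w, a, b)
fraction = trainable_fraction(d_out, d_in, rank)
```

Because B starts at zero in this sketch, the adapted weight initially equals the pretrained one, and only the rank * (d_in + d_out) adapter entries would need to be stored per domain. The fraction shrinks as the layer dimensions grow relative to the rank.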
Domain Generalization in Vision: A Survey
Generalization to out-of-distribution (OOD) data is a capability natural to
humans yet challenging for machines to reproduce. This is because most learning
algorithms strongly rely on the i.i.d. assumption on source/target data, which
is often violated in practice due to domain shift. Domain generalization (DG)
aims to achieve OOD generalization by using only source data for model
learning. Since first introduced in 2011, research in DG has made great
progress. In particular, intensive research on this topic has led to a broad
spectrum of methodologies, e.g., those based on domain alignment,
meta-learning, data augmentation, or ensemble learning, just to name a few; and
has covered various vision applications such as object recognition,
segmentation, action recognition, and person re-identification. In this paper,
for the first time a comprehensive literature review is provided to summarize
the developments in DG for computer vision over the past decade. Specifically,
we first cover the background by formally defining DG and relating it to other
research fields like domain adaptation and transfer learning. Second, we
conduct a thorough review into existing methods and present a categorization
based on their methodologies and motivations. Finally, we conclude this survey
with insights and discussions on future research directions. Comment: v4: includes the word "vision" in the title; improves the
organization and clarity in Section 2-3; adds future directions; and mor