Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Transfer learning, where a model is first pre-trained on a data-rich task
before being fine-tuned on a downstream task, has emerged as a powerful
technique in natural language processing (NLP). The effectiveness of transfer
learning has given rise to a diversity of approaches, methodology, and
practice. In this paper, we explore the landscape of transfer learning
techniques for NLP by introducing a unified framework that converts all
text-based language problems into a text-to-text format. Our systematic study
compares pre-training objectives, architectures, unlabeled data sets, transfer
approaches, and other factors on dozens of language understanding tasks. By
combining the insights from our exploration with scale and our new "Colossal
Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks
covering summarization, question answering, text classification, and more. To
facilitate future work on transfer learning for NLP, we release our data set,
pre-trained models, and code.

Comment: Final version as published in JMLR
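The text-to-text framing described in the abstract can be illustrated with a small sketch: every task becomes a mapping from an input string (with a task prefix) to an output string, so one sequence-to-sequence model handles all of them. The helper function and prefix set below are illustrative, not the paper's exact preprocessing code.

```python
# Sketch of the unified text-to-text format: each task is identified by a
# short prefix prepended to the input text; targets are plain strings too
# (e.g. "positive" for classification, a summary for summarization).

def to_text_to_text(task: str, text: str, context: str = "") -> str:
    """Cast a task instance into the text-to-text input format."""
    prefixes = {
        "summarize": "summarize: ",
        "translate_en_de": "translate English to German: ",
        "classify": "sst2 sentence: ",  # e.g. GLUE SST-2 sentiment
        "qa": "question: ",
    }
    prompt = prefixes[task] + text
    if task == "qa" and context:
        # Question answering packs the passage into the same input string.
        prompt += " context: " + context
    return prompt

print(to_text_to_text("summarize", "The quick brown fox jumped over the lazy dog."))
# -> summarize: The quick brown fox jumped over the lazy dog.
```

With this framing, the same model, loss, and decoding procedure apply to every task; only the input/output strings differ.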
Fine-tuning Multi-hop Question Answering with Hierarchical Graph Network
In this paper, we present a two-stage model for multi-hop question answering.
The first stage is a hierarchical graph network, which reasons over the
multi-hop question and can capture different levels of granularity by using
the natural structure of documents (i.e., paragraphs, questions, sentences,
and entities). The reasoning process is cast as a node classification task
(i.e., over paragraph nodes and sentence nodes). The second stage is a
language-model fine-tuning task. In short, stage one uses a graph neural
network to select and concatenate supporting sentences into one paragraph,
and stage two finds the answer span under the language-model fine-tuning
paradigm.

Comment: the experimental results are not as good as I expected
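The two-stage pipeline can be sketched in a toy form: stage one scores sentence nodes in a hierarchical graph and keeps the supporting ones, whose concatenation becomes the context that stage two's fine-tuned language model would extract an answer span from. The graph, sentences, and scores below are stand-ins, not the paper's model.

```python
# Toy sketch of the two-stage multi-hop QA pipeline. A real stage one would
# be a graph neural network predicting per-node relevance; here the node
# scores are hard-coded stand-ins.

# Hierarchical graph: paragraph nodes linked to sentence nodes.
edges = [("p0", "s0"), ("p0", "s1"), ("p1", "s2")]
sentences = {
    "s0": "Paris is the capital of France.",
    "s1": "It is known for the Eiffel Tower.",
    "s2": "Berlin is the capital of Germany.",
}
scores = {"s0": 0.9, "s1": 0.2, "s2": 0.7}  # stand-in GNN outputs

def select_support(scores, threshold=0.5):
    """Stage one: classify sentence nodes, keeping those above threshold."""
    return sorted(s for s, v in scores.items() if v >= threshold)

def build_context(kept, sentences):
    """Concatenate the supporting sentences into one paragraph."""
    return " ".join(sentences[s] for s in kept)

support = select_support(scores)
context = build_context(support, sentences)
print(context)  # this paragraph is what stage two's LM would read
```

Stage two is then a standard extractive-QA fine-tuning setup: the language model receives the question plus this filtered paragraph and predicts start/end positions of the answer span.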
Uni3D: Exploring Unified 3D Representation at Scale
Scaling up representations for images or text has been extensively
investigated in the past few years and has led to revolutions in learning
vision and language. However, scalable representation for 3D objects and scenes
is relatively unexplored. In this work, we present Uni3D, a 3D foundation model
to explore the unified 3D representation at scale. Uni3D uses a 2D initialized
ViT end-to-end pretrained to align the 3D point cloud features with the
image-text aligned features. Via the simple architecture and pretext task,
Uni3D can leverage abundant 2D pretrained models as initialization and
image-text aligned models as the target, unlocking the great potential of 2D
models and scaling-up strategies to the 3D world. We efficiently scale up Uni3D
to one billion parameters, and set new records on a broad range of 3D tasks,
such as zero-shot classification, few-shot classification, open-world
understanding and part segmentation. We show that the strong Uni3D
representation also enables applications such as 3D painting and retrieval in
the wild. We believe that Uni3D provides a new direction for exploring both
the scaling and efficiency of representations in the 3D domain.

Comment: Code and Demo: https://github.com/baaivision/Uni3
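The alignment idea behind the abstract can be sketched schematically: a 3D encoder maps point clouds into the same embedding space as an image-text model, after which zero-shot classification reduces to cosine similarity against text embeddings of class names. The encoders and features below are random stand-ins, not Uni3D itself.

```python
# Schematic sketch of CLIP-style 3D/text alignment for zero-shot
# classification. Real systems would produce point_cloud_feat with a 3D
# encoder and text_feats with a pretrained text encoder; here both are
# random placeholders.
import numpy as np

rng = np.random.default_rng(0)

def normalize(x):
    """L2-normalize embeddings so dot products equal cosine similarity."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

# Stand-in embeddings: one point-cloud feature and text features for three
# hypothetical classes ("chair", "table", "lamp").
point_cloud_feat = normalize(rng.normal(size=(1, 512)))
text_feats = normalize(rng.normal(size=(3, 512)))

# Zero-shot prediction: pick the class whose text embedding is closest.
logits = point_cloud_feat @ text_feats.T
pred = int(np.argmax(logits))
print("predicted class index:", pred)
```

Because the 3D features live in a space already aligned with language, no 3D-labeled training data is needed at inference time; new classes are handled by embedding new class-name prompts.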