Gamma Sampling: Fine-grained Controlling Language Models without Training
The dominant approaches for controlling language models excel at
controlling high-level attributes (e.g. topic and sentiment). However, these
methods often require condition-specific data or are computationally expensive.
We propose a new simple guided decoding method, Gamma Sampling, which does not
require any training data to achieve fine-grained controllable text generation
while maintaining a fast generation speed. Gamma Sampling introduces
attribute-related information (provided by humans or language models
themselves) into the sampling process to guide language models to generate
texts with desired attributes. Since no training is involved, Gamma Sampling
can be easily applied to any language model for controllable text generation.
Through experiments, we show that Gamma Sampling-steered GPT2-small (117M)
outperforms baselines such as PPLM (345M) and CTRL (1.6B) in diversity,
attribute relevance, and overall quality of generated samples.
Comment: 20 pages, 5 figures
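The core idea of guided decoding described in this abstract can be illustrated with a minimal sketch: attribute-related information is injected into the sampling step by shifting probability mass toward tokens associated with the desired attribute. Note that this is a generic logit-biasing illustration, not the paper's exact Gamma Sampling reallocation rule; the function name, `boost` parameter, and toy vocabulary below are assumptions for illustration only.

```python
import math
import random

def attribute_guided_sample(logits, attribute_ids, boost=2.0, temperature=1.0):
    """Sample one token id after boosting attribute-related logits.

    Generic guided-decoding sketch: a constant is added to the logits of
    attribute-related tokens before the softmax, shifting probability
    mass toward the desired attribute. This is a simplification, not
    Gamma Sampling's exact probability-reallocation rule.
    """
    adjusted = [l / temperature + (boost if i in attribute_ids else 0.0)
                for i, l in enumerate(logits)]
    # Numerically stable softmax over the adjusted logits.
    m = max(adjusted)
    exps = [math.exp(a - m) for a in adjusted]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sample from the adjusted distribution.
    r = random.random()
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i, probs
    return len(probs) - 1, probs
```

Because the adjustment operates purely on the output distribution at decoding time, no gradient updates or condition-specific training data are involved, which matches the training-free property the abstract emphasises.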
Exploring the Efficacy of Pre-trained Checkpoints in Text-to-Music Generation Task
Benefiting from large-scale datasets and pre-trained models, the field of
generative models has recently gained significant momentum. However, most
datasets for symbolic music are very small, which potentially limits the
performance of data-driven multimodal models. An intuitive solution to this
problem is to leverage pre-trained models from other modalities (e.g., natural
language) to improve the performance of symbolic music-related multimodal
tasks. In this paper, we carry out the first study of generating complete and
semantically consistent symbolic music scores from text descriptions, and
explore the efficacy of using publicly available checkpoints (i.e., BERT,
GPT-2, and BART) for natural language processing in the task of text-to-music
generation. Our experimental results show that the improvement from using
pre-trained checkpoints is statistically significant in terms of BLEU score and
edit distance similarity. We analyse the capabilities and limitations of our
model to better understand the potential of language-music models.
Comment: 5 pages, 2 figures, 2 tables
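One of the evaluation metrics named above, edit-distance similarity, can be sketched as a normalised Levenshtein distance over token sequences. The paper's exact tokenisation and normalisation may differ; the helper names and the max-length normalisation below are assumptions for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between two token sequences, via the
    standard dynamic-programming recurrence with a rolling row."""
    m, n = len(a), len(b)
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(prev[j] + 1,        # deletion
                         cur[j - 1] + 1,     # insertion
                         prev[j - 1] + cost) # substitution / match
        prev = cur
    return prev[n]

def edit_similarity(a, b):
    """Map edit distance into a [0, 1] similarity (1 = identical),
    normalising by the longer sequence's length."""
    if not a and not b:
        return 1.0
    return 1.0 - edit_distance(a, b) / max(len(a), len(b))
```

For symbolic music, `a` and `b` would be the generated and reference event-token sequences, so higher similarity indicates a generation closer to the ground-truth score.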