4,816 research outputs found
Controlling Linguistic Style Aspects in Neural Language Generation
Most work on neural natural language generation (NNLG) focuses on controlling
the content of the generated text. We experiment with controlling several
stylistic aspects of the generated text, in addition to its content. The method
is based on a conditioned RNN language model, where the desired content as well
as the stylistic parameters serve as conditioning contexts. We demonstrate the
approach on the movie-review domain and show that it is successful in
generating coherent sentences corresponding to the required linguistic style
and content.
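The conditioning idea above can be sketched as prepending control tokens for each stylistic parameter to the model's input. This is a minimal illustration, not the paper's implementation; the token format and the example parameters (sentiment, length) are hypothetical.

```python
# Sketch: condition a language model on style by prepending one control
# token per stylistic parameter, so the model sees them as context.

def build_conditioned_input(content_tokens, style):
    """Prepend control tokens (sorted by parameter name for determinism)
    to the content tokens; the model is trained on such sequences."""
    control_tokens = [f"<{name}={value}>" for name, value in sorted(style.items())]
    return control_tokens + content_tokens

tokens = build_conditioned_input(
    ["great", "movie"],
    {"sentiment": "positive", "length": "short"},
)
print(tokens)
# ['<length=short>', '<sentiment=positive>', 'great', 'movie']
```

At training time each review would be paired with its (possibly heuristically derived) style labels; at generation time the user sets the control tokens to the desired values.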
Evaluating Large Language Models on Controlled Generation Tasks
While recent studies have examined the abilities of large language models
on various benchmark tasks, including question generation, reading
comprehension, and multilingual tasks, few studies have looked into the
controllability of large language models on generation tasks. We present an
extensive analysis of various benchmarks, including a sentence-planning
benchmark with different granularities. Comparing large language models
against state-of-the-art finetuned smaller models, we present a spectrum
showing where large language models fall behind, are comparable to, or exceed
the ability of smaller models. We conclude that **large language models
struggle at meeting fine-grained hard constraints**.
Comment: EMNLP 202
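The "fine-grained hard constraints" the paper refers to are ones a program can verify exactly, such as keyword inclusion or an exact word count. A minimal checker of that kind might look as follows; the specific constraints are illustrative, not the benchmark's actual specification.

```python
# Illustrative verifier for hard generation constraints: required keywords
# and an exact word count. A generation either satisfies all constraints
# or fails; there is no partial credit.

def satisfies(text, required_words=(), exact_word_count=None):
    words = text.lower().split()
    if any(w.lower() not in words for w in required_words):
        return False
    if exact_word_count is not None and len(words) != exact_word_count:
        return False
    return True

print(satisfies("the quick brown fox", required_words=["fox"], exact_word_count=4))  # True
print(satisfies("the quick brown fox", exact_word_count=5))  # False
```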
Gamma Sampling: Fine-grained Controlling Language Models without Training
The dominant approaches for controlling language models succeed at
controlling high-level attributes (e.g., topic and sentiment). However, these
methods often require condition-specific data or are computationally expensive.
We propose a new simple guided decoding method, Gamma Sampling, which does not
require any training data to achieve fine-grained controllable text generation
while maintaining a fast generation speed. Gamma Sampling introduces
attribute-related information (provided by humans or language models
themselves) into the sampling process to guide language models to generate
texts with desired attributes. Since no training is involved, Gamma Sampling
can be easily applied to any language model for controllable text generation.
Through experiments, we show that Gamma Sampling-steered GPT2-small (117M)
outperforms baselines such as PPLM (345M) and CTRL (1.6B) in diversity,
attribute relevance, and overall quality of generated samples.
Comment: 20 pages, 5 figures
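The core mechanism, steering sampling with attribute-related information, can be sketched as reweighting next-token probabilities toward a set of attribute tokens before sampling. This is a simplified stand-in for illustration only; the paper's actual gamma reweighting formula differs.

```python
import math

# Sketch of attribute-steered decoding: add a bonus to the logits of
# attribute-related tokens, then renormalize with a softmax. No training
# is involved; the adjustment happens purely at sampling time.

def steered_probs(logits, attribute_ids, boost=2.0):
    """Return a probability distribution with extra mass on attribute tokens."""
    adjusted = [l + (boost if i in attribute_ids else 0.0)
                for i, l in enumerate(logits)]
    m = max(adjusted)                      # subtract max for numerical stability
    exps = [math.exp(a - m) for a in adjusted]
    total = sum(exps)
    return [e / total for e in exps]

# With uniform logits, boosting token 2 shifts probability mass toward it.
print(steered_probs([0.0, 0.0, 0.0], attribute_ids={2}, boost=2.0))
```

One would then sample the next token from the adjusted distribution at each decoding step, which is why the method applies to any language model without retraining.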
Personalized Recommendation Model: An Online Comment Sentiment Based Analysis
Traditional recommendation algorithms measure users’ online ratings of goods and services but ignore the information contained in written reviews, resulting in lowered personalized recommendation accuracy. Users’ reviews express opinions and reflect implicit preferences and emotions towards the features of products or services. This paper proposes a model for the fine-grained analysis of emotions expressed in users’ online written reviews, using film reviews on the Chinese social networking site Douban.com as an example. The model extracts feature-sentiment word pairs in user reviews according to four syntactic dependencies, examines film features, and scores the sentiment values of film features according to user preferences. User-group personalized recommendations are realized through user clustering and user similarity calculation. Experiments show that the extraction of user feature-sentiment word pairs based on four syntactic dependencies can better identify the implicit preferences of users, apply them to recommendations, and thereby increase recommendation accuracy.
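Extracting feature-sentiment word pairs from syntactic dependencies can be sketched as pattern-matching over (head, relation, dependent) triples from a parser. The relations used below (`amod`, `nsubj`) are common illustrative choices; the paper's four syntactic dependencies may differ.

```python
# Sketch: given dependency triples (head, relation, dependent), pair a
# product/film feature noun with the sentiment word that modifies it.

def extract_pairs(triples):
    pairs = []
    for head, rel, dep in triples:
        if rel == "amod":      # adjectival modifier: "stunning cinematography"
            pairs.append((head, dep))
        elif rel == "nsubj":   # predicate adjective: "the plot was boring"
            pairs.append((dep, head))
    return pairs

triples = [
    ("cinematography", "amod", "stunning"),
    ("boring", "nsubj", "plot"),
]
print(extract_pairs(triples))
# [('cinematography', 'stunning'), ('plot', 'boring')]
```

The extracted pairs would then be scored with a sentiment lexicon and aggregated per feature to build each user's preference profile.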
Toward Unified Controllable Text Generation via Regular Expression Instruction
Controllable text generation is a fundamental aspect of natural language
generation, with numerous methods proposed for different constraint types.
However, these approaches often require significant architectural or decoding
modifications, making them challenging to apply to additional constraints or
resolve different constraint combinations. To address this, our paper
introduces Regular Expression Instruction (REI), which utilizes an
instruction-based mechanism to fully exploit regular expressions' advantages to
uniformly model diverse constraints. Specifically, our REI supports all popular
fine-grained controllable generation constraints, i.e., lexical, positional,
and length, as well as their complex combinations, via regular expression-style
instructions. Our method only requires fine-tuning on medium-scale language
models or few-shot, in-context learning on large language models, and requires
no further adjustment when applied to various constraint combinations.
Experiments demonstrate that our straightforward approach yields high success
rates and adaptability to various constraints while maintaining competitiveness
in automatic metrics and outperforming most previous baselines.
Comment: Accepted at IJCNLP-AACL 202
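The constraint types the abstract names (lexical, positional, length) can all be verified with ordinary regular expressions, which is what makes a single regex-style instruction format attractive. The patterns below are illustrative, not REI's actual instruction syntax.

```python
import re

# Sketch: express lexical, positional, and length constraints as regular
# expressions and verify a generated string against them.

def check(text, pattern):
    return re.fullmatch(pattern, text) is not None

# Lexical: output must contain the word "galaxy".
print(check("a distant galaxy appeared", r".*\bgalaxy\b.*"))   # True
# Positional: output must start with "Once".
print(check("Once upon a time", r"Once\b.*"))                  # True
# Length: output must be exactly four whitespace-separated words.
print(check("one two three four", r"\S+(\s+\S+){3}"))          # True
```

Combinations (e.g., a keyword at a fixed position in a fixed-length output) compose naturally, since regular expressions are closed under concatenation, which is the uniformity the instruction format exploits.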
Open-world Story Generation with Structured Knowledge Enhancement: A Comprehensive Survey
Storytelling and narrative are fundamental to human experience, intertwined
with our social and cultural engagement. As such, researchers have long
attempted to create systems that can generate stories automatically. In recent
years, powered by deep learning and massive data resources, automatic story
generation has shown significant advances. However, considerable challenges,
like the need for global coherence in generated stories, still hamper
generative models from reaching the same storytelling ability as human
narrators. To tackle these challenges, many studies seek to inject structured
knowledge into the generation process, which is referred to as structured
knowledge-enhanced story generation. Incorporating external knowledge can
enhance the logical coherence among story events, achieve better knowledge
grounding, and alleviate over-generalization and repetition problems in
stories. This survey provides the latest and comprehensive review of this
research field: (i) we present a systematic taxonomy regarding how existing
methods integrate structured knowledge into story generation; (ii) we summarize
involved story corpora, structured knowledge datasets, and evaluation metrics;
(iii) we give multidimensional insights into the challenges of
knowledge-enhanced story generation and cast light on promising directions for
future study.