9 research outputs found
PerPLM: Personalized Fine-tuning of Pretrained Language Models via Writer-specific Intermediate Learning and Prompts
The meanings of words and phrases depend not only on where they are used
(contexts) but also on who uses them (writers). Pretrained language models
(PLMs) are powerful tools for capturing context, but they are typically
pretrained and fine-tuned for universal use across different writers. This
study aims to improve the accuracy of text understanding tasks by personalizing
the fine-tuning of PLMs for specific writers. We focus on a general setting
where only the plain text from target writers is available for
personalization. To avoid the cost of fine-tuning and storing multiple copies
of PLMs for different users, we exhaustively explore using writer-specific
prompts to personalize a unified PLM. Since the design and evaluation of these
prompts is an underdeveloped area, we introduce and compare different types of
prompts that are possible in our setting. To maximize the potential of
prompt-based personalized fine-tuning, we propose a personalized intermediate
learning based on masked language modeling to extract task-independent traits
of writers' text. Our experiments, using multiple tasks, datasets, and PLMs,
reveal the nature of different prompts and the effectiveness of our
intermediate learning approach. Comment: 11 pages
Personalized Text Generation with Fine-Grained Linguistic Control
As the text generation capabilities of large language models become
increasingly prominent, recent studies have focused on controlling particular
aspects of the generated text to make it more personalized. However, most
research on controllable text generation focuses on controlling the content or
modeling specific high-level/coarse-grained attributes that reflect authors'
writing styles, such as formality, domain, or sentiment. In this paper, we
focus on controlling fine-grained attributes spanning multiple linguistic
dimensions, such as lexical and syntactic attributes. We introduce a novel
benchmark to train generative models and evaluate their ability to generate
personalized text based on multiple fine-grained linguistic attributes. We
systematically investigate the performance of various large language models on
our benchmark and draw insights from the factors that impact their performance.
We make our code, data, and pretrained models publicly available.
Causally Denoise Word Embeddings Using Half-Sibling Regression
Distributional representations of words, also known as word vectors, have
become crucial for modern natural language processing tasks due to their wide
applications. Recently, a growing body of word vector postprocessing algorithms
has emerged, aiming to render off-the-shelf word vectors even stronger. In line
with these investigations, we introduce a novel word vector postprocessing
scheme under a causal inference framework. Concretely, the postprocessing
pipeline is realized by Half-Sibling Regression (HSR), which allows us to
identify and remove confounding noise contained in word vectors. Compared to
previous work, our proposed method has the advantages of interpretability and
transparency due to its causal inference grounding. Evaluated on a battery of
standard lexical-level evaluation tasks and downstream sentiment analysis
tasks, our method reaches state-of-the-art performance. Comment: Accepted by AAAI 202
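
To make the regression step in the abstract above more concrete, here is a minimal sketch of Half-Sibling Regression as a generic denoising procedure: predict the observed targets from "half-sibling" variables that share the noise source but not the signal, then subtract that prediction. The function name `hsr_denoise`, the ridge penalty `alpha`, the choice of ridge regression as the predictor, and the synthetic data are all assumptions made for illustration; the abstract does not specify how the authors select predictors or arrange word vectors.

```python
import numpy as np


def hsr_denoise(targets, half_siblings, alpha=1.0):
    """Remove the part of `targets` that is predictable from `half_siblings`.

    targets:       array of shape (n_samples, n_targets), signal plus shared noise
    half_siblings: array of shape (n_samples, n_predictors), driven by the same
                   noise source but (by assumption) independent of the signal
    alpha:         ridge penalty (hypothetical default, not taken from the paper)
    """
    # Center both matrices so the regression needs no intercept term.
    X = half_siblings - half_siblings.mean(axis=0)
    Y = targets - targets.mean(axis=0)

    # Closed-form ridge regression: W = (X^T X + alpha * I)^{-1} X^T Y.
    gram = X.T @ X + alpha * np.eye(X.shape[1])
    W = np.linalg.solve(gram, X.T @ Y)

    # Subtract the predicted (noise-driven) component from the observations.
    return targets - X @ W


# Synthetic illustration: one hidden noise source affects both matrices.
rng = np.random.default_rng(0)
noise = rng.normal(size=(200, 1))
signal = rng.normal(size=(200, 3))
half_siblings = noise @ np.array([[1.0, -2.0]]) + 0.01 * rng.normal(size=(200, 2))
observed = signal + noise @ np.array([[3.0, -1.0, 2.0]])

denoised = hsr_denoise(observed, half_siblings)
err_before = np.linalg.norm(observed - signal)
err_after = np.linalg.norm(denoised - signal)
print(f"error before: {err_before:.2f}, after: {err_after:.2f}")
```

In this toy setup the residual should sit much closer to the clean signal than the raw observations do, because the half-siblings carry information about the noise only. How well this transfers to real word vectors depends on which vectors are treated as half-siblings, which the abstract leaves open.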
Dynamic Contextualized Word Embeddings
Static word embeddings that represent words by a single vector cannot capture the variability of word meaning in different linguistic and extralinguistic contexts. Building on prior work on contextualized and dynamic word embeddings, we introduce dynamic contextualized word embeddings that represent words as a function of both linguistic and extralinguistic context. Based on a pretrained language model (PLM), dynamic contextualized word embeddings model time and social space jointly, which makes them attractive for a range of NLP tasks involving semantic variability. We highlight potential application scenarios by means of qualitative and quantitative analyses on four English datasets.
Representación vectorial de relación de hiponimia e hiperonimia en español
Thanks to the Internet and the Web, almost unlimited information is now available, most of it in the form of text. Because most of these texts are freely accessible, there is growing interest in processing them automatically to extract relevant information. This research concerns the automatic detection of lexical relations between words, that is, relations between word meanings as recorded in the dictionary. In particular, it focuses on detecting hyponymy and hypernymy, relations in which the meaning of one word encompasses that of another, which can be viewed as a categorization of words. The proposed method manipulates a vector representation of words, known as Word Embeddings, to highlight words that stand in a hierarchical relation, working from unstructured text. Traditionally, Word Embeddings are used for analogy tasks, that is, to detect synonymy relations, so using these vectors to detect hierarchical relations (hypernymy and hyponymy) is considered somewhat more complex. Additional methods are therefore proposed so that, in conjunction with the Word Embeddings, efficient results can be obtained when detecting relations between pairs of words. Tesi
Leveraging Longitudinal Data for Personalized Prediction and Word Representations
This thesis focuses on personalization, word representations, and longitudinal dialog. We first look at users' expressions of individual preferences. In this targeted sentiment task, we find that we can improve entity extraction and sentiment classification using domain lexicons and linear term weighting. This task is important to personalization and dialog systems, as targets need to be identified in conversation and personal preferences affect how the system should react. Then we examine individuals with large amounts of personal conversational data in order to better predict what people will say. We consider extra-linguistic features that can be used to predict behavior and to predict the relationship between interlocutors. We show that these features improve over just using message content and that training on personal data leads to much better performance than training on a sample from all other users. We look not just at using personal data for these end-tasks, but also at constructing personalized word representations. When we have a lot of data for an individual, we create personalized word embeddings that improve performance on language modeling and authorship attribution. When we have limited data, but we have user demographics, we can instead construct demographic word embeddings. We show that these representations improve language modeling and word association performance. When we do not have demographic information, we show that using a small amount of data from an individual, we can calculate similarity to existing users and interpolate or leverage data from these users to improve language modeling performance. Using these types of personalized word representations, we are able to provide insight into what words vary more across users and demographics. The kind of personalized representations that we introduce in this work allows for applications such as predictive typing, style transfer, and dialog systems. 
Importantly, they also have the potential to enable more equitable language models, with improved performance for those demographic groups that have little representation in the data. PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/167971/1/cfwelch_1.pd