35 research outputs found
Supervised and unsupervised methods for learning representations of linguistic units
Word representations, also called word embeddings, are generic representations, often high-dimensional vectors. They map the discrete space of words into a continuous vector space, which allows us to handle rare or even unseen events, e.g. by considering the nearest neighbors. Many Natural Language Processing tasks can be improved by word representations if we extend the task-specific training data with the general knowledge incorporated in the word representations.
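The nearest-neighbor idea mentioned above can be illustrated with a minimal sketch. The three-dimensional vectors, the word list, and the function names below are invented for illustration and are not taken from the thesis; real embeddings are learned and typically have hundreds of dimensions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def nearest_neighbors(query, embeddings, k=2):
    """Return the k words whose vectors are most similar to the query vector."""
    scored = [(cosine(query, vec), word) for word, vec in embeddings.items()]
    scored.sort(reverse=True)
    return [word for _, word in scored[:k]]

# Hypothetical toy embeddings (assumption: hand-picked values, not learned).
toy_embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.1, 0.9, 0.3],
}

# A vector for a rare or unseen word can still be related to known words.
unseen_word_vector = [0.85, 0.15, 0.05]
neighbors = nearest_neighbors(unseen_word_vector, toy_embeddings, k=2)
print(neighbors)
```

In this toy setting the unseen vector lands next to the two animal words, which is the sense in which a continuous space lets a model generalize to events it has not observed.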
The first publication investigates a supervised, graph-based method to create word representations. This method leads to a graph-theoretic similarity measure, CoSimRank, with equivalent formalizations that show CoSimRank's close relationship to Personalized PageRank and SimRank. The new formalization is efficient because it can use the graph-based word representation to compute a single node similarity without having to compute the similarities of the entire graph. We also show how we can take advantage of fast matrix multiplication algorithms.
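A single-pair similarity of this kind can be sketched as follows. The sketch assumes the commonly cited form of CoSimRank, a decayed sum of inner products between the Personalized PageRank vectors of two nodes at each iteration; the decay value, the iteration count, the toy graph, and all function names are assumptions chosen for illustration rather than settings from the publication.

```python
def column_normalize(adj):
    """Turn an adjacency matrix into a column-stochastic transition matrix."""
    n = len(adj)
    col_sums = [sum(adj[r][c] for r in range(n)) for c in range(n)]
    return [
        [adj[r][c] / col_sums[c] if col_sums[c] else 0.0 for c in range(n)]
        for r in range(n)
    ]

def step(matrix, vec):
    """One random-walk step: matrix-vector product."""
    return [sum(row[c] * vec[c] for c in range(len(vec))) for row in matrix]

def cosimrank_pair(adj, i, j, decay=0.8, iterations=10):
    """Similarity of nodes i and j only; no all-pairs computation is needed."""
    transition = column_normalize(adj)
    n = len(adj)
    p_i = [1.0 if k == i else 0.0 for k in range(n)]
    p_j = [1.0 if k == j else 0.0 for k in range(n)]
    score = 0.0
    for k in range(iterations + 1):
        # Add the decayed overlap of the two walk distributions at step k.
        score += (decay ** k) * sum(x * y for x, y in zip(p_i, p_j))
        p_i = step(transition, p_i)
        p_j = step(transition, p_j)
    return score

# Hypothetical undirected toy graph: nodes 0, 1 and 3 are each linked to node 2.
adj = [
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
s01 = cosimrank_pair(adj, 0, 1)
s10 = cosimrank_pair(adj, 1, 0)
s00 = cosimrank_pair(adj, 0, 0)
print(s01, s00)
```

Only the two walk vectors for the queried nodes are propagated, which is why a single similarity does not require the similarities of the entire graph.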
In the second publication, we use existing unsupervised methods for word representation learning and combine these with semantic resources by learning representations for non-word objects like synsets and entities. We also investigate improved word representations which incorporate the semantic information from the resource. The method is flexible in that it can take any word representations as input and does not need an additional training corpus. A sparse tensor formalization guarantees efficiency and parallelizability.
In the third publication, we introduce a method that learns an orthogonal transformation of the word representation space. The transformation concentrates the information relevant for a task in an ultradense subspace whose dimensionality is smaller than that of the original space by a factor of 100. We use ultradense representations for a lexicon creation task in which words are annotated with three types of lexical information: sentiment, concreteness and frequency.
The final publication introduces a new calculus for the interpretable ultradense subspaces, including polarity, concreteness, frequency and part-of-speech (POS). The calculus supports operations like “−1 × hate = love” and “give me a neutral word for greasy” (i.e., oleaginous) and extends existing analogy computations like “king − man + woman = queen”.
Word representations, also known as word embeddings, are generic representations, usually high-dimensional vectors. They map the discrete space of words into a continuous vector space and allow us to handle rare or unseen events, for example by considering the nearest neighbors. Many problems in computational linguistics can be solved with word representations by extending task-specific training data with the general information contained in the word representations.
In the first publication, we investigate supervised, graph-based methods for creating word representations. These methods lead to a graph-based similarity measure, CoSimRank, for which two equivalent formulations exist that show its close relationship both to personalized PageRank and to SimRank. The new formulation can compute individual node similarities efficiently because graph-based word representations can be used.
In the second publication, we use existing word representations and combine them with semantic resources by learning representations for objects that are not words, such as synsets and entities. The flexibility of our method lies in the fact that we can use arbitrary word representations as input and need no additional training corpus.
In the third publication, we present a method that learns an orthogonal transformation of the vector space of the word representations. This transformation concentrates relevant information in an ultra-compact vector subspace. We use the ultra-compact representations to create lexicons with three different kinds of information: sentiment, concreteness and frequency.
The final publication presents a new calculus for the interpretable ultra-compact subspaces: sentiment, concreteness, frequency and part of speech. This calculus includes operations such as “−1 × Hass = Liebe” (hate, love) and “neutral word for Winkeladvokat” (i.e., Anwalt, the neutral German word for lawyer) and extends existing computations such as “Onkel − Mann + Frau = Tante” (uncle − man + woman = aunt).
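The existing analogy computation that the calculus extends, “king − man + woman = queen”, can be sketched with plain vector arithmetic. The two-dimensional vectors below are hand-made assumptions (one axis loosely standing for royalty, the other for gender) and the function names are hypothetical; this does not implement the ultradense calculus itself.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def solve_analogy(a, b, c, embeddings):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding a, b, c."""
    target = [x - y + z for x, y, z in zip(embeddings[a], embeddings[b], embeddings[c])]
    candidates = {w: v for w, v in embeddings.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine(target, candidates[w]))

# Hypothetical toy vectors (assumption: hand-picked, not learned).
toy = {
    "king": [1.0, 1.0],
    "queen": [1.0, -1.0],
    "man": [0.0, 1.0],
    "woman": [0.0, -1.0],
    "apple": [0.2, 0.1],
}
answer = solve_analogy("king", "man", "woman", toy)
print(answer)
```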
Learning to Attend, Copy, and Generate for Session-Based Query Suggestion
Users try to articulate their complex information needs during search
sessions by reformulating their queries. To make this process more effective,
search engines provide related queries to help users in specifying the
information need in their search process. In this paper, we propose a
customized sequence-to-sequence model for session-based query suggestion. In
our model, we employ a query-aware attention mechanism to capture the structure
of the session context. This enables us to control the scope of the session from
which we infer the suggested next query, which helps not only handle the noisy
data but also automatically detect session boundaries. Furthermore, we observe
that, based on the user query reformulation behavior, within a single session a
large portion of query terms is retained from the previously submitted queries
and consists of mostly infrequent or unseen terms that are usually not included
in the vocabulary. We therefore empower the decoder of our model to access the
source words from the session context during decoding by incorporating a copy
mechanism. Moreover, we propose evaluation metrics to assess the quality of the
generative models for query suggestion. We conduct an extensive set of
experiments and analysis. The results suggest that our model outperforms the
baselines in terms of both generating queries and scoring candidate queries
for the task of query suggestion.
Comment: Accepted to be published at The 26th ACM International Conference on
Information and Knowledge Management (CIKM2017)
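How a copy mechanism lets a decoder emit session terms that are missing from its vocabulary can be sketched with one common formulation, a pointer-generator style mixture of a vocabulary distribution and an attention distribution over source tokens. This is an assumption for illustration: the paper's exact mechanism may differ, and the probabilities, tokens, and function name below are made up.

```python
def mix_copy_and_generate(vocab_probs, attention, source_tokens, p_gen):
    """Blend generating from the vocabulary with copying from the session context.

    vocab_probs:   dict word -> probability from the decoder's softmax
    attention:     attention weights over source_tokens (sums to 1)
    source_tokens: tokens of the previous queries in the session
    p_gen:         probability of generating rather than copying
    """
    final = {word: p_gen * prob for word, prob in vocab_probs.items()}
    for token, weight in zip(source_tokens, attention):
        # Copy probability is added on top of any generation probability.
        final[token] = final.get(token, 0.0) + (1.0 - p_gen) * weight
    return final

# Hypothetical decoder state: "reykjavik" is not in the output vocabulary.
vocab_probs = {"cheap": 0.5, "flights": 0.3, "<unk>": 0.2}
source_tokens = ["flights", "to", "reykjavik"]
attention = [0.2, 0.1, 0.7]

final = mix_copy_and_generate(vocab_probs, attention, source_tokens, p_gen=0.6)
best = max(final, key=final.get)
print(best, final)
```

The out-of-vocabulary term receives probability mass only through the copy path, which matches the motivation that retained query terms are often infrequent or unseen.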
LM-CPPF: Paraphrasing-Guided Data Augmentation for Contrastive Prompt-Based Few-Shot Fine-Tuning
In recent years, there has been significant progress in developing
pre-trained language models for NLP. However, these models often struggle when
fine-tuned on small datasets. To address this issue, researchers have proposed
various adaptation approaches. Prompt-based tuning is arguably the most common
one, especially for larger models. Previous research shows that adding
contrastive learning to prompt-based fine-tuning is effective as it helps the
model generate embeddings that are more distinguishable between classes, and it
can also be more sample-efficient as the model learns from positive and
negative examples simultaneously. One of the most important components of
contrastive learning is data augmentation, but unlike in computer vision,
effective data augmentation for NLP remains challenging. This paper proposes
LM-CPPF, Contrastive Paraphrasing-guided Prompt-based Fine-tuning of Language
Models, which leverages prompt-based few-shot paraphrasing using generative
language models, especially large language models such as GPT-3 and OPT-175B,
for data augmentation. Our experiments on multiple text classification
benchmarks show that this augmentation method outperforms other methods, such
as easy data augmentation, back translation, and multiple templates.
Comment: 10 pages, 1 figure, 8 tables, 1 algorithm. Proceedings of the 61st
Annual Meeting of the Association for Computational Linguistics
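The claim that contrastive learning makes embeddings more distinguishable between classes can be illustrated with a minimal sketch of a supervised contrastive loss. The formulation below (each anchor is pulled toward same-class examples and pushed from all others, with a temperature-scaled softmax) is one standard variant and is an assumption here, as are the temperature, the toy embeddings, and the function names; in the paper's setting the positives would include paraphrases produced for augmentation.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Average, over anchors, of the negative log-probability of their positives."""
    z = [normalize(e) for e in embeddings]
    total, anchors = 0.0, 0
    for i in range(len(z)):
        positives = [p for p in range(len(z)) if p != i and labels[p] == labels[i]]
        if not positives:
            continue  # an anchor without a same-class partner contributes nothing
        denom = sum(
            math.exp(dot(z[i], z[a]) / temperature) for a in range(len(z)) if a != i
        )
        log_probs = [
            math.log(math.exp(dot(z[i], z[p]) / temperature) / denom)
            for p in positives
        ]
        total += -sum(log_probs) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0

# Hypothetical 2-d sentence embeddings forming two tight clusters.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]

loss_separated = supervised_contrastive_loss(embeddings, [0, 0, 1, 1])
loss_mixed = supervised_contrastive_loss(embeddings, [0, 1, 0, 1])
print(loss_separated, loss_mixed)
```

The loss is low when the class labels agree with the clusters and high when they cut across them, so minimizing it pushes same-class examples together.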
School Milk Consumption in Germany - What are Important Product Attributes for Children and Parents?
Food Consumption/Nutrition/Food Safety