Automatic Discovery of Word Semantic Relations
In this paper, we propose an unsupervised methodology to automatically discover pairs of semantically related words by highlighting their local environment and evaluating their semantic similarity in local and global semantic spaces. This proposal differs from previous research as it tries to take the best of two different methodologies, i.e. semantic space models and information extraction models. It can be applied to extract close semantic relations, it limits the search space, and it is unsupervised.
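The similarity test at the core of such a method can be sketched with plain cosine similarity over context-count vectors. The vectors, words, and threshold below are toy illustrations, not the paper's actual semantic spaces:

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

# Toy co-occurrence vectors: each dimension counts a context word.
global_space = {
    "car":   [4.0, 0.0, 1.0],
    "truck": [3.0, 0.5, 1.0],
    "apple": [0.0, 5.0, 0.2],
}

def related(w1, w2, threshold=0.8):
    """Flag a candidate pair as semantically related if its similarity
    in the (toy) global space exceeds a threshold."""
    return cosine(global_space[w1], global_space[w2]) >= threshold
```

In the actual method, candidate pairs would first be restricted by their local environment before this global check, which is what keeps the search space small.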
Lightweight and Efficient Neural Natural Language Processing with Quaternion Networks
Many state-of-the-art neural models for NLP are heavily parameterized and thus memory inefficient. This paper proposes a series of lightweight and memory-efficient neural architectures for a potpourri of natural language processing (NLP) tasks. To this end, our models exploit computation using Quaternion algebra and hypercomplex spaces, enabling not only expressive inter-component interactions but also a significantly reduced parameter size due to fewer degrees of freedom in the Hamilton product. We propose Quaternion variants of models, giving rise to new architectures such as the Quaternion Attention Model and the Quaternion Transformer. Extensive experiments on a battery of NLP tasks demonstrate the utility of the proposed Quaternion-inspired models, enabling a substantial reduction in parameter size without significant loss in performance. Comment: ACL 201
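The parameter savings come from a property of the Hamilton product itself: a quaternion weight ties the four components of an input block together with only 4 free parameters, where an unconstrained real 4x4 block would need 16. A minimal sketch of the product (standard quaternion algebra, not the paper's code):

```python
def hamilton(q, p):
    """Hamilton product of two quaternions given as (r, i, j, k) tuples."""
    a1, b1, c1, d1 = q
    a2, b2, c2, d2 = p
    return (
        a1*a2 - b1*b2 - c1*c2 - d1*d2,  # real part
        a1*b2 + b1*a2 + c1*d2 - d1*c2,  # i component
        a1*c2 - b1*d2 + c1*a2 + d1*b2,  # j component
        a1*d2 + b1*c2 - c1*b2 + d1*a2,  # k component
    )
```

Because the same four weight components are reused (with sign flips) across all four output components, a quaternion-valued layer needs a quarter of the parameters of its real-valued counterpart for the same block size.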
Towards Robust Long-form Text Generation Systems
Text generation is an important emerging AI technology that has seen significant research advances in recent years. Due to its closeness to how humans communicate, mastering text generation technology can unlock several important applications such as intelligent chatbots, creative writing assistance, or newer applications like task-agnostic few-shot learning. Most recently, the rapid scaling of large language models (LLMs) has resulted in systems like ChatGPT, capable of generating fluent, coherent and human-like text. However, despite their remarkable capabilities, LLMs still suffer from several limitations, particularly when generating long-form text. In particular, (1) long-form generated text is filled with factual inconsistencies with world knowledge and the input prompt; (2) it is difficult to accurately evaluate the quality of long-form generated text; (3) it is difficult to identify whether a piece of long-form text was AI-generated, a task necessary to prevent widespread misinformation and plagiarism.
In this thesis I design algorithms aimed at making progress towards these three issues in current LLMs. I will first describe a retrieval-augmented system we built for long-form question answering, to improve factual correctness of long-form generated text. However, a careful empirical analysis reveals issues related to input/output consistency of generated text, and an inherent difficulty in evaluation. I will then describe our model RankGen, which uses large-scale contrastive learning on documents to significantly outperform competing long-form text generation methods to generate text more faithful to the input. Next, I will describe our efforts to improve human evaluation of long-form generation (issue #2) by proposing the LongEval guidelines. LongEval is a set of three simple empirically-motivated ideas to make human evaluation of long-form generation more consistent, less expensive, and cognitively easier for evaluators. Finally, I describe my work on AI-generated text detection (issue #3), and showcase the brittleness of existing methods to paraphrasing attacks I designed. I will describe a simple new AI-generated text detection algorithm using information retrieval, which is significantly more robust to paraphrasing attacks.
Finally, I conclude this thesis with some future research directions that I am excited about, including plan-based long-form text generation, and a deeper dive into understanding large language model training dynamics.
The Value of Everything: Ranking and Association with Encyclopedic Knowledge
This dissertation describes WikiRank, an unsupervised method of assigning relative values to elements of a broad-coverage encyclopedic information source in order to identify those entries that may be relevant to a given piece of text. The valuation given to an entry is based not on textual similarity but instead on the links that associate entries, and an estimation of the expected frequency of visitation that would be given to each entry based on those associations in context. This estimation of relative frequency of visitation is embodied in modifications to the random walk interpretation of the PageRank algorithm. WikiRank is an effective algorithm to support natural language processing applications. It is shown to exceed the performance of previous machine learning algorithms for the task of automatic topic identification, providing results comparable to those of human annotators. Second, WikiRank is found useful for the task of recognizing text-based paraphrases on a semantic level, by comparing the distribution of attention generated by two pieces of text using the encyclopedic resource as a common reference. Finally, WikiRank is shown to have the ability to use its base of encyclopedic knowledge to recognize terms from different ontologies as describing the same thing, thus allowing for the automatic generation of mapping links between ontologies. The conclusion of this thesis is that the "knowledge access heuristic" is valuable and that a ranking process based on a large encyclopedic resource can form the basis for an extendable general-purpose mechanism capable of identifying relevant concepts by association, which in turn can be effectively utilized for enumeration and comparison at a semantic level.
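The random-walk valuation underlying WikiRank builds on PageRank. A minimal power-iteration sketch over a toy link graph follows; the dissertation's modifications to the walk are not reproduced here:

```python
def pagerank(links, damping=0.85, iters=50):
    """Power-iteration PageRank. `links` maps node -> list of outgoing links."""
    nodes = list(links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        # Every node keeps a (1 - damping) teleport share.
        new = {v: (1.0 - damping) / n for v in nodes}
        for v, outs in links.items():
            if outs:
                share = damping * rank[v] / len(outs)
                for u in outs:
                    new[u] += share
            else:  # dangling node: spread its rank uniformly
                for u in nodes:
                    new[u] += damping * rank[v] / n
        rank = new
    return rank
```

Each iteration conserves total rank mass, so the scores remain a probability distribution over entries, matching the "expected frequency of visitation" reading above.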
Semantic Parsing in Limited Resource Conditions
This thesis explores challenges in semantic parsing, specifically focusing on
scenarios with limited data and computational resources. It offers solutions
using techniques like automatic data curation, knowledge transfer, active
learning, and continual learning.
For tasks with no parallel training data, the thesis proposes generating
synthetic training examples from structured database schemas. When there is
abundant data in a source domain but limited parallel data in a target domain,
knowledge from the source is leveraged to improve parsing in the target domain.
For multilingual situations with limited data in the target languages, the
thesis introduces a method to adapt parsers using a limited human translation
budget. Active learning is applied to select source-language samples for manual
translation, maximizing parser performance in the target language. In addition,
an alternative method is also proposed to utilize machine translation services,
supplemented by human-translated data, to train a more effective parser.
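The abstract does not spell out the selection criterion used under the translation budget; a common choice for such budgeted setups is uncertainty sampling. The sketch below assumes a hypothetical `pool` mapping each source-language example to the parser's score distribution over candidate parses:

```python
from math import log

def entropy(probs):
    """Shannon entropy of a predicted distribution."""
    return -sum(p * log(p) for p in probs if p > 0)

def select_for_translation(pool, budget):
    """Pick the `budget` source-language examples whose parser
    predictions are most uncertain (highest entropy)."""
    ranked = sorted(pool, key=lambda ex: entropy(pool[ex]), reverse=True)
    return ranked[:budget]
```

Examples on which the parser is nearly uniform over candidate parses are the ones whose human translations are expected to teach it the most per unit of budget.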
When computational resources are limited, a continual learning approach is
introduced to minimize training time and computational memory. This maintains
the parser's efficiency in previously learned tasks while adapting it to new
tasks, mitigating the problem of catastrophic forgetting.
Overall, the thesis provides a comprehensive set of methods to improve semantic parsing in resource-constrained conditions. Comment: PhD thesis, year of award 2023, 172 pages
From Language Comprehension Towards General AI
Language comprehension, or more formally natural language understanding, is one of the major undertakings in Artificial Intelligence. In this work, we explore a few of the problems in language understanding using fixed deep learning models. Specifically, first, we look into question generation. Asking questions relates to the cognitive ability of language comprehension and context understanding. For that reason, making progress in question generation is significant. We introduce a novel task called “question generation with masked target answer”, propose various models, and present baseline results for the task. Next, we extend the question generation task and develop a large-scale dataset for our task and for question generation in general. We then explore the problem of paraphrase identification, in which the task is to decide whether two sentences are paraphrases of each other. We present various machine learning models and discuss their performance. Moving on from the fixed architecture of deep learning models, we then explore the area of neuroevolution, where the models constantly change based on evolutionary operators and learn until an optimal architecture is found. This direction promises to create a more general form of intelligence. In particular, we formulate a recombination algorithm called Highest Varying k-Features Recombination (HVk-FR) and use it on top of various mutation operators to evolve the models. We show how our proposed algorithm can move toward an optimal network structure starting from a basic one-layer deep network.
Computational Understanding, Generation and Evaluation of Creative Expressions
Computational creativity has received a good amount of research interest in generating creative artefacts programmatically. At the same time, research has been conducted in computational aesthetics, which essentially tries to analyse creativity exhibited in art. This thesis aims to unite these two distinct lines of research in the context of natural language generation by building, from models for interpretation and generation, a cohesive whole that can assess its own generations.
I present a novel method for interpreting one of the most difficult rhetorical devices in the figurative use of language: metaphors. The method is purely data-driven and does not rely on hand-annotated data. It achieves state-of-the-art results, comparable to the interpretations given by humans. We show how a metaphor interpretation model can be used in generating metaphors and metaphorical expressions.
Furthermore, as a creative natural language generation task, we demonstrate assigning creative names to colours using an algorithmic approach that leverages a knowledge base of stereotypical associations for colours. Colour names produced by the approach were preferred by human judges over names given by humans 70% of the time.
A genetic algorithm-based method is elaborated for slogan generation. The use of a genetic algorithm makes it possible to model the generation of text while optimising multiple fitness functions, as part of the evolutionary process, to assess the aesthetic quality of the output. Our evaluation indicates that having multiple balanced aesthetics outperforms a single maximised aesthetic.
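One simple way to "balance" several aesthetics, as opposed to maximising a single one, is to score a candidate by its weakest aesthetic. The slogan aesthetics below (brevity and keyword coverage) are hypothetical stand-ins for the thesis's actual fitness functions:

```python
def balanced_fitness(candidate, aesthetics):
    """Combine several aesthetic scores (each in [0, 1]) into one fitness.
    Taking the minimum rewards candidates that satisfy all aesthetics
    at once, rather than maximising any single one."""
    return min(f(candidate) for f in aesthetics)

# Hypothetical aesthetics for a slogan.
def brevity(slogan):
    """Shorter slogans score higher; 10+ words score 0."""
    return max(0.0, 1.0 - len(slogan.split()) / 10.0)

def coverage(slogan, keywords=("fresh", "coffee")):
    """Fraction of required keywords present in the slogan."""
    words = slogan.lower().split()
    return sum(k in words for k in keywords) / len(keywords)
```

A genetic algorithm would then select and recombine candidates by this combined score, so the evolutionary pressure pushes every aesthetic up together.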
From an interplay of neural networks and the traditional AI approach of genetic algorithms, we present a symbiotic framework, called the master-apprentice framework. It makes it possible for the system to produce more diverse output, as the neural network can learn from both the genetic algorithm and real people.
The master-apprentice framework emphasises a strong theoretical foundation for the creative problem one seeks to solve. From this theoretical foundation, a reasoned evaluation method can be derived. This thesis presents two different evaluation practices based on two different theories of computational creativity. This research is conducted on two distinct practical tasks: pun generation in English and poetry generation in Finnish.
Computational creativity has been studied extensively from the perspective of pure generation, while research has also been conducted in the field of computational aesthetics. This dissertation unites these two schools, as the computationally creative systems developed here use aesthetics to aid generation; the systems thus interpret their own works at the same time as they produce them.
In this dissertation I address the automatic interpretation of metaphors, the generation of colour names, slogan generation, and the generation of Finnish-language poetry. As methods I use a traditional machine learning algorithm, namely the genetic algorithm, as well as neural networks. I call their combination the master-apprentice model, in which the genetic algorithm teaches the neural networks.
PRIOR: Prototype Representation Joint Learning from Medical Images and Reports
Contrastive learning based vision-language joint pre-training has emerged as
a successful representation learning strategy. In this paper, we present a
prototype representation learning framework incorporating both global and local
alignment between medical images and reports. In contrast to standard global
multi-modality alignment methods, we employ a local alignment module for
fine-grained representation. Furthermore, a cross-modality conditional
reconstruction module is designed to interchange information across modalities
in the training phase by reconstructing masked images and reports. For
reconstructing long reports, a sentence-wise prototype memory bank is
constructed, enabling the network to focus on low-level localized visual and
high-level clinical linguistic features. Additionally, a non-auto-regressive
generation paradigm is proposed for reconstructing non-sequential reports.
Experimental results on five downstream tasks, including supervised
classification, zero-shot classification, image-to-text retrieval, semantic
segmentation, and object detection, show that the proposed method outperforms other state-of-the-art methods across multiple datasets and under different dataset-size settings. The code is available at https://github.com/QtacierP/PRIOR. Comment: Accepted by ICCV 202
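The global alignment objective in contrastive vision-language pre-training of this kind is typically an InfoNCE-style loss over paired embeddings: each image should score highest against its own report, and vice versa. A self-contained sketch (not the authors' implementation):

```python
from math import exp, log, sqrt

def normalize(v):
    """Project a vector onto the unit sphere."""
    n = sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def info_nce(image_embs, text_embs, temperature=0.1):
    """Symmetric contrastive loss over a batch of paired image/report
    embeddings: diagonal pairs are positives, all others negatives."""
    imgs = [normalize(v) for v in image_embs]
    txts = [normalize(v) for v in text_embs]
    sims = [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts]
            for i in imgs]
    n = len(sims)
    loss = 0.0
    for k in range(n):
        row = [exp(s) for s in sims[k]]            # image k vs. all texts
        col = [exp(sims[j][k]) for j in range(n)]  # text k vs. all images
        loss += -log(row[k] / sum(row)) - log(col[k] / sum(col))
    return loss / (2 * n)
```

The local alignment and conditional reconstruction modules described above add finer-grained objectives on top of this global term.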