Using Neural Networks for Relation Extraction from Biomedical Literature
Using different sources of information to support the automated extraction of relations between biomedical concepts contributes to our understanding of biological systems. The most comprehensive source of these relations is the biomedical literature. Several relation extraction approaches have been proposed to identify relations between concepts in biomedical literature, namely approaches based on neural network algorithms. The use of multichannel architectures composed of multiple data representations, as in deep neural networks, is leading to state-of-the-art results. The right combination of data representations can eventually lead to even higher evaluation scores in relation extraction tasks. Here, biomedical ontologies play a fundamental role by providing semantic and ancestry information about an entity. The incorporation of biomedical ontologies has already been shown to enhance previous state-of-the-art results.
Comment: Artificial Neural Networks book (Springer) - Chapter 1
Selecting and Generating Computational Meaning Representations for Short Texts
Language conveys meaning, so natural language processing (NLP) requires representations of meaning. This work addresses two broad questions: (1) What meaning representation should we use? and (2) How can we transform text to our chosen meaning representation? In the first part, we explore different meaning representations (MRs) of short texts, ranging from surface forms to deep-learning-based models. We show the advantages and disadvantages of a variety of MRs for summarization, paraphrase detection, and clustering. In the second part, we use SQL as a running example for an in-depth look at how we can parse text into our chosen MR. We examine the text-to-SQL problem from three perspectives (methodology, systems, and applications) and show how each contributes to a fuller understanding of the task.
PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/143967/1/cfdollak_1.pd
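To make the text-to-SQL task concrete, the toy sketch below pairs a natural-language question with a gold and a predicted SQL query and compares them by executing both against a small SQLite database. The schema, queries, data, and equivalence check are illustrative assumptions, not material from the dissertation.

```python
# A minimal sketch of execution-based comparison for text-to-SQL:
# two queries are treated as equivalent if they return the same rows.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE papers (id INTEGER, year INTEGER, venue TEXT)")
conn.executemany("INSERT INTO papers VALUES (?, ?, ?)",
                 [(1, 2017, "ACL"), (2, 2018, "EMNLP"), (3, 2018, "ACL")])

question = "How many papers were published in 2018?"
gold_sql      = "SELECT COUNT(*) FROM papers WHERE year = 2018"
predicted_sql = "SELECT COUNT(id) FROM papers WHERE year = 2018"

def execute(sql):
    # Sort rows so equivalent result sets compare equal regardless of order.
    return sorted(conn.execute(sql).fetchall())

print(execute(predicted_sql) == execute(gold_sql))  # True
```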
Scalable syntactic inductive biases for neural language models
Natural language has a sequential surface form, although its underlying structure has been argued to be hierarchical and tree-structured in nature, whereby smaller linguistic units like words are recursively composed to form larger ones, such as phrases and sentences. This thesis aims to answer the following open research questions: To what extent---if at all---can more explicit notions of hierarchical syntactic structures further improve the performance of neural models within NLP, even within the context of successful models like BERT that learn from large amounts of data? And where exactly would stronger notions of syntactic structures be beneficial in different types of language understanding tasks?
To answer these questions, we explore two approaches for augmenting neural sequence models with an inductive bias that encourages a more explicit modelling of hierarchical syntactic structures. In the first approach, we use existing techniques that design tree-structured neural networks, where the ordering of the computational operations is determined by hierarchical syntax trees. We discover that this approach is indeed effective for designing better and more robust models at various challenging benchmarks of syntactic competence, although these benefits nevertheless come at the expense of scalability: In practice, such tree-structured models are much more challenging to scale to large datasets.
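The sketch below shows, in minimal form, what such tree-structured composition looks like: node vectors are built bottom-up along a binary syntax tree rather than left to right through the word sequence. The tree encoding, composition function, and dimensions are illustrative assumptions, not the models used in the thesis.

```python
# A minimal sketch of recursive (tree-structured) composition over a binary
# syntax tree. Leaves are word strings; internal nodes are (left, right) pairs.
import torch
import torch.nn as nn

class TreeComposer(nn.Module):
    def __init__(self, vocab, dim=50):
        super().__init__()
        self.vocab = {w: i for i, w in enumerate(vocab)}
        self.emb = nn.Embedding(len(vocab), dim)
        self.compose = nn.Sequential(nn.Linear(2 * dim, dim), nn.Tanh())

    def forward(self, tree):
        if isinstance(tree, str):                       # leaf: look up word vector
            return self.emb(torch.tensor(self.vocab[tree]))
        left, right = tree                              # internal node: compose children
        return self.compose(torch.cat([self(left), self(right)]))

# ((the cat) (sat (on (the mat)))), composed bottom-up along the tree.
model = TreeComposer(["the", "cat", "sat", "on", "mat"])
vec = model((("the", "cat"), ("sat", ("on", ("the", "mat")))))
print(vec.shape)  # torch.Size([50])
```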
Hence, in the second approach, we devise a novel knowledge distillation strategy for combining the best of both syntactic inductive biases and data scale. Our proposed approach is effective across different neural sequence modelling architectures and objective functions: By applying our approach on top of a left-to-right LSTM, we design a distilled syntax-aware (DSA) LSTM that achieves a new state of the art (as of mid-2019) and human-level performance at targeted syntactic evaluations. By applying our approach on top of a Transformer-based BERT masked language model that works well at scale, we outperform a strong BERT baseline on six structured prediction tasks---including those that are not explicitly syntactic in nature---in addition to the corpus of linguistic acceptability. Notably, our approach yields a new state of the art (as of mid-2020)---among models pre-trained on the original BERT dataset---on four structured prediction tasks: In-domain and out-of-domain phrase-structure parsing, dependency parsing, and semantic role labelling.
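A minimal sketch of a distillation objective of this kind appears below, assuming the student's per-position loss interpolates the usual language-modelling cross-entropy with a KL term toward a syntax-aware teacher's predictive distribution. The weighting and exact formulation are illustrative, not the thesis's.

```python
# A minimal sketch of a word-level distillation loss: standard next-token
# cross-entropy mixed with KL divergence toward a teacher distribution.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets, alpha=0.5):
    # Standard next-token cross-entropy against the observed words.
    lm_loss = F.cross_entropy(student_logits, targets)
    # KL divergence from the teacher's distribution to the student's.
    kl = F.kl_div(F.log_softmax(student_logits, dim=-1),
                  F.softmax(teacher_logits, dim=-1),
                  reduction="batchmean")
    return alpha * lm_loss + (1.0 - alpha) * kl

# Toy usage: a batch of 4 positions over a 10-word vocabulary.
student = torch.randn(4, 10)
teacher = torch.randn(4, 10)
targets = torch.randint(0, 10, (4,))
print(distillation_loss(student, teacher, targets).item())
```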
Altogether, our findings and methods in this work: (i) provide an example of how existing linguistic theories (particularly concerning the syntax of language), annotations, and resources can be used both as diagnostic evaluation tools, and also as a source of prior knowledge for crafting inductive biases that can improve the performance of computational models of language; (ii) showcase the continued relevance and benefits of more explicit syntactic inductive biases, even within the context of scalable neural models like BERT that can derive their knowledge from large amounts of data; (iii) contribute to a better understanding of where exactly syntactic biases are most helpful in different types of NLP tasks; and (iv) motivate the broader question of how we can design models that integrate stronger syntactic biases, yet remain easily scalable, as a promising (if relatively underexplored) direction of NLP research.
An Augmented Encoder to Generate and Evaluate Paraphrases in Punjabi Language
Paraphrase generation is an important task in Natural Language Processing (NLP) and is successfully applied in various applications such as question answering, information retrieval and extraction, text summarization, and augmentation of machine translation training data. A considerable amount of research has been carried out on paraphrase generation, but almost exclusively for English. No approach is available for paraphrase generation in Punjabi. Hence, this paper aims to fill this gap by developing a paraphrase generation and evaluation model for Punjabi. The proposed approach is divided into two phases: paraphrase generation and evaluation. To generate paraphrases, a state-of-the-art transformer with an improved encoder is used, since transformers can learn long-term dependencies. For evaluation, sentence embeddings are used to check whether the generated paraphrase is similar to the given sentence. The sentence embeddings have been created using two approaches: Seq2Seq with attention and transformers. The proposed model is compared with the currently available state-of-the-art models on the Quora Question Pairs dataset. For Punjabi, the proposed approach is evaluated on three datasets: news headlines, a sentential dataset from news articles, and a translation of the Quora Question Pairs dataset into Punjabi. The automatic evaluation metrics BLEU, METEOR, and ROUGE are used for in-depth evaluation, along with human judgments. The proposed approach is straightforward and can be successfully applied to augmenting machine translation training data and to sentence compression. The proposed approach establishes a new baseline for future work on paraphrase generation in Indian regional languages.
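The evaluation phase can be pictured with the minimal sketch below, which accepts a generated paraphrase when the cosine similarity of the two sentence embeddings exceeds a threshold. The toy mean-of-word-vectors encoder and the threshold value are illustrative assumptions, not the Seq2Seq or transformer encoders used in the paper.

```python
# A minimal sketch of embedding-based paraphrase acceptance: embed both
# sentences and compare them with cosine similarity against a threshold.
import numpy as np

rng = np.random.default_rng(0)
word_vectors = {}  # toy embedding table, filled on demand

def embed(sentence, dim=50):
    vecs = [word_vectors.setdefault(w, rng.standard_normal(dim))
            for w in sentence.lower().split()]
    return np.mean(vecs, axis=0)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

source     = "the match was postponed because of rain"
paraphrase = "rain caused the match to be postponed"
score = cosine(embed(source), embed(paraphrase))
print("accept" if score > 0.7 else "reject", round(score, 3))
```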