845 research outputs found

    Deep neural networks for identification of sentential relations

    Get PDF
    Natural language processing (NLP) is one of the most important technologies in the information age. Understanding complex language utterances is also a crucial part of artificial intelligence. Applications of NLP are everywhere because people communicate mostly in language: web search, advertisement, emails, customer service, language translation, etc. There are a large variety of underlying tasks and machine learning models powering NLP applications. Recently, deep learning approaches have obtained exciting performance across a broad array of NLP tasks. These models can often be trained in an end-to-end paradigm without traditional, task-specific feature engineering. This dissertation focuses on a specific NLP task --- sentential relation identification. Successfully identifying the relations of two sentences can contribute greatly to some downstream NLP problems. For example, in open-domain question answering, if the system can recognize that a new question is a paraphrase of a previously observed question, the known answers can be returned directly, avoiding redundant reasoning. For another, it is also helpful to discover some latent knowledge, such as inferring ``the weather is good today'' from another description ``it is sunny today''. This dissertation presents some deep neural networks (DNNs) which are developed to handle this sentential relation identification problem. More specifically, this problem is addressed by this dissertation in the following three aspects. (i) Sentential relation representation is built on the matching between phrases of arbitrary lengths. Stacked Convolutional Neural Networks (CNNs) are employed to model the sentences, so that each filter can cover a local phrase, and filters in lower level span shorter phrases and filters in higher level span longer phrases. CNNs in stack enable to model sentence phrases in different granularity and different abstraction. (ii) Phrase matches contribute differently to the tasks. This motivates us to propose an attention mechanism in CNNs for these tasks, differing from the popular research of attention mechanisms in Recurrent Neural Networks (RNNs). Attention mechanisms are implemented in both convolution layer as well as pooling layer in deep CNNs, in order to figure out automatically which phrase of one sentence matches a specific phrase of the other sentence. These matches are supposed to be indicative to the final decision. Another contribution in terms of attention mechanism is inspired by the observation that some sentential relation identification task, like answer selection for multi-choice question answering, is mainly determined by phrase alignments of stronger degree; in contrast, some tasks such as textual entailment benefit more from the phrase alignments of weaker degree. This motivates us to propose a dynamic ``attentive pooling'' to select phrase alignments of different intensities for different task categories. (iii) In certain scenarios, sentential relation can only be successfully identified within specific background knowledge, such as the multi-choice question answering based on passage comprehension. In this case, the relation between two sentences (question and answer candidate) depends on not only the semantics in the two sentences, but also the information encoded in the given passage. Overall, the work in this dissertation models sentential relations in hierarchical DNNs, different attentions and different background knowledge. All systems got state-of-the-art performances in representative tasks.Die Verarbeitung natürlicher Sprachen (engl.: natural language processing - NLP) ist eine der wichtigsten Technologien des Informationszeitalters. Weiterhin ist das Verstehen komplexer sprachlicher Ausdrücke ein essentieller Teil künstlicher Intelligenz. Anwendungen von NLP sind überall zu finden, da Menschen haupt\-säch\-lich über Sprache kommunizieren: Internetsuchen, Werbung, E-Mails, Kundenservice, Übersetzungen, etc. Es gibt eine große Anzahl Tasks und Modelle des maschinellen Lernens für NLP-Anwendungen. In den letzten Jahren haben Deep-Learning-Ansätze vielversprechende Ergebnisse für eine große Anzahl verschiedener NLP-Tasks erzielt. Diese Modelle können oft end-to-end trainiert werden, kommen also ohne auf den Task zugeschnittene Feature aus. Diese Dissertation hat einen speziellen NLP-Task als Fokus: Sententielle Relationsidentifizierung. Die Beziehung zwischen zwei Sätzen erfolgreich zu erkennen, kann die Performanz für nachfolgende NLP-Probleme stark verbessern. Für open-domain question answering, zum Beispiel, kann ein System, das erkennt, dass eine neue Frage eine Paraphrase einer bereits gesehenen Frage ist, die be\-kann\-te Antwort direkt zurückgeben und damit mehrfaches Schlussfolgern vermeiden. Zudem ist es auch hilfreich, zu Grunde liegendes Wissen zu entdecken, so wie das Schließen der Tatsache "das Wetter ist gut" aus der Beschreibung "es ist heute sonnig". Diese Dissertation stellt einige tiefe neuronale Netze (eng.: deep neural networks - DNNs) vor, die speziell für das Problem der sententiellen Re\-la\-tions\-i\-den\-ti\-fi\-zie\-rung entwickelt wurden. Im Speziellen wird dieses Problem in dieser Dissertation unter den folgenden drei Aspekten behandelt: (i) Sententielle Relationsrepr\"{a}sentationen basieren auf einem Matching zwischen Phrasen beliebiger Länge. Tiefe convolutional neural networks (CNNs) werden verwendet, um diese Sätze zu modellieren, sodass jeder Filter eine lokale Phrase abdecken kann, wobei Filter in niedrigeren Schichten kürzere und Filter in höheren Schichten längere Phrasen umfassen. Tiefe CNNs machen es möglich, Sätze in unterschiedlichen Granularitäten und Abstraktionsleveln zu modellieren. (ii) Matches zwischen Phrasen tragen unterschiedlich zu unterschiedlichen Tasks bei. Das motiviert uns, einen Attention-Mechanismus für CNNs für diese Tasks einzuführen, der sich von dem bekannten Attention-Mechanismus für recurrent neural networks (RNNs) unterscheidet. Wir implementieren Attention-Mechanismen sowohl im convolution layer als auch im pooling layer tiefer CNNs, um herauszufinden, welche Phrasen eines Satzes bestimmten Phrasen eines anderen Satzes entsprechen. Wir erwarten, dass solche Matches die finale Entscheidung stark beeinflussen. Ein anderer Beitrag zu Attention-Mechanismen wurde von der Beobachtung inspiriert, dass einige sententielle Relationsidentifizierungstasks, zum Beispiel die Auswahl einer Antwort für multi-choice question answering hauptsächlich von Phrasen\-a\-lignie\-rungen stärkeren Grades bestimmt werden. Im Gegensatz dazu profitieren andere Tasks wie textuelles Schließen mehr von Phrasenalignierungen schwächeren Grades. Das motiviert uns, ein dynamisches "attentive pooling" zu entwickeln, um Phrasenalignierungen verschiedener Stärken für verschiedene Taskkategorien auszuwählen. (iii) In bestimmten Szenarien können sententielle Relationen nur mit entsprechendem Hintergrundwissen erfolgreich identifiziert werden, so wie multi-choice question answering auf der Grundlage des Verständnisses eines Absatzes. In diesem Fall hängt die Relation zwischen zwei Sätzen (der Frage und der möglichen Antwort) nicht nur von der Semantik der beiden Sätze, sondern auch von der in dem gegebenen Absatz enthaltenen Information ab. Insgesamt modellieren die in dieser Dissertation enthaltenen Arbeiten sententielle Relationen in hierarchischen DNNs, mit verschiedenen Attention-Me\-cha\-nis\-men und wenn unterschiedliches Hintergrundwissen zur Verf\ {u}gung steht. Alle Systeme erzielen state-of-the-art Ergebnisse für die entsprechenden Tasks

    Neural Representations of Concepts and Texts for Biomedical Information Retrieval

    Get PDF
    Information retrieval (IR) methods are an indispensable tool in the current landscape of exponentially increasing textual data, especially on the Web. A typical IR task involves fetching and ranking a set of documents (from a large corpus) in terms of relevance to a user\u27s query, which is often expressed as a short phrase. IR methods are the backbone of modern search engines where additional system-level aspects including fault tolerance, scale, user interfaces, and session maintenance are also addressed. In addition to fetching documents, modern search systems may also identify snippets within the documents that are potentially most relevant to the input query. Furthermore, current systems may also maintain preprocessed structured knowledge derived from textual data as so called knowledge graphs, so certain types of queries that are posed as questions can be parsed as such; a response can be an output of one or more named entities instead of a ranked list of documents (e.g., what diseases are associated with EGFR mutations? ). This refined setup is often termed as question answering (QA) in the IR and natural language processing (NLP) communities. In biomedicine and healthcare, specialized corpora are often at play including research articles by scientists, clinical notes generated by healthcare professionals, consumer forums for specific conditions (e.g., cancer survivors network), and clinical trial protocols (e.g., www.clinicaltrials.gov). Biomedical IR is specialized given the types of queries and the variations in the texts are different from that of general Web documents. For example, scientific articles are more formal with longer sentences but clinical notes tend to have less grammatical conformity and are rife with abbreviations. There is also a mismatch between the vocabulary of consumers and the lingo of domain experts and professionals. Queries are also different and can range from simple phrases (e.g., COVID-19 symptoms ) to more complex implicitly fielded queries (e.g., chemotherapy regimens for stage IV lung cancer patients with ALK mutations ). Hence, developing methods for different configurations (corpus, query type, user type) needs more deliberate attention in biomedical IR. Representations of documents and queries are at the core of IR methods and retrieval methodology involves coming up with these representations and matching queries with documents based on them. Traditional IR systems follow the approach of keyword based indexing of documents (the so called inverted index) and matching query phrases against the document index. It is not difficult to see that this keyword based matching ignores the semantics of texts (synonymy at the lexeme level and entailment at phrase/clause/sentence levels) and this has lead to dimensionality reduction methods such as latent semantic indexing that generally have scale-related concerns; such methods also do not address similarity at the sentence level. Since the resurgence of neural network methods in NLP, the IR field has also moved to incorporate advances in neural networks into current IR methods. This dissertation presents four specific methodological efforts toward improving biomedical IR. Neural methods always begin with dense embeddings for words and concepts to overcome the limitations of one-hot encoding in traditional NLP/IR. In the first effort, we present a new neural pre-training approach to jointly learn word and concept embeddings for downstream use in applications. In the second study, we present a joint neural model for two essential subtasks of information extraction (IE): named entity recognition (NER) and entity normalization (EN). Our method detects biomedical concept phrases in texts and links them to the corresponding semantic types and entity codes. These first two studies provide essential tools to model textual representations as compositions of both surface forms (lexical units) and high level concepts with potential downstream use in QA. In the third effort, we present a document reranking model that can help surface documents that are likely to contain answers (e.g, factoids, lists) to a question in a QA task. The model is essentially a sentence matching neural network that learns the relevance of a candidate answer sentence to the given question parametrized with a bilinear map. In the fourth effort, we present another document reranking approach that is tailored for precision medicine use-cases. It combines neural query-document matching and faceted text summarization. The main distinction of this effort from previous efforts is to pivot from a query manipulation setup to transforming candidate documents into pseudo-queries via neural text summarization. Overall, our contributions constitute nontrivial advances in biomedical IR using neural representations of concepts and texts

    Exploiting semantic annotations for open information extraction: an experience in the biomedical domain

    Get PDF
    The increasing amount of unstructured text published on the Web is demanding new tools and methods to automatically process and extract relevant information. Traditional information extraction has focused on harvesting domain-specific, pre-specified relations, which usually requires manual labor and heavy machinery; especially in the biomedical domain, the main efforts have been directed toward the recognition of well-defined entities such as genes or proteins, which constitutes the basis for extracting the relationships between the recognized entities. The intrinsic features and scale of the Web demand new approaches able to cope with the diversity of documents, where the number of relations is unbounded and not known in advance. This paper presents a scalable method for the extraction of domain-independent relations from text that exploits the knowledge in the semantic annotations. The method is not geared to any specific domain (e.g., protein–protein interactions and drug–drug interactions) and does not require any manual input or deep processing. Moreover, the method uses the extracted relations to compute groups of abstract semantic relations characterized by their signature types and synonymous relation strings. This constitutes a valuable source of knowledge when constructing formal knowledge bases, as we enable seamless integration of the extracted relations with the available knowledge resources through the process of semantic annotation. The proposed approach has successfully been applied to a large text collection in the biomedical domain and the results are very encouraging.The work was supported by the CICYT project TIN2011-24147 from the Spanish Ministry of Economy and Competitiveness (MINECO)

    Context-Independent Task Knowledge for Neurosymbolic Reasoning in Cognitive Robotics

    Get PDF
    One of the current main goals of artificial intelligence and robotics research is the creation of an artificial assistant which can have flexible, human like behavior, in order to accomplish everyday tasks. A lot of what is context-independent task knowledge to the human is what enables this flexibility at multiple levels of cognition. In this scope the author analyzes how to acquire, represent and disambiguate symbolic knowledge representing context-independent task knowledge, abstracted from multiple instances: this thesis elaborates the incurred problems, implementation constraints, current state-of-the-art practices and ultimately the solutions newly introduced in this scope. The author specifically discusses acquisition of context-independent task knowledge from large amounts of human-written texts and their reusability in the robotics domain; the acquisition of knowledge on human musculoskeletal dependencies constraining motion which allows a better higher level representation of observed trajectories; the means of verbalization of partial contextual and instruction knowledge, increasing interaction possibilities with the human as well as contextual adaptation. All the aforementioned points are supported by evaluation in heterogeneous setups, to bring a view on how to make optimal use of statistical & symbolic applications (i.e. neurosymbolic reasoning) in cognitive robotics. This work has been performed to enable context-adaptable artificial assistants, by bringing together knowledge on what is usually regarded as context-independent task knowledge

    Representation Learning for Natural Language Processing

    Get PDF
    This open access book provides an overview of the recent advances in representation learning theory, algorithms and applications for natural language processing (NLP). It is divided into three parts. Part I presents the representation learning techniques for multiple language entries, including words, phrases, sentences and documents. Part II then introduces the representation techniques for those objects that are closely related to NLP, including entity-based world knowledge, sememe-based linguistic knowledge, networks, and cross-modal entries. Lastly, Part III provides open resource tools for representation learning techniques, and discusses the remaining challenges and future research directions. The theories and algorithms of representation learning presented can also benefit other related domains such as machine learning, social network analysis, semantic Web, information retrieval, data mining and computational biology. This book is intended for advanced undergraduate and graduate students, post-doctoral fellows, researchers, lecturers, and industrial engineers, as well as anyone interested in representation learning and natural language processing

    Knowledge-Based Techniques for Scholarly Data Access: Towards Automatic Curation

    Get PDF
    Accessing up-to-date and quality scientific literature is a critical preliminary step in any research activity. Identifying relevant scholarly literature for the extents of a given task or application is, however a complex and time consuming activity. Despite the large number of tools developed over the years to support scholars in their literature surveying activity, such as Google Scholar, Microsoft Academic search, and others, the best way to access quality papers remains asking a domain expert who is actively involved in the field and knows research trends and directions. State of the art systems, in fact, either do not allow exploratory search activity, such as identifying the active research directions within a given topic, or do not offer proactive features, such as content recommendation, which are both critical to researchers. To overcome these limitations, we strongly advocate a paradigm shift in the development of scholarly data access tools: moving from traditional information retrieval and filtering tools towards automated agents able to make sense of the textual content of published papers and therefore monitor the state of the art. Building such a system is however a complex task that implies tackling non trivial problems in the fields of Natural Language Processing, Big Data Analysis, User Modelling, and Information Filtering. In this work, we introduce the concept of Automatic Curator System and present its fundamental components.openDottorato di ricerca in InformaticaopenDe Nart, Dari

    Artificial intelligence for ocean science data integration:current state, gaps, and way forward

    Get PDF
    • …
    corecore