10 research outputs found

    Neural information extraction from natural language text

    Natural language processing (NLP) deals with building computational techniques that allow computers to automatically analyze and meaningfully represent human language. With the exponential growth of data in this digital era, NLP-based systems enable us to easily access relevant information via a wide range of applications, such as web search engines and voice assistants. To achieve this, decades of research have focused on techniques at the intersection of NLP and machine learning. In recent years, deep learning techniques have exploited the expressive power of Artificial Neural Networks (ANNs) and achieved state-of-the-art performance on a wide range of NLP tasks. One vital property of Deep Neural Networks (DNNs) is that they can automatically extract complex features from the input data and thus provide an alternative to the manual process of handcrafted feature engineering. Besides ANNs, Probabilistic Graphical Models (PGMs), a coupling of graph theory and probabilistic methods, can describe the causal structure between the random variables of a system and capture a principled notion of uncertainty. Given these characteristics, DNNs and PGMs can be advantageously combined to build powerful neural models that capture the underlying complexity of data. Traditional machine-learning-based NLP systems employed shallow computational methods (e.g., SVMs or logistic regression) and relied on handcrafted features, which are time-consuming and complex to design and often incomplete. Deep learning and neural network based methods, by contrast, have recently shown superior results on various NLP tasks, such as machine translation, text classification, named-entity recognition, relation extraction and textual similarity, because these neural models automatically extract effective feature representations from the training data.
    This dissertation focuses on two NLP tasks: relation extraction and topic modeling. The former aims at identifying semantic relationships between entities or nominals within a sentence or document; successfully extracting these relationships contributes greatly to building structured knowledge bases, which are useful in downstream NLP applications such as web search, question answering and recommendation engines. The latter aims at uncovering the thematic structures underlying a collection of documents. Topic modeling is a popular text-mining tool for automatically analyzing a large collection of documents and understanding its topical semantics without actually reading the documents; in doing so, it generates word clusters (i.e., topics) and document representations, useful in document understanding and information retrieval, respectively. Essentially, both relation extraction and topic modeling are built upon the quality of the representations learned from text. In this dissertation, we have developed task-specific neural models for learning representations, coupled with the relation extraction and topic modeling tasks in the supervised and unsupervised machine learning paradigms, respectively. More specifically, we make the following contributions in developing neural models for NLP tasks:
    1. Neural Relation Extraction: First, we have proposed a novel recurrent neural network based architecture for table-filling in order to jointly perform entity and relation extraction within sentences. We have then extended the scope to extracting relationships between entities across sentence boundaries and presented a novel dependency-based neural network architecture; these two contributions lie in the supervised paradigm of machine learning. Moreover, we have contributed to building a robust relation extractor for settings constrained by the lack of labeled data, proposing a novel weakly supervised bootstrapping technique. Building on these contributions, we have further explored the interpretability of recurrent neural networks in order to explain their predictions for the relation extraction task.
    2. Neural Topic Modeling: Besides the supervised neural architectures, we have also developed unsupervised neural models to learn meaningful document representations within topic modeling frameworks. First, we have proposed a novel dynamic topic model that captures topics over time. Next, we have contributed to building static topic models without temporal dependencies, presenting neural topic modeling architectures that also exploit external knowledge, i.e., word embeddings, to address data sparsity. Moreover, we have developed neural topic models that incorporate knowledge transfer using both word embeddings and latent topics from many sources. Finally, we have shown how to improve neural topic modeling by introducing language structures (e.g., word ordering and local syntactic and semantic information) that address the bag-of-words limitations of traditional topic models.
    The proposed neural NLP models are based on techniques at the intersection of PGMs, deep learning and ANNs. Neural relation extraction employs neural networks to learn representations typically at the sentence level, without access to the broader document context, whereas topic models have access to statistical information across documents. We therefore combine the two complementary learning paradigms in a neural composite model, consisting of a neural topic model and a neural language model, that enables us to jointly learn thematic structures in a document collection via the topic model and word relations within a sentence via the language model. Overall, our research contributions in this dissertation extend NLP-based systems for the relation extraction and topic modeling tasks with state-of-the-art performance.
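    As a purely illustrative sketch (with assumed details, not the dissertation's table-filling or dependency-based architectures), the following minimal PyTorch model shows the general shape of a supervised neural relation extractor: a bidirectional LSTM encodes the sentence and a linear layer classifies the relation between two marked entities. All names and hyperparameters below are hypothetical.

    # Hedged sketch of a BiLSTM relation classifier; not the thesis architecture.
    import torch
    import torch.nn as nn

    class BiLSTMRelationClassifier(nn.Module):
        def __init__(self, vocab_size, num_relations, emb_dim=100, hidden_dim=128):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            self.encoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                                   bidirectional=True)
            # Relation logits from the concatenated BiLSTM states at the two entity positions.
            self.classifier = nn.Linear(4 * hidden_dim, num_relations)

        def forward(self, token_ids, e1_pos, e2_pos):
            # token_ids: (batch, seq_len); e1_pos, e2_pos: (batch,) entity head indices.
            states, _ = self.encoder(self.embed(token_ids))  # (batch, seq_len, 2 * hidden_dim)
            rows = torch.arange(token_ids.size(0))
            pair = torch.cat([states[rows, e1_pos], states[rows, e2_pos]], dim=-1)
            return self.classifier(pair)                     # (batch, num_relations)

    # Toy forward pass on random token ids.
    model = BiLSTMRelationClassifier(vocab_size=5000, num_relations=10)
    logits = model(torch.randint(0, 5000, (2, 12)), torch.tensor([1, 0]), torch.tensor([4, 7]))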

    Integrality and cutting planes in semidefinite programming approaches for combinatorial optimization

    Many real-life decision problems are discrete in nature. To solve such problems as mathematical optimization problems, integrality constraints are commonly incorporated in the model to reflect the choice among finitely many alternatives. At the same time, semidefinite programming is known to be very suitable for obtaining strong relaxations of combinatorial optimization problems. In this dissertation, we study the interplay between semidefinite programming and integrality, with a special focus on the use of cutting-plane methods. Although the notions of integrality and cutting planes are well studied in linear programming, integer semidefinite programs (ISDPs) have been considered only recently. We show that many combinatorial optimization problems can be modeled as ISDPs. Several theoretical concepts, such as the Chvátal-Gomory closure, total dual integrality and integer Lagrangian duality, are studied for the case of integer semidefinite programming. On the practical side, we introduce an improved branch-and-cut approach for ISDPs and a cutting-plane augmented Lagrangian method for solving semidefinite programs with a large number of cutting planes. Throughout the thesis, we apply our results to a wide range of combinatorial optimization problems, among which the quadratic cycle cover problem, the quadratic traveling salesman problem and the graph partition problem. Our approaches lead to novel, strong and efficient solution strategies for these problems, with the potential to be extended to other problem classes.
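    For reference, an integer semidefinite program can be written in a generic standard form as the following math sketch; this describes the problem class discussed above rather than a formulation taken from the thesis, and the data C, A_i, b_i are placeholders.

    % Generic ISDP in standard form (sketch); dropping integrality gives the SDP relaxation.
    \min_{X \in \mathbb{S}^n} \; \langle C, X \rangle
    \quad \text{s.t.} \quad \langle A_i, X \rangle = b_i, \; i = 1, \dots, m,
    \qquad X \succeq 0, \qquad X \in \mathbb{Z}^{n \times n}

    Here \langle C, X \rangle = \mathrm{trace}(C^{\top} X) and X \succeq 0 requires X to be positive semidefinite; cutting planes (for instance of Chvátal-Gomory type) can then be added to strengthen the relaxation obtained by dropping the integrality constraint.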

    Counting Problems on Quantum Graphs: Parameterized and Exact Complexity Classifications

    Quantum graphs, as defined by Lovász in the late 60s, are formal linear combinations of simple graphs with finite support. They allow for the complexity analysis of the problem of computing finite linear combinations of homomorphism counts, which constitute the foundation of the structural hardness theory for parameterized counting problems: the framework of parameterized counting complexity was introduced by Flum and Grohe, and by McCartin, in 2002 and forms a hybrid between the classical field of computational counting, founded by Valiant in the late 70s, and the paradigm of parameterized complexity theory due to Downey and Fellows, which originated in the early 90s. The problem of computing homomorphism numbers of quantum graphs subsumes general motif counting problems, and its complexity-theoretic implications have only recently come to light through a breakthrough on the parameterized subgraph counting problem by Curticapean, Dell and Marx in 2017. We study the problems of counting partially injective and edge-injective homomorphisms, counting induced subgraphs, as well as counting answers to existential first-order queries. We establish novel combinatorial, algebraic and even topological properties of quantum graphs that allow us to provide exhaustive parameterized and exact complexity classifications, including necessary, sufficient and mostly explicit tractability criteria, for all of the previous problems.
    (German abstract, translated:) This thesis is concerned with the complexity analysis of mathematical problems that can be expressed as linear combinations of graph homomorphism counts. To this end it employs so-called quantum graphs, which are formal linear combinations of graphs and were introduced by Lovász in the late 60s. The complexity of such problems is determined under the paradigm of parameterized counting complexity theory, introduced by Flum, Grohe and McCartin in 2002, which can be understood as a hybrid of the classical counting complexity theory founded by Valiant in the late 70s and the parameterized analysis introduced by Downey and Fellows in the early 90s. Computing homomorphism numbers between quantum graphs and graphs subsumes, in the broadest sense, all problems that require counting small patterns in large structures. Building on the resulting breakthrough by Curticapean, Dell and Marx concerning the subgraph counting problem, this thesis analyzes the problems of counting partially injective and edge-injective homomorphisms, induced subgraphs, and answers to relational database queries expressible as existential formulas. In particular, new combinatorial, algebraic and topological properties of quantum graphs are established that yield sufficient, necessary and mostly explicit criteria for the existence of efficient algorithms.
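    To make the central object above concrete, the following minimal Python sketch counts homomorphisms from a pattern graph into a host graph by brute-force enumeration and then evaluates a quantum graph, i.e. a finite linear combination of such counts. The graph encoding and function names are ours and purely illustrative; the exponential-time enumeration only serves to make the definition explicit.

    # Hedged sketch: brute-force homomorphism counting and quantum-graph evaluation.
    from itertools import product

    def hom_count(H, G):
        """Number of homomorphisms (edge-preserving vertex maps) from pattern H to host G."""
        VH, EH = H
        VG, EG = G
        total = 0
        for image in product(VG, repeat=len(VH)):
            phi = dict(zip(VH, image))
            if all((phi[u], phi[v]) in EG or (phi[v], phi[u]) in EG for u, v in EH):
                total += 1
        return total

    def quantum_value(quantum_graph, G):
        """Evaluate a quantum graph, given as (coefficient, graph) pairs, on host G."""
        return sum(coeff * hom_count(H, G) for coeff, H in quantum_graph)

    # Toy example: an edge and a triangle mapped into the 4-cycle C4.
    edge = ([0, 1], {(0, 1)})
    triangle = ([0, 1, 2], {(0, 1), (1, 2), (0, 2)})
    c4 = ([0, 1, 2, 3], {(0, 1), (1, 2), (2, 3), (3, 0)})
    print(hom_count(edge, c4))                             # 8, i.e. twice the number of edges of C4
    print(quantum_value([(1, edge), (-1, triangle)], c4))  # 8, since C4 is triangle-free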

    Scholarly Communication Librarianship and Open Knowledge

    The intersection of scholarly communication librarianship and open education offers a unique opportunity to expand knowledge of scholarly communication topics in both education and practice. Open resources can address the gap in teaching timely and critical scholarly communication topics (copyright in teaching and research environments, academic publishing, emerging modes of scholarship, impact measurement) while increasing access to resources and equitable participation in education and scholarly communication. Scholarly Communication Librarianship and Open Knowledge is an open textbook and practitioner’s guide that collects theory, practice, and case studies from nearly 80 experts in scholarly communication and open education. It is divided into three parts:
    * What is Scholarly Communication?
    * Scholarly Communication and Open Culture
    * Voices from the Field: Perspectives, Intersections, and Case Studies
    The book delves into the economic, social, policy, and legal aspects of scholarly communication as well as open access, open data, open education, and open science and infrastructure. Practitioners provide insight into the relationship between university presses and academic libraries, defining collection development as operational scholarly communication, and promotion and tenure and the challenge for open access. Scholarly Communication Librarianship and Open Knowledge is a thorough guide meant to increase instruction on scholarly communication and open education issues and practices so library workers can continue to meet the changing needs of students and faculty. It is also a political statement about the future to which we aspire and a challenge to the industrial, commercial, capitalistic tendencies encroaching on higher education. Students, readers, educators, and adaptors of this resource can find and embrace these themes throughout the text and embody them in their work.

    Modelling Software Project Management Complexity - An Assessment Model

    During the last years, more and more businesses have used projectised organisation as an organisational structure to tackle the complex problems involved in implementing their strategic objectives. A significant number of these projects were, or are, challenged or even failed to meet their initial requirements in terms of cost, time and quality. This phenomenon is more intense in software projects, due to their special characteristics stemming from the dynamic, continuously changing environment in which they operate and from the nature of software itself. Most of these failures were attributed to complexity, which exists in various forms and levels in all projects. Many studies have attempted to identify the sources of project complexity and define an appropriate complexity typology for capturing it. However, most of these studies are theoretical and only a limited number propose models capable of evaluating or measuring project complexity. This research acknowledges the endogenous character of complexity in projects but, instead of trying to identify dimensions of this complexity, focuses on the complexity in the interfaces between project processes, project management processes and project managers, which constitute the critical point for successful project execution. The proposed framework can be used to highlight the most significant complexity areas, whether organisation-specific or project-specific, thereby providing the awareness needed for better, more efficient and more effective project management. The approach followed in the framework design identifies the variation in the perception of complexity between different organisations. It allows organisations to evaluate the complexity of projects and provides them with important information that assists the project selection process. It identifies the significance of people’s knowledge and experience, and more generally of an organisation’s management maturity and capabilities, in handling complexity, as revealed through the findings of this research. Furthermore, it treats complexity as a variable that can be measured and proposes a model for doing so. To implement this framework, an extended literature review was initially performed to identify the complexity factors arising from project management aspects. Subsequently, statistical methods were used for processing and refining the identified factors, resulting in the final set of measures used in the framework. Finally, the proposed model was validated through the application of a case study methodology.

    Postsecondary peer cooperative learning programs: Annotated bibliography 2018

    This 2018 annotated bibliography reviews seven postsecondary peer cooperative learning programs that have been implemented nationally and internationally to increase student achievement. An extensive literature search was conducted of published journal articles, newspaper accounts, book chapters, books, ERIC documents, theses and dissertations, online documents, and unpublished reports. Peer learning programs in this bibliography meet the following characteristics: (a) the program must have been implemented at the postsecondary or tertiary level, (b) the program has a clear set of systematic procedures for its implementation at an institution, (c) program evaluation studies have been conducted and are available for review, (d) the program intentionally embeds learning strategy practice along with a review of the academic content material, (e) program outcomes include both increased content knowledge and higher persistence rates, and (f) the program has been replicated at another institution with similar positive student outcomes. From a review of the professional literature, nearly 1,500 citations emerged concerning seven programs that met the previously mentioned selection criteria: "Accelerated Learning Groups" (ALGs), "Emerging Scholars Program" (ESP), "Peer-Assisted Learning" (PAL), "Peer-Led Team Learning" (PLTL), "Structured Learning Assistance" (SLA), "Supplemental Instruction" (SI), and "Video-based Supplemental Instruction" (VSI). Nearly one fourth of the entries in this bibliography are from authors and researchers outside of the United States. Guidance is provided to implement best practices of peer learning programs that can improve academic achievement, persistence to graduation, and professional growth of participants and facilitators of these student-led groups. The literature reports not only positive outcomes for the student participants of such programs, but also outcomes for the student peer leaders of these academic support programs, such as skill improvement in leadership, public speaking, and other employment skills, along with an impact on their future vocational choices, including a career in teaching at the secondary or postsecondary level. Educators need to investigate these peer learning programs to discover effective learning practices that can be adapted and adopted for use in supporting higher student achievement for students of diverse backgrounds. [This annotated bibliography is a revised and expanded version of ED565496, ED545639, ED489957, and ED574832.]

    A Holmes and Doyle Bibliography, Volume 5: Periodical Articles--Secondary References, Alphabetical Listing

    This bibliography is a work in progress. It attempts to update Ronald B. De Waal’s comprehensive bibliography, The Universal Sherlock Holmes, but does not claim to be exhaustive in content. New works are continually discovered and added to this bibliography. Readers and researchers are invited to suggest additional content. Volume 5 includes "passing" or "secondary" references, i.e., those entries that are passing in nature or contain very brief information or content.

    A Holmes and Doyle Bibliography, Volume 6: Periodical Articles, Subject Listing, By De Waal Category

    This bibliography is a work in progress. It attempts to update Ronald B. De Waal’s comprehensive bibliography, The Universal Sherlock Holmes, but does not claim to be exhaustive in content. New works are continually discovered and added to this bibliography. Readers and researchers are invited to suggest additional content. Volume 6 presents the periodical literature arranged by subject categories (as originally devised for the De Waal bibliography and slightly modified here).

    A Holmes and Doyle Bibliography, Volume 9: All Formats—Combined Alphabetical Listing

    This bibliography is a work in progress. It attempts to update Ronald B. De Waal’s comprehensive bibliography, The Universal Sherlock Holmes, but does not claim to be exhaustive in content. New works are continually discovered and added to this bibliography. Readers and researchers are invited to suggest additional content. This volume contains all listings in all formats, arranged alphabetically by author or main entry. In other words, it combines the listings from Volume 1 (Monograph and Serial Titles), Volume 3 (Periodical Articles), and Volume 7 (Audio/Visual Materials) into a comprehensive bibliography. (There may be additional materials included in this list, e.g. duplicate items and items not yet fully edited.) As in the other volumes, coverage of this material begins around 1994, the final year covered by De Waal's bibliography, but may not yet be totally up-to-date (given the ongoing nature of this bibliography). It is hoped that other titles will be added at a later date. At present, this bibliography includes 12,594 items.