5,334 research outputs found

    Semantic Relation Extraction. Resources, Tools and Strategies

    [Abstract] Relation extraction is a subtask of information extraction that aims at obtaining instances of semantic relations present in texts. This information can be arranged in machine-readable formats, useful for several applications that need structured semantic knowledge. The work presented in this paper explores different strategies to automate the extraction of semantic relations from texts in Portuguese, Galician and Spanish. Both machine learning (distant-supervised and supervised) and rule-based techniques are investigated, and the impact of the different levels of linguistic knowledge is analyzed for the various approaches. Regarding domains, the experiments focus on the extraction of encyclopedic knowledge, by means of the development of biographical relation classifiers (in a closed domain) and the evaluation of an open information extraction tool. To implement the extraction systems, several natural language processing tools have been built for the three research languages, from sentence splitting and tokenization modules to part-of-speech taggers, named entity recognizers and coreference resolution systems. Furthermore, several lexica and corpora have been compiled and enriched with different levels of linguistic annotation, which are useful for both training and testing probabilistic and symbolic models. As a result of this work, new resources and tools are available for automated processing of texts in Portuguese, Galician and Spanish. Ministerio de Economía y Competitividad; FFI2014-51978-C2-1-R. Ministerio de Economía y Competitividad; FJCI-2014-2285

    DARIAH and the Benelux

    Creation, Enrichment and Application of Knowledge Graphs

    The world is in constant change, and so is the knowledge about it. Knowledge-based systems - for example, online encyclopedias, search engines and virtual assistants - are thus faced with the constant challenge of collecting this knowledge and, beyond that, understanding it and making it accessible to their users. Only if a knowledge-based system is capable of this understanding - that is, capable of more than just reading a collection of words and numbers without grasping their semantics - can it recognise relevant information and make it understandable to its users. The dynamics of the world play a unique role in this context: events of various kinds, relevant to different communities, are shaping the world, with examples ranging from the coronavirus pandemic to the matches of a local football team. Vital questions arise when dealing with such events: How to decide which events are relevant, and for whom? How to model these events so that knowledge-based systems can understand them? How is the acquired knowledge returned to the users of these systems? A well-established concept for making knowledge understandable by knowledge-based systems is the knowledge graph, which contains facts about entities (persons, objects, locations, ...) in the form of a graph, represents relationships between these entities and makes the facts understandable by means of ontologies. This thesis considers knowledge graphs from three different perspectives: (i) Creation of knowledge graphs: Even though the Web offers a multitude of sources that provide knowledge about the events in the world, the creation of an event-centric knowledge graph requires recognition of such knowledge, its integration across sources and its representation. (ii) Knowledge graph enrichment: Knowledge of the world seems to be infinite, and it seems impossible to grasp it entirely at any time. Therefore, methods that autonomously infer new knowledge and enrich the knowledge graphs are of particular interest. (iii) Knowledge graph interaction: Having all knowledge of the world available does not have any value in itself; it has to be made accessible to humans. Based on knowledge graphs, systems can provide their knowledge to their users without demanding any conceptual understanding of knowledge graphs from them. For this to succeed, means for interaction with the knowledge are required that hide the knowledge graph below the surface. In concrete terms, I present EventKG - a knowledge graph that represents the happenings in the world in 15 languages - as well as Tab2KG - a method for understanding tabular data and transforming it into a knowledge graph. For the enrichment of knowledge graphs without any background knowledge, I propose HapPenIng, which infers missing events from the descriptions of related events. I demonstrate means for interaction with knowledge graphs using the example of two web-based systems (EventKG+TL and EventKG+BT) that enable users to explore the happenings in the world as well as the most relevant events in the lives of well-known personalities.

    Access to recorded interviews: A research agenda

    Recorded interviews form a rich basis for scholarly inquiry. Examples include oral histories, community memory projects, and interviews conducted for broadcast media. Emerging technologies offer the potential to radically transform the way in which recorded interviews are made accessible, but this vision will demand substantial investments from a broad range of research communities. This article reviews the present state of practice for making recorded interviews available and the state-of-the-art for key component technologies. A large number of important research issues are identified, and from that set of issues, a coherent research agenda is proposed

    Large-scale data harvesting for biographical data

    This paper explores automatic methods to identify relevant biography candidates in large databases and to extract biographical information from encyclopedia entries and databases. In this work, relevant candidates are defined as people who have made an impact in a certain country or region within a pre-defined time frame. We investigate the case of people who had an impact in the Republic of Austria and died between 1951 and 2019. We use Wikipedia and Wikidata as data sources and compare the performance of our information extraction methods on these two databases. We demonstrate the usefulness of a natural language processing pipeline to identify suitable biography candidates and, in a second stage, extract relevant information about them. Even though many consider them identical resources, our results show that the data from Wikipedia and Wikidata differ in some cases and can be used in a complementary way, providing more data for the compilation of biographies.

    Recognizing meaning: semiotics in entrepreneurial research.

    Entrepreneurship is a process which involves discontinuity and change; entrepreneurs create disequilibria and exploit the resulting change. Thus, entrepreneurship is in essence change. This fundamental characteristic of entrepreneurship makes it difficult to pin down or even to categorise. But certain aspects of entrepreneurial change remain similar through space and time, so that the exploration of the signs and symbolism of enterprise can provide us with the tools to picture a continuity of meaning. Semiotics, the doctrine of signs, is a useful tool for exploring the depth and scope of what we mean by entrepreneurship. Consequently this chapter argues that an appreciation of entrepreneurial semiotics enables an understanding of the meanings of enterprise; what it is; how it is practised; why it is practised and why it is encouraged. Many of these meanings lie at the ideological level, they are taken for granted, often implicit, rarely explicit, but analysis of entrepreneurial symbolism gives us some purchase in understanding. By reading and analysis, the decoding of signifiers enables us to get beneath the taken-for-granted iconographic, to begin to understand the nature of entrepreneurial meanings

    An auto/biographical, cooperative study of our relationships to knowing

    In this thesis, I explore the relationship between knowing and self-construction among education professionals. The work addresses questions about our relationship with different ways of knowing and, within what I term a psychosocial framework, how the road to selfhood may lie in integrating different ways of knowing, including the rational, emotional, imaginal, embodied, creative, and spiritual. It also questions the tendency to idealize ‘experts’ and disembodied forms of knowledge that are widespread in (higher) education, and even in social and therapeutic work. Auto/biographically oriented co-operative inquiry was my chosen methodology. The research involved two groups of co-researchers based in two different countries, and included interviews with members of my own family. Exploration of my own reflexive relationship with my object of study shaped it into a quest for meaning and voice. I composed a multi-layered, multimedia, performative and circular textual understanding via processes of ‘spiralling’ and unfolding that were solidly rooted in a constructivist epistemology. I analysed both individual and group processes in the co-operative inquiry, looking at metaphors and engaging with crises of knowing and self to produce a fresh perspective on transformative research and professional becoming. I also drew on the ‘writing as inquiry’ approach to intertwine myself as knower with my interpretation, thus constantly interrogating the role of prose and poetic writing in pursuing authenticity and selfhood in relation to knowledge. In addition, I explored the evocative use of ‘cultural objects’ as a strategy for integrating subjective and objective sources of knowing. I conclude my dissertation by offering what has provisionally become, for me as author, a satisfying theory. Taking a view of the self as contingent, developmental and potentially agentic, I claim that by engaging more holistically with feeling, emotion, intuition, imagination and intellect, we may come to experience ourselves as more ‘real’ and integrated knowers.

    Transfer learning: bridging the gap between deep learning and domain-specific text mining

    Inspired by the success of deep learning techniques in Natural Language Processing (NLP), this dissertation tackles domain-specific text mining problems for which generic deep learning approaches would fail. More specifically, the domain-specific problems are: (1) success prediction in crowdfunding, (2) variant identification in biomedical literature, and (3) text data augmentation for low-resource domains. In the first part, transfer learning from a multimodal perspective is used to predict project success in crowdfunding. Even though the information in a project profile can be of different modalities such as text, images, and metadata, most existing prediction approaches leverage only the text modality. It is promising to use the images in project profiles to find out how they could contribute to success prediction. An advanced neural network scheme that combines information learned from different modalities is designed and evaluated for project success prediction. In the second part, transfer learning is combined with deep learning techniques to solve genomic variant Named Entity Recognition (NER) problems in biomedical literature. Most of the advanced generic NER algorithms can fail due to the restricted training corpus. However, those generic deep learning algorithms are capable of learning from a canonical corpus without any feature engineering effort. This work aims to build an end-to-end deep learning approach that transfers domain-specific knowledge to those advanced generic NER algorithms, addressing the challenges of low-resource training and requiring neither hand-crafted features nor post-processing rules. In the last part, transfer learning with knowledge distillation and active learning is used to solve text augmentation for low-resource domains. Most recent text augmentation methods rely heavily on large external resources. This work is dedicated to solving the text augmentation problem adaptively and consistently with minimal resources for token-level tasks like NER. The solution can also assure the reliability of machine labels for noisy data and can enhance training consistency with noisy labels. Each work is evaluated on its own domain-specific benchmarks. Experimental results demonstrate the effectiveness of the proposed methods and indicate promising potential for transfer learning in domain-specific applications.