57 research outputs found

    Automatic Annotation Service APPI : Named Entity Linking in Legal Domain

    Texts referencing court decisions and statutes can be difficult to understand without context. It can be time-consuming and expensive to find related statutes or to learn about context-specific terminology. As a solution, we utilized a named entity linking tool for extracting information and tailored it into a service, Appi, that can automatically annotate legal documents to provide context to the readers. The service can identify and link named entities and references to legal texts to corresponding vocabularies and data sources by combining statistics- and rule-based named entity recognition with named entity linking. The results provide users with an enhanced reading experience with contextual information and the possibility to access related materials, such as statutes and court decisions. Peer reviewed

    Analyzing biography collections historiographically as Linked Data : Case National Biography of Finland

    Biographical collections are available on the Web for close reading. However, the underlying texts can also be used for data analysis and distant reading, if the documents are available as data. Such data is usable for creating intelligent user interfaces to biographical data, including Digital Humanities tooling for visualizations, data analysis, and knowledge discovery in biographical and prosopographical research. In this paper, we re-use biographical collection data from a historiographical perspective for analyzing the underlying collection. For example: What kind of people have been included in the collection? Does the language used for describing female biographees differ from that for men? As a case study, the Finnish National Biography, available as part of the Linked Open Data service and semantic portal BiographySampo - Finnish Biographies on the Semantic Web, is used. The analyses show interesting results related to, e.g., how specific prosopographical groups, such as women or professional groups, are represented and portrayed. Various novel statistics and network analyses of the biographees are presented. Our analyses give new insights to the editors of the National Biography as well as to researchers in biography, prosopography, and historiography. The presented approach can also be applied to similar biography collections in other countries. Peer reviewed

    Extracting Knowledge from Parliamentary Debates for Studying Political Culture and Language

    Publisher Copyright: © 2022 Copyright for this paper by its authors. This paper presents knowledge extraction and natural language processing methods used to enrich the knowledge graph of the plenary debates (textual transcripts of speeches) of the Parliament of Finland. This knowledge graph includes some 960 000 speeches (1907–2021) interlinked with a prosopographical knowledge graph about the politicians. A recent subset of the speeches was used to extract named entities and topical keywords for semantic search and browsing of the data and for data analysis. The process is based on linguistic analysis, named entity linking, and automatic subject indexing. The results were included in the ParliamentSampo knowledge graph in a SPARQL endpoint. This data can be used for studying parliamentary language and culture in Digital Humanities research and for developing applications, such as the ParliamentSampo portal. Peer reviewed

    WarMemoirSampo : A Semantic Portal for War Veteran Interview Videos

    Publisher Copyright: © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). This paper presents WarMemoirSampo, a portal that provides semantic search and navigation of video interviews with Finnish World War II veterans. The portal associates video fragments with contextual data extracted from the video transcriptions, enabling users to find suitable video segments via faceted search and highlighting relevant content in the video being watched. This is carried out by processing natural language texts in order to extract named entities, keywords, and lemmas. The result is a Linked Data Knowledge Graph that underpins the portal. We describe the combination of Natural Language Processing and Semantic Web technologies used to produce these results. Peer reviewed