15 research outputs found

    Prof. Ivan Batakliev – Builder of the Geographical Sciences in Bulgaria

    Prof. Ivan Batakliev (1891-1973) was one of the main figures in the foundation and early development of the geographical sciences in Bulgaria. His principal scientific works concern political geography, geopolitics and anthropogeography, and he founded these disciplines in Bulgaria. Prof. Batakliev also made important contributions to physical geography, the geography of population and settlements, economic geography, the methodology of geographical education, and regional geography, among them the first landscape regionalization of Bulgaria (1934), the first climate regionalization of Bulgaria (1941) and fundamental works on the town and region of Pazardzhik (1922, 1923, 1969).

    Acad. Anastas Beshkov – Founder of the Regional Geographical Researches in Bulgaria

    Acad. Anastas Beshkov (1896-1964) was one of the leading Bulgarian geographic researchers of the 20th century. His fundamental work is connected with the development of regional geographical research in Bulgaria and with the first economic-geographic regionalization of Bulgaria (1934). The scientific work and expert assessments of Acad. Beshkov relate to important national economic projects, such as the importance of transport for the development of the economy and settlements, the resolution of transport problems in Dobrudzha, the siting of the fertilizer factory near the town of Stara Zagora, and the idea for the Varna-Devnya canal, which was developed into a project in 1965 and realized in 1975.

    Academician Anastas Ishirkov – A Life Dedicated to the Geographic Science and to Bulgaria

    This article aims to give the founder of Bulgarian geographic science, Acad. Anastas Ishirkov, his place of honour in this collection of papers, and also to highlight some lesser-known details of his research, socio-political and charitable activities. At the same time, the authors correct some inaccuracies in his biography that have accumulated over time and add new information about his life and work. This was made possible by research in the Archive of the Bulgarian Academy of Sciences and in the materials about him in the Gipson Archive, as well as by a careful reading of Acad. Ishirkov's handwritten autobiography.

    Deliverable D4.9 Project logo, marketing starter pack and website running

    The following report presents the initial project branding and marketing products that showcase the project’s visual identity and overall corporate appearance. As a foundation for future effective communication activities, it is crucial to establish a sound set of working dissemination tools and materials within the first months of the project. A project logo, project promotional materials, an overall visual identity package, and a public website (www.showcase-project.eu) were developed in the first 4 months of the project in order to form the main tools of project public visibility and internal communication. The project is provided with a logo that has been communicated and coordinated with all project partners. Dissemination materials such as the SHOWCASE brochure and poster were produced for raising awareness and engaging stakeholders at events. A project brand manual was created and circulated among project partners in order to provide a consistent visual representation of the project. A set of corporate templates was also produced and made available to the consortium partners to facilitate future dissemination and reporting activities such as letters, milestones and deliverable reports, PowerPoint presentations, etc. The project website is developed as the main dissemination channel. The longer-term impact of the project's results will be secured by maintaining the website for a minimum of 5 years after the end of the project.

    The Pensoft Annotator: A new tool for text annotation with ontology terms

    Introduction: Digitisation of biodiversity knowledge from collections, scholarly literature and various research documents is an ongoing mission of the Biodiversity Information Standards (TDWG) community. Organisations such as the Biodiversity Heritage Library make historical biodiversity literature openly available and develop tools to allow biodiversity data reuse and interoperability. For instance, Plazi transforms free text into machine-readable formats, extracts collection data and feeds it into the Global Biodiversity Information Facility (GBIF) and other aggregators. All of these digitisation workflows require a lot of effort to develop and implement in practice. In essence, these digitisation activities entail mapping free text to concepts from recognised vocabularies or ontologies in order to make the content understandable to computers. Aim: We aim to address the problem of mapping free text to ontological terms ("strings to things") with our tool for text-to-ontology mapping: the Pensoft Annotator. Methods & Implementation: The Annotator is a web application that performs direct text matching to terms from any ontology or vocabulary list given as input to the Annotator. The term 'ontology' is used loosely here and means a collection of terms and their synonyms, where terms are uniquely identified via a Uniform Resource Identifier (URI). The Annotator accepts ontologies in common formats (e.g. OBO, OWL, RDF/XML) but does not require the existence of a proper ontology structure (logical statements). We use the ROBOT command line tool to convert any of these formats to JSON. After the upload of a new ontology, the Annotator processes the ontology terms by normalising all exact synonyms and by removing all of the other synonyms (related, narrow and broad synonyms). 
This is done to limit the number of false positive matches and to preserve the semantic similarity between the matched ontology term and the text. After matching the words in the input text and the ontology term labels, the Pensoft Annotator returns a table of matched ontology terms including the following fields: the identifier of the ontology term, the ontology term label or the label of the synonym, the starting position of the matched term in the text, the term context (words surrounding the matched term in the text), the type of ontology term (class or property), the ontology from which the matched term originates and the number of times a given term is mentioned in the text. The Pensoft Annotator allows simultaneous annotation with multiple ontologies. To better visualise the exact ontology from which a matching term has been found, the terms are highlighted in different colours depending on the ontology. The Pensoft Annotator is also accessible programmatically via an Application Programming Interface (API), documented at https://annotator.pensoft.net/api. Discussion & Use Cases: The Pensoft Annotator provides functionalities that will aid the transformation of free text to collections of semantic resources. However, it still requires expert knowledge to use, as the ontologies need to be selected carefully. Some false positive matches from the annotation are possible because we do not perform semantic analysis of the texts. False negatives are also possible since there might be different word forms of ontology terms, which are not direct matches to them (e.g. 'wolf' and 'wolves'). For this reason, matched terms can be reviewed and removed from the results within the web interface of the Pensoft Annotator. After removal of terms, they will not be present in the downloaded results. The Pensoft Annotator can be used to annotate biodiversity and taxonomic literature to help with the extraction of biodiversity knowledge (e.g. 
species habitat preferences, species interaction data, localities, biogeographic data). The existence of some domain and taxon-specific ontologies, such as the Hymenoptera Anatomy Ontology, provides further opportunities for context-specific annotation. Semantic analysis of unstructured texts could be applied in addition to ontology annotation to improve the accuracy of ontology term matching and to filter out mismatched terms. Annotation of structured or semi-structured text (e.g. tables) tends to be more successful. A recent example demonstrates the use of the Annotator to extract biotic interactions from tables (Dimitrova et al. 2020). The Annotator could also be used for ontology analysis and comparison. Annotation of text can help to discover gaps in ontologies as well as inaccurate synonyms. For instance, a certain word could be recognised as an ontology term match because it is an exact synonym in the ontology, but in reality it might be more accurate to mark it as a related synonym. In addition, annotation with multiple ontologies can help to elucidate links between ontologies.
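
    The direct matching step described in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Annotator's actual implementation: the vocabulary, its example.org identifiers, and the function names normalise and annotate are all hypothetical, and only labels and exact synonyms are matched on word boundaries.

```python
import re
from collections import Counter

# Hypothetical mini-vocabulary (an assumption for illustration):
# term URI -> (label, list of exact synonyms).
VOCABULARY = {
    "http://example.org/onto/0001": ("forest", ["woodland"]),
    "http://example.org/onto/0002": ("wolf", []),
}


def normalise(name):
    """Lower-case a label and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", name.strip().lower())


def annotate(text, vocabulary, context_chars=15):
    """Match labels and exact synonyms directly against the text.

    Returns one row per match, sorted by position, with the term
    identifier, the matched label, the start offset, the surrounding
    context and the number of mentions of that term in the text.
    """
    rows = []
    for uri, (label, synonyms) in vocabulary.items():
        for name in [label, *synonyms]:
            pattern = re.compile(r"\b" + re.escape(normalise(name)) + r"\b", re.IGNORECASE)
            for match in pattern.finditer(text):
                start, end = match.span()
                rows.append({
                    "id": uri,
                    "label": name,
                    "start": start,
                    "context": text[max(0, start - context_chars):end + context_chars],
                })
    mentions = Counter(row["id"] for row in rows)
    for row in rows:
        row["mentions"] = mentions[row["id"]]
    return sorted(rows, key=lambda row: row["start"])


SAMPLE = "A wolf crossed the woodland; wolves avoid the forest edge."
for row in annotate(SAMPLE, VOCABULARY):
    print(row["start"], row["label"], row["id"], row["mentions"])
```

    Because the matching is purely lexical, 'wolves' in the sample sentence is not matched to the term 'wolf', which mirrors the false-negative case mentioned in the abstract.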

    The OpenBiodiv Knowledge Graph Rebuilt: A semantic hub on top of the ARPHA-published content and the Biodiversity Literature Repository

    OpenBiodiv is a complex ecosystem of tools and services for RDF conversion of the XML narratives of biodiversity articles, including Darwin Core data, into Linked Open Data (LOD), running on top of a graph database. OpenBiodiv provides four main types of services: (1) searching named entities (e.g., taxon names, taxon concepts, treatments, specimens, occurrences, gene sequences, bibliographic information, institutions, persons) in context, within and between articles; (2) answering questions based on the presence of certain named entities within specific article sections (e.g., titles, abstracts, introduction or other sections, taxon treatments); (3) identifying article sections for further text processing (NLP) and providing contextual information, stored in MongoDB; (4) federating the SPARQL endpoint with other triple stores to enrich the discovered knowledge. Conversion of such data into RDF follows a general semantic model expressed in the OpenBiodiv-O ontology, an extension of the Treatment Ontology for knowledge representation of current and legacy biodiversity publications (Senderov et al. 2018), and uses two main sources: the full-text article XML published on the ARPHA Publishing Platform and the taxon treatments extracted by Plazi’s TreatmentBank from more than 100 biodiversity journals, stored in the Biodiversity Literature Repository at Zenodo. To ensure efficiency, quality control and fast tracking of all stages, the entire process of extraction, conversion to RDF and indexing of the content has been re-built on the Apache Kafka event streaming platform (Fig. 1). 
In this new format, OpenBiodiv not only provides a GraphDB SPARQL query endpoint but also indexes the named entities through Elasticsearch and delivers data to end users through a RESTful API and a number of user applications. OpenBiodiv is designed for a wide range of users who are interested in deep bibliographic exploration, an ontology-linked search of various data elements (e.g., specimens, sequences, taxon concepts, persons), or the co-occurrence of named entities (e.g., taxon names with possible biotic relationships between them, or taxon names and potential habitats) in pre-defined sections of the articles. The SPARQL endpoint allows complex queries of various kinds (Dimitrova et al. 2021).
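
    To make the idea of converting article XML into RDF more concrete, the following minimal sketch turns a small article fragment into N-Triples lines. It is not the OpenBiodiv converter: the XML element names, the example.org namespace and the predicates title, partOf, sectionType and mentions are all assumptions chosen for illustration, and the real model is defined by OpenBiodiv-O.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace and predicate names, chosen for illustration only.
BASE = "http://example.org/"
VOCAB = BASE + "vocab/"

# Invented article fragment; the element and attribute names are assumptions.
ARTICLE_XML = """
<article id="a1">
  <title>Notes on the genus Prosopistoma</title>
  <sec type="abstract">The genus <taxon-name>Prosopistoma</taxon-name> is discussed.</sec>
  <sec type="treatment"><taxon-name>Prosopistoma pennigerum</taxon-name> is treated here.</sec>
</article>
"""


def uri(value):
    """Serialise an IRI as an N-Triples term."""
    return "<" + value + ">"


def literal(value):
    """Serialise a plain string literal, escaping backslashes and quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def article_to_ntriples(xml_text):
    """Turn the article, its sections and taxon-name mentions into N-Triples lines."""
    root = ET.fromstring(xml_text.strip())
    article = BASE + root.get("id")
    triples = [(uri(article), uri(VOCAB + "title"), literal(root.findtext("title")))]
    for number, section in enumerate(root.findall("sec"), start=1):
        section_uri = uri(f"{article}/sec/{number}")
        triples.append((section_uri, uri(VOCAB + "partOf"), uri(article)))
        triples.append((section_uri, uri(VOCAB + "sectionType"), literal(section.get("type"))))
        for name in section.findall("taxon-name"):
            triples.append((section_uri, uri(VOCAB + "mentions"), literal(name.text)))
    return [f"{s} {p} {o} ." for s, p, o in triples]


for line in article_to_ntriples(ARTICLE_XML):
    print(line)
```

    Keeping the section type on each section resource is what later allows questions of the form "which entities are mentioned in which kind of section" to be answered from the graph.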

    Deliverable D6.4 Applications for interoperable access to OpenBiodiv through semantically enhanced queries

    To the best of our knowledge, OpenBiodiv is the first production-stage semantic system running on top of a reasonably-sized biodiversity knowledge graph. It stores biodiversity data in a semantic interlinked format and offers facilities for working with it (Senderov and Penev 2016, Senderov et al. 2018, Penev et al. 2019, Dimitrova et al. 2021). It is a dynamic system that continuously updates its database as new biodiversity information is made available by several international biodiversity publishers. It also allows its users to ask complex queries via SPARQL (a query language for semantic graph databases) and a simplified semantic search interface. OpenBiodiv was created during two EU-funded Marie Sklodowska-Curie PhD projects: BIG4 (Grant Agreement No 642241) and IGNITE (Grant Agreement No 764840). During those projects, the backend Ontology-0, the first versions of the RDF converters and the basic website functionalities were created (see Dimitrova et al. 2021 for an overview). After the start of the BiCIKL project, the entire workflow for processing and RDF conversion of full-text articles in XML and Plazi’s treatments in XML was re-built using up-to-date technological solutions (such as Apache Kafka and Elasticsearch) to fully automate and speed up the conversion process and to make it trackable and efficient. As a result, the entire graph content has been re-processed and indexed. New user applications described in Milestone MS27 App specifications have been discussed and implemented. The present deliverable describes the newly built workflow and tools for data extraction, conversion and indexing, and the user applications created in the BiCIKL project.

    OpenBiodiv for Users: Applications and Approaches to Explore a Biodiversity Knowledge Graph

    OpenBiodiv is a biodiversity database, a knowledge graph based on the Resource Description Framework (RDF), that contains information extracted from the scientific literature. It provides access to an ecosystem of tools and services, including a Linked Open Dataset, an ontology (OpenBiodiv-O) and a website (Dimitrova et al. 2021). Using the available data, OpenBiodiv discovers links between various biodiversity data types (e.g., taxon names, treatments, specimens, sequences, people and institutions) to answer a user’s questions about specific taxa, scientific articles, materials examined and others. Full-text XML content from journals on the ARPHA Publishing Platform and treatments extracted by Plazi’s TreatmentBank (stored in the Biodiversity Literature Repository at Zenodo) are converted into Linked Open Data. The database is updated and indexed daily using a workflow based on the Apache Kafka event-streaming platform. The workflow was developed during the European Union-funded Biodiversity Community Integrated Knowledge Library (BiCIKL) project (Penev et al. 2022b). By 1 August 2023, the graph consisted of 24,939 articles; 167,471 treatments; 130,359 authors; 736,809 taxon names; 129,257 sequences; 1,390 institutions and collections; 117,854 figures; 18,585 tables; and 90,008 materials examined sections. Each semantic statement (e.g., authors, articles, treatments, taxonomic names, localities) has its own globally unique, persistent and resolvable identifier (GUPRI). There are four ways a user can explore the data on OpenBiodiv. General search: The search engine is accessible from the OpenBiodiv homepage. The user types in a key term (e.g., a taxonomic name, authority or an article title), and the system retrieves information about it. The Elasticsearch index makes the search tolerant of misspellings. 
It can also determine the semantic type of the searched entity. Application Programming Interface (API): OpenBiodiv can be used through a RESTful API for programmatic access. The API is documented with Swagger. The API construction and functionalities follow the recommendations elaborated by the Technical Research Infrastructures forum of the BiCIKL project (Addink et al. 2023). User applications based on a query algorithm: This function can be applied to any data class. The method uses the relationships between an element type (e.g., taxon name) and the type of section in which it can be found. An application example is Literature exploration, designed to answer the question: Give me information about X mentioned within article section type Y. The results show the number of mentions of the entity (e.g., taxon name) in the section(s) of interest (e.g., Title, Abstract, Treatment). A click navigates the user to the place in the article that mentions the item (Fig. 1). SPARQL queries in a thematic context: OpenBiodiv provides a SPARQL endpoint through the Ontotext GraphDB solution. Several sample SPARQL queries are also available on the OpenBiodiv website.
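
    The question "Give me information about X mentioned within article section type Y" can be illustrated with a small in-memory sketch. It does not use the OpenBiodiv API or SPARQL endpoint: the Mention record, the sample data and the function count_mentions are hypothetical stand-ins for the relationship between an entity and the section type in which it occurs.

```python
from collections import Counter
from typing import NamedTuple


class Mention(NamedTuple):
    """One occurrence of a named entity in a section of an article (hypothetical record)."""
    article: str
    section_type: str
    entity: str


# Invented sample data, for illustration only.
MENTIONS = [
    Mention("article-1", "Title", "Prosopistoma"),
    Mention("article-1", "Abstract", "Prosopistoma"),
    Mention("article-1", "Abstract", "Prosopistoma"),
    Mention("article-1", "Treatment", "Prosopistoma pennigerum"),
    Mention("article-2", "Abstract", "Prosopistoma"),
    Mention("article-2", "Introduction", "Prosopistoma"),
]


def count_mentions(mentions, entity, section_types):
    """Count mentions of `entity` per (article, section type) within the wanted section types."""
    wanted = {section.lower() for section in section_types}
    counts = Counter()
    for mention in mentions:
        if mention.entity.lower() == entity.lower() and mention.section_type.lower() in wanted:
            counts[(mention.article, mention.section_type)] += 1
    return counts


result = count_mentions(MENTIONS, "Prosopistoma", ["Title", "Abstract"])
for (article, section_type), number in sorted(result.items()):
    print(article, section_type, number)
```

    In a real knowledge graph the same grouping would be expressed as a query over entity and section resources rather than over a Python list.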

    Infrastructure and Population of the OpenBiodiv Biodiversity Knowledge Graph

    OpenBiodiv is a biodiversity knowledge graph containing a synthetic linked open dataset, OpenBiodiv-LOD, which combines knowledge extracted from academic literature with the taxonomic backbone used by the Global Biodiversity Information Facility. The linked open data is modelled according to the OpenBiodiv-O ontology, which integrates semantic resource types from recognised biodiversity and publishing ontologies with OpenBiodiv-O resource types introduced to capture the semantics of resources not modelled before. We introduce the new release of OpenBiodiv-LOD, attained through information extraction and modelling of additional biodiversity entities. It was achieved by further developments to OpenBiodiv-O, the data storage infrastructure, and the workflow and accompanying R software packages used for the transformation of academic literature into Resource Description Framework (RDF). We discuss how to utilise the LOD in biodiversity informatics and give examples by providing solutions to several competency questions. We investigate performance issues that arise due to the large number of inferred statements in the graph and conclude that OWL-full inference is impractical for the project and that unnecessary inference should be avoided.
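
    The remark about inferred statements can be made tangible with a toy calculation. The sketch below materialises only the transitivity of a subclass relation over a hypothetical chain of classes; it is far simpler than OWL-full inference and says nothing about the actual size of the OpenBiodiv graph, but it shows how inferred statements can outgrow asserted ones. The class names and both functions are assumptions made for illustration.

```python
def subclass_chain(n):
    """Asserted statements: class_0 subClassOf class_1 subClassOf ... class_(n-1)."""
    return {(f"class_{i}", f"class_{i + 1}") for i in range(n - 1)}


def transitive_closure(pairs):
    """Materialise every subclass statement implied by transitivity."""
    closure = set(pairs)
    while True:
        inferred = {(a, d) for a, b in closure for c, d in closure if b == c}
        if inferred <= closure:
            return closure
        closure |= inferred


asserted = subclass_chain(20)
materialised = transitive_closure(asserted)
print(len(asserted), "asserted statements")
print(len(materialised), "statements after materialising transitivity")
```

    For a chain of n classes the asserted statements grow linearly (n - 1), while the materialised ones grow quadratically (n(n - 1)/2), which gives some intuition for why the abstract recommends avoiding unnecessary inference.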

    OpenBiodiv: Linking Type Materials, Institutions, Locations and Taxonomic Names Extracted From Scholarly Literature

    OpenBiodiv is a knowledge management system containing biodiversity knowledge extracted from scholarly literature: both articles recently published in Pensoft's journals and legacy literature (taxon treatments extracted by Plazi) (Senderov et al. 2017). OpenBiodiv advances our understanding of the use of scientific names, collection codes and institutions within published literature by using semantic technologies, such as the conversion of XML-encoded text to RDF triples, linked via the OpenBiodiv-O ontology (Senderov et al. 2018). In this poster, we show how OpenBiodiv, currently containing more than 729 million statements, can be used to address a specific use case: finding institutions storing type material specimens of the genus Prosopistoma from various literature sources (Fig. 1). This use case is important for various groups of users: institutions, taxonomists, and curators. Answering this complex question is made possible through the application of semantic technologies within OpenBiodiv. Data extraction from taxonomic articles and treatments is enabled by the incorporation of common schemas and standards into the extraction process, whereas the conversion of XML-encoded scholarly literature into Resource Description Framework (RDF) is facilitated by OpenBiodiv-O. The code base for information extraction and data transformation is wrapped in the R packages rdf4r and ropenbio. The ontology makes it possible to model the structure of research articles and treatments, as well as their corresponding metadata. Thus, OpenBiodiv-O is used to represent not only the sections of treatments but also the various entities within them, for instance geographic coordinates and institution codes within the “Type materials” section of a treatment. Institution codes marked up within articles using the Darwin Core standard (Wieczorek et al. 2012) are mapped to GRBio's institution records. 
Institutions which are not present in GRBio can often be extracted from the “Abbreviations” section of a given article, thus utilising the power of semantic publishing workflows to discover information hidden within scholarly literature (Penev et al. 2011, Agosti and Egloff 2009). Institutional codes (abbreviations) are then mapped to the narrative section containing the type materials information. The extraction of coordinates in the taxonomic treatment section makes it possible to establish the location of the collection event through reverse geocoding and enables the selection of treatments linked to a specific geographic region. Modelling of the “Nomenclature” section within OpenBiodiv-O helps to link taxonomic names, mapped to GBIF’s taxonomic backbone, to their type materials, thus facilitating the discovery of materials corresponding to species from a certain higher-rank taxon.
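
    The step of mapping institution codes from an "Abbreviations" section to the type-material narrative can be sketched as follows. This is only an illustration under stated assumptions, not the rdf4r or ropenbio code: the section texts, the institution codes EXM, DEMO and ABC, and the functions parse_abbreviations and link_institutions are all invented for the example, and codes are assumed to be upper-case tokens separated from their names by two or more spaces.

```python
import re

# Invented "Abbreviations" section and type-material text (assumptions for illustration).
ABBREVIATIONS = """
EXM    Example Museum of Natural History
DEMO   Demonstration University Collection
"""
TYPE_MATERIALS = "Holotype: larva, deposited in EXM. Paratypes: 3 larvae (DEMO), 1 larva (ABC)."


def parse_abbreviations(section_text):
    """Map each institution code to its full name (code and name split on 2+ spaces)."""
    codes = {}
    for line in section_text.strip().splitlines():
        parts = re.split(r"\s{2,}", line.strip(), maxsplit=1)
        if len(parts) == 2:
            codes[parts[0]] = parts[1]
    return codes


def link_institutions(type_text, codes):
    """Resolve upper-case codes in the type-material text against the abbreviations."""
    resolved, unresolved = {}, []
    for code in re.findall(r"\b[A-Z]{2,}\b", type_text):
        if code in codes:
            resolved[code] = codes[code]
        elif code not in unresolved:
            unresolved.append(code)
    return resolved, unresolved


resolved, unresolved = link_institutions(TYPE_MATERIALS, parse_abbreviations(ABBREVIATIONS))
print(resolved)
print(unresolved)
```

    Codes left unresolved by the article's own abbreviations are the ones that would need a lookup in an external registry.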