49 research outputs found

    EARL: Joint Entity and Relation Linking for Question Answering over Knowledge Graphs

    Many question answering systems over knowledge graphs rely on entity and relation linking components to connect the natural language input to the underlying knowledge graph. Traditionally, entity linking and relation linking have been performed either as dependent sequential tasks or as independent parallel tasks. In this paper, we propose a framework called EARL, which performs entity linking and relation linking as a joint task. EARL implements two different solution strategies, for which we provide a comparative analysis. The first strategy formalises the joint entity and relation linking task as an instance of the Generalised Travelling Salesman Problem (GTSP). To keep this computationally feasible, we employ approximate GTSP solvers. The second strategy uses machine learning to exploit the connection density between nodes in the knowledge graph. It relies on three base features and re-ranking steps to predict entities and relations. We compare the strategies and evaluate them on a dataset with 5000 questions. Both strategies significantly outperform the current state-of-the-art approaches for entity and relation linking. Comment: International Semantic Web Conference 201
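
The GTSP formulation mentioned in the abstract above can be illustrated with a toy example. This is only a sketch under stated assumptions, not EARL's implementation: the cluster names, the candidate identifiers and the hop distances are all hypothetical, and the solver is exhaustive search, whereas the abstract says EARL uses approximate GTSP solvers. It shows the standard GTSP objective: choose exactly one node from each cluster so that the closed tour through the chosen nodes has minimum cost.

```python
from itertools import permutations, product

# Hypothetical clusters: one per keyword in a question, each holding
# candidate knowledge-graph nodes for that keyword.
clusters = {
    "entity": ["e1", "e2"],
    "relation": ["r1", "r2"],
    "type": ["t1"],
}

# Hypothetical symmetric hop distances between candidates of different clusters.
dist = {
    frozenset(("e1", "r1")): 1, frozenset(("e1", "r2")): 3,
    frozenset(("e2", "r1")): 2, frozenset(("e2", "r2")): 2,
    frozenset(("e1", "t1")): 1, frozenset(("e2", "t1")): 3,
    frozenset(("r1", "t1")): 1, frozenset(("r2", "t1")): 2,
}


def tour_cost(tour):
    """Cost of the closed tour that visits the nodes in the given order."""
    n = len(tour)
    return sum(dist[frozenset((tour[i], tour[(i + 1) % n]))] for i in range(n))


def solve_gtsp(clusters):
    """Exhaustive GTSP: try every one-node-per-cluster choice and every order."""
    best = None
    for choice in product(*clusters.values()):
        first, rest = choice[0], choice[1:]
        # Fixing the first node avoids re-checking rotations of the same tour.
        for perm in permutations(rest):
            tour = (first,) + perm
            cost = tour_cost(tour)
            if best is None or cost < best[0]:
                best = (cost, tour)
    return best


best_cost, best_tour = solve_gtsp(clusters)
print(best_cost, best_tour)
```

Exhaustive search grows exponentially with the number of clusters and candidates, which is consistent with the abstract's remark that approximate solvers are needed for feasibility.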

    Towards an interoperable ecosystem of AI and LT platforms : a roadmap for the implementation of different levels of interoperability

    With regard to the wider area of AI/LT platform interoperability, we concentrate on two core aspects: (1) cross-platform search and discovery of resources and services; (2) composition of cross-platform service workflows. We devise five levels of platform interoperability, of increasing complexity, that we suggest implementing in a wider federation of AI/LT platforms. We illustrate the approach using the five emerging AI/LT platforms AI4EU, ELG, Lynx, QURATOR and SPEAKER.

    Inverse modeling of particulate organic carbon fluxes in the South Atlantic

    The biological production of particulate material near the ocean surface and the subsequent remineralization during sinking and after deposition on the seafloor strongly affect the distributions of oxygen, dissolved nutrients and carbon in the ocean. Dissolved nutrient distributions therefore reveal the underlying biogeochemical processes, and these data can be used to determine production, remineralization and accumulation rates using inverse techniques. Here, an ocean circulation and biogeochemical model that exploits the existing large sets of hydrographic, oxygen, nutrient and carbon data is presented, together with results for the export production of particulate organic matter, vertical fluxes in the water column and sedimentation rates. In the model, the integrated export flux of particulate organic carbon (POC) for the South Atlantic amounts to about 1300 Tg C yr^-1 (equivalent to 1.3 Gt C yr^-1), most of which occurs in the Benguela/Namibia upwelling region and in a zonal band following the course of the Antarctic Circumpolar Current (ACC). Remineralization of POC in the upper water column is intense, and only about 7% of the export reaches a depth of 2000 m. Comparison of model particle fluxes with sediment trap data suggests that shallow traps tend to underestimate the downward flux, whereas the deep traps seem to be affected by lateral input of material and apparently overestimate the vertical flux. These findings are consistent with recent radionuclide studies. The rapid degradation of POC with depth leads to geographical patterns of POC fluxes to the seafloor and POC accumulation in the sediment that are very different from the pattern of surface productivity, because of modulation by varying bottom depth. Whereas there is significant surface production in deep-water, open-ocean regions, the benthic fluxes occur predominantly in coastal and shelf areas.
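
The flux figures quoted in the abstract above can be checked with a few lines of arithmetic. The sketch below uses only the two numbers stated there (about 1300 Tg C per year exported, about 7% reaching 2000 m); the variable names are illustrative, and the derived values are rough because the inputs are themselves approximate.

```python
# Unit conversion: 1 Gt = 1000 Tg.
TG_PER_GT = 1000.0

# Values quoted in the abstract (both approximate).
export_flux_tg = 1300.0      # integrated POC export, Tg C per year
fraction_at_2000m = 0.07     # share of the export that reaches 2000 m

export_flux_gt = export_flux_tg / TG_PER_GT
flux_at_2000m_tg = export_flux_tg * fraction_at_2000m
# Share of the export that does not reach 2000 m.
lost_above_2000m_tg = export_flux_tg - flux_at_2000m_tg

print(f"export: {export_flux_gt:.1f} Gt C/yr")
print(f"reaching 2000 m: about {flux_at_2000m_tg:.0f} Tg C/yr")
print(f"not reaching 2000 m: about {lost_above_2000m_tg:.0f} Tg C/yr")
```

Under these figures, roughly 91 Tg C per year would reach 2000 m, which illustrates how strongly the flux is attenuated in the upper water column.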

    Particle fluxes in the ocean: Comparison of sediment trap data with results from inverse modeling

    Biological production lowers the CO2 concentrations in the surface layer of the ocean, and sinking detritus "pumps" nutrients and CO2 into the deep ocean. Quantifying the efficiency of the biological pump is a prerequisite for global CO2 budgets. Sediment traps are commonly used to measure the vertical particle flux directly; however, for logistical and financial reasons traps cannot provide area-wide data sets. Moreover, it has been shown that sediment traps can under- or overestimate particle fluxes considerably. In this paper we present a new technique to estimate the downward flux of particulate matter with an adjoint model. Hydrographic and nutrient data are used to calculate the mean ocean circulation together with parameters for particle fluxes using the AWI Adjoint Model for Oceanic Carbon Cycling (AAMOCC). The model is fitted to the property concentrations by systematically varying circulation, air-sea fluxes, export production and remineralization rates of particulate biogenic matter simultaneously. The deviations of model fluxes based on nutrient budgets from direct measurements with sediment traps yield an independent estimate of apparent trapping efficiencies. While consistent with hydrographic and nutrient data, model particle fluxes rarely agree with sediment trap data: (1) at shallow water depth (< 1000 m), sediment trap fluxes are on average 50% lower than model fluxes, which confirms flux calibrations using radionuclides; (2) in the very deep traps, model fluxes tend to be lower than the data, which might be explained by lateral inputs into the traps. According to these model results, particle fluxes from the euphotic zone into mid water depth are considerably higher, and the shallow loop of nutrients is more vigorous, than would be derived from sediment trap data. Our results imply that fluxes as collected with sediment traps are inconsistent with model-derived long-term mean particle fluxes based on nutrient budgets in the water column. In agreement with recent radionuclide studies we conclude that reliable export flux estimates can only be obtained from sediment trap data if appropriate corrections are applied.
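
The notion of an apparent trapping efficiency in the abstract above can be made concrete with a small sketch. The abstract does not give a formula, so the definition used here (trap flux divided by model flux) is an assumption, and the function name and the flux values are hypothetical.

```python
def apparent_trapping_efficiency(trap_flux, model_flux):
    """Ratio of measured trap flux to model flux (assumed definition).

    Values below 1 mean the trap collects less than the model predicts,
    values above 1 mean it collects more.
    """
    if model_flux <= 0:
        raise ValueError("model_flux must be positive")
    return trap_flux / model_flux


# Hypothetical fluxes in arbitrary but identical units.
# A shallow trap that collects 50% less than the model flux:
shallow = apparent_trapping_efficiency(trap_flux=5.0, model_flux=10.0)
# A deep trap that collects more than the model flux, e.g. due to lateral input:
deep = apparent_trapping_efficiency(trap_flux=3.0, model_flux=2.0)

print(shallow, deep)
```

Under this assumed definition, the shallow-trap finding in the abstract (trap fluxes on average 50% lower than model fluxes) corresponds to an efficiency of about 0.5, and the deep-trap finding corresponds to values above 1.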

    8th challenge on question answering over linked data (QALD-8)

    The QALD-8 challenge focused on the successful and long-running multilingual QA task. For the first time, the participating teams were required to provide web services of their systems to participate in the challenge, which will in turn support comparable research in the future. In this challenge, we also changed the underlying evaluation platform to account for the need for comparable experiments via web services, in contrast to former XML/JSON file submissions. This increased the entrance requirements for participating teams but ensures long-term comparability of system performance and a fair and open challenge. In the future, we will further simplify the participation process and offer leaderboards prior to the actual challenge to allow participants to see their performance beforehand. After feedback from the authors, we will likely add new key performance indicators for the capability of a system to know which questions it cannot answer, and take confidence scores for answers into account. Moreover, we will remove most of the curve-ball questions to reflect the original character of the QALD challenge, which provides a clean and linguistically challenging benchmark.

    FuhSen: A federated hybrid search engine for building a knowledge graph on-demand

    A vast amount of information about various types of entities is spread across the Web, e.g., people or organizations on the Social Web, product offers on the Deep Web or on the Dark Web. These data sources can comprise heterogeneous data and are equipped with different search capabilities, e.g., a Search API. End users such as investigators from law enforcement institutions searching for traces and connections of organized crime have to deal with these interoperability problems not only during search time but also while merging data collected from different sources. We devise FuhSen, a keyword-based federated engine that exploits the search capabilities of heterogeneous sources during query processing and generates knowledge graphs on demand, applying an RDF-Molecule integration approach in response to keyword-based queries. The resulting knowledge graph describes the semantics of entities collected from the integrated sources, as well as relationships among these entities. Furthermore, FuhSen utilizes ontologies to describe the available sources in terms of content and search capabilities and exploits this knowledge to select the sources relevant for answering a keyword-based query. We conducted a user evaluation in which FuhSen is compared to traditional search engines. FuhSen's semantic search capabilities allowed users to complete search tasks that could not be accomplished with traditional Web search engines during the evaluation study.

    GSP (Geo-Semantic-Parsing): Geoparsing and Geotagging with Machine Learning on Top of Linked Data

    Recently, user-generated content in social media opened up new alluring possibilities for understanding the geospatial aspects of many real-world phenomena. Yet, the vast majority of such content lacks explicit, structured geographic information. Here, we describe the design and implementation of a novel approach for associating geographic information with text documents. GSP exploits powerful machine learning algorithms on top of the rich, interconnected Linked Data in order to overcome limitations of previous state-of-the-art approaches. In detail, our technique performs semantic annotation to identify relevant tokens in the input document, traverses a sub-graph of Linked Data to extract possible geographic information related to the identified tokens, and optimizes its results by means of a Support Vector Machine classifier. We compare our results with those of 4 state-of-the-art techniques and baselines on ground-truth data from 2 evaluation datasets. Our GSP technique achieves excellent performance, with the best F1 = 0.91, markedly outperforming benchmark techniques that achieve F1 ≤ 0.78.
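
For readers unfamiliar with the F1 measure reported in the abstract above, the sketch below shows the standard definition as the harmonic mean of precision and recall. The counts in the example are hypothetical and are not taken from the GSP evaluation; they are chosen only so that the result lands on a round value.

```python
def f1_score(true_pos, false_pos, false_neg):
    """Standard F1: harmonic mean of precision and recall."""
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 91 correct geotags, 9 spurious, 9 missed.
example_f1 = f1_score(true_pos=91, false_pos=9, false_neg=9)
print(round(example_f1, 2))
```

Because F1 is a harmonic mean, it is pulled towards the lower of precision and recall, so a high score requires both to be high.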