36 research outputs found

    From Text to Knowledge with Graphs: modelling, querying and exploiting textual content

    Full text link
    This paper highlights the challenges, current trends, and open issues in the representation, querying, and analytics of content extracted from texts. The internet contains vast amounts of text-based information on a wide range of subjects, including commercial documents, medical records, scientific experiments, engineering tests, and events that impact urban and natural environments. Extracting knowledge from this text requires understanding the nuances of natural language and representing the content accurately, without losing information, so that knowledge can be accessed, inferred, or discovered. Achieving this requires combining results from several fields, such as linguistics, natural language processing, knowledge representation, data storage, querying, and analytics. The vision of this paper is that graphs, once annotated and paired with the right querying and analytics techniques, are a well-suited representation of text content. The paper discusses this hypothesis from the perspectives of linguistics, natural language processing, graph models and databases, and artificial intelligence, as presented by the panellists of the DOING session at the MADICS Symposium 2022.
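    The graph-based representation of text content that the paper envisions can be sketched as a small triple store queried by pattern. The class, triples, and helper names below are illustrative assumptions, not the paper's actual model.

```python
# A minimal sketch of annotated text content stored as a graph of
# (subject, relation, object) triples, queried by pattern matching.
from collections import defaultdict

class TextGraph:
    """Tiny triple store for knowledge extracted from text."""

    def __init__(self):
        self.triples = set()
        self.by_subject = defaultdict(set)

    def add(self, subject, relation, obj):
        self.triples.add((subject, relation, obj))
        self.by_subject[subject].add((relation, obj))

    def query(self, subject=None, relation=None, obj=None):
        """Match triples; None acts as a wildcard."""
        return [
            t for t in self.triples
            if (subject is None or t[0] == subject)
            and (relation is None or t[1] == relation)
            and (obj is None or t[2] == obj)
        ]

# Triples as they might be extracted from an annotated sentence such as
# "Aspirin treats headaches and is produced by Bayer."
g = TextGraph()
g.add("aspirin", "treats", "headache")
g.add("aspirin", "produced_by", "Bayer")

print(g.query(subject="aspirin", relation="treats"))
# [('aspirin', 'treats', 'headache')]
```

    Once the content is in this form, access, inference, and discovery reduce to graph queries rather than text search.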

    Databases and Information Systems in the AI Era: Contributions from ADBIS, TPDL and EDA 2020 Workshops and Doctoral Consortium

    Get PDF
    Research on database and information technologies has evolved rapidly over the last couple of years. This evolution has been led by three major forces, Big Data, AI, and the Connected World, which open the door to innovative research directions and challenges across four main areas: (i) computational and storage resource modeling and organization; (ii) new programming models; (iii) processing power; and (iv) new applications in domains such as health, environment, education, cultural heritage, and banking. The 24th East-European Conference on Advances in Databases and Information Systems (ADBIS 2020), the 24th International Conference on Theory and Practice of Digital Libraries (TPDL 2020), and the 16th Workshop on Business Intelligence and Big Data (EDA 2020), held during August 25–27, 2020, in Lyon, France, together with their associated satellite events, aimed to cover emerging issues in database and information system research in these areas. This paper presents these events, their motivations, and their topics of interest, and briefly outlines the papers selected for presentation. The selected papers are included in the remainder of this volume.

    Dynamic aspects of XML and specification of web service interfaces with PEWS

    Get PDF
    We address the problem of update semantics and database consistency in different contexts, such as XML documents and web services. Particular difficulties arise when updating a database that must satisfy constraints: data originally consistent with the constraints may become inconsistent after updates. In the first part of this work, we consider updating an XML database while maintaining its consistency with respect to its type (or schema) as well as to its integrity constraints. We approach this problem in several ways. First, we propose an incremental validation procedure with respect to the constraints, which avoids revalidating the parts of the document untouched by the updates. This approach handles both schema constraints and integrity constraints. In this setting, update lists that violate validity are rejected. When incremental validation fails, that is, when an update violates the type, two treatments are proposed: (A) a correction routine is activated to adapt the XML document to the type while taking the update into account; the update thus takes priority over the data already stored. (B) a routine proposes an adaptation of the document type, so as to accept the updated document while preserving the validity of the other documents that were originally valid and not subject to the update. In this case, the update takes priority and the constraints may be modified.
The second part of this work considers the construction of a platform to support the specification, implementation, and manipulation of services. The need to specify and modify service compositions led us to define the PEWS language (Path Expressions for Web Services), which has a well-defined formal semantics and allows the specification of the behaviour of simple and composite web service interfaces. To statically check properties related to service composition, we propose the use of trace theory and dependency graphs.
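    The idea behind incremental validation, rechecking only the part of the document touched by an update, can be illustrated with a toy tree model. The simplified "type" (a map from each element tag to its allowed child tags) and all names below are assumptions for illustration, not the thesis's actual formalism.

```python
# Toy incremental validation: after an update, only the modified node is
# rechecked against the type, instead of revalidating the whole document.
# Updates that violate the type are rejected, as in the thesis's setting.

ALLOWED = {"library": {"book"}, "book": {"title", "author"},
           "title": set(), "author": set()}

class Node:
    def __init__(self, tag, children=None):
        self.tag = tag
        self.children = children or []

def valid_locally(node):
    """Check one node against the type, without descending recursively."""
    return all(child.tag in ALLOWED[node.tag] for child in node.children)

def insert_child(parent, child):
    """Apply the update, then revalidate only the touched node."""
    parent.children.append(child)
    if not valid_locally(parent):
        parent.children.pop()  # reject the violating update list
        return False
    return True

book = Node("book", [Node("title")])
doc = Node("library", [book])

print(insert_child(book, Node("author")))   # True: update accepted
print(insert_child(book, Node("library")))  # False: violates the type, rejected
```

    The thesis's treatments (A) and (B) go further: instead of rejecting, they would either correct the document to fit the type or adapt the type to fit the updated document.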

    DOING : Intelligent Data – from data to knowledge (workshop of ADBIS 2021): Springer Communications in Computer and Information Science 1450

    No full text
    Texts are important sources of information and communication in diverse domains. The intelligent, efficient, and secure use of this information requires, in most cases, transforming unstructured textual data into data sets with some structure, organized according to an appropriate schema that follows the semantics of an application domain. Indeed, solving the problems of modern society requires interdisciplinary research and information cross-referencing, going beyond the simple provision of unstructured data. There is a need for representations that are more flexible, subtle, and context-sensitive, that are easily accessible via consultation tools, and that can evolve according to these principles. In this context, consultation requires robust and efficient query processing, which may involve information analysis, quality assessment, consistency checking, and privacy preservation. Knowledge bases can be built as new-generation infrastructures to support data science queries with a user-friendly framework, providing the machinery required for informed decision-making.
The 2nd Workshop on Intelligent Data - From Data to Knowledge (DOING 2021) focused on transforming data into information and then into knowledge. The workshop gathered researchers from natural language processing (NLP), databases (DB), and artificial intelligence (AI). This edition featured works in two main areas: (1) information extraction from textual data and its representation in knowledge bases, and (2) intelligent methods for handling and maintaining these databases. Overall, the purpose of the workshop was to cover all aspects of modern infrastructures supporting these areas, with particular, but not sole, attention to data in the health and environmental domains.
DOING 2021 received nine submissions, out of which three were accepted as full papers and three as short papers, resulting in an acceptance rate of 50%. Each paper received three reviews from members of the Program Committee. This workshop is an event supported by the French network MADICS. More specifically, it is an event of the DOING action within MADICS and of the DOING working group in the regional network DIAMS.

    DOING : Intelligent Data – From Data to Knowledge WORKSHOP in ADBIS, TPDL & EDA 2020 joint conferences: Springer - Communications in Computer and Information Science 1260 - ADBIS, TPDL and EDA 2020 Common Workshops and Doctoral Consortium

    No full text
    The First Workshop on Intelligent Data - From Data to Knowledge (DOING 2020) focused on transforming data into information and then into knowledge. It gathered researchers from natural language processing (NLP), databases, and AI. DOING 2020 covered all aspects of modern infrastructures supporting these areas, with particular attention, but not limited, to data in the health and environmental domains. The workshop received 17 submissions, out of which 8 were accepted as full papers and 1 as a short paper, resulting in an acceptance rate of 50%. The workshop program also featured an invited keynote talk by Professor Marie-Christine Rousset, from the Laboratoire d'Informatique de Grenoble (LIG), France.

    Consistent updating of databases with marked nulls

    No full text

    Querying Semantic Graph Databases in View of Constraints and Provenance

    No full text
    This paper focuses on query semantics over a graph database with constraints defined on a global schema and confidence scores assigned to the local data sources that compose a distributed graph instance. Our query environment is customizable and user-friendly: it is configured on the basis of data source confidence and the user's quality restrictions. These constraints apply to queries, not to sources. Inconsistent data is filtered out, allowing valid results to be produced even when data quality cannot be entirely ensured by the sources. We elaborate on the process of validating query answers against positive, negative, and key constraints. Our validator can interact with diverse lower-level query evaluation methods.
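    The two-stage idea, filter answers by the confidence of their source, then validate the survivors against constraints, can be sketched as follows. The data layout, thresholds, and function names are assumptions for illustration, not the paper's API; only a key constraint is shown, standing in for the positive/negative/key trio.

```python
# Illustrative sketch: filter query answers by source confidence, then
# enforce a key constraint (no two answers may share the same key value).

def filter_by_confidence(answers, source_confidence, threshold):
    """Keep only answers whose source meets the user's quality restriction."""
    return [a for a in answers if source_confidence[a["source"]] >= threshold]

def check_key(answers, key):
    """Drop answers that conflict on the key, keeping the first seen."""
    seen = set()
    valid = []
    for a in answers:
        if a[key] in seen:
            continue  # conflicting answer filtered out as inconsistent
        seen.add(a[key])
        valid.append(a)
    return valid

confidence = {"srcA": 0.9, "srcB": 0.4}
answers = [
    {"person": "ada", "city": "London", "source": "srcA"},
    {"person": "ada", "city": "Paris", "source": "srcB"},  # low-confidence conflict
]

trusted = filter_by_confidence(answers, confidence, threshold=0.5)
result = check_key(trusted, key="person")
print(result)  # [{'person': 'ada', 'city': 'London', 'source': 'srcA'}]
```

    Note how the constraint is applied to the query's answers, not imposed on the sources themselves, which matches the paper's stance that valid results can be produced even from sources of uncertain quality.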

    Urban Graph Analysis on a Context-driven Querying System

    No full text
    Urban computing aims to provide guidance for solving problems in big cities, such as pollution, energy consumption, and human mobility. These are, however, difficult problems, with several variables interacting in complex patterns. Such complex systems are often represented as graphs, taking advantage of the flexibility of the model and the availability of network science tools for analysis. In this context, expressive query languages capable of tackling graph analysis problems in a customized context are essential. This paper presents a context-driven query system for urban computing in which users define their own restrictions, over which Datalog-like queries are built. Instead of imposing constraints on databases, our goal is to filter consistent data during query processing. Our query language can express aggregates in recursive rules, allowing it to capture network properties typical of graph analysis. This paper presents our query system and analyzes its capabilities through use cases in urban computing.
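    A classic example of an aggregate inside a recursive rule is shortest-path distance, written in Datalog style roughly as dist(X,Y,min<D>) :- edge(X,Y,D) and dist(X,Z,min<D1+D2>) :- dist(X,Y,D1), dist(Y,Z,D2). The Python fixpoint loop below is an illustrative stand-in for such an evaluation, not the paper's query engine; the toy mobility network is invented.

```python
# Sketch of evaluating a recursive rule with a min aggregate over a graph:
# iterate to a fixpoint, tightening distances until nothing changes.

def shortest_paths(edges):
    """Fixpoint evaluation of recursive min-aggregation over weighted edges."""
    dist = {(x, y): d for x, y, d in edges}
    changed = True
    while changed:
        changed = False
        for (x, y), d1 in list(dist.items()):
            for (a, z), d2 in list(dist.items()):
                if a == y and d1 + d2 < dist.get((x, z), float("inf")):
                    dist[(x, z)] = d1 + d2  # min aggregate: keep the best
                    changed = True
    return dist

# Toy mobility network: travel times between city zones.
edges = [("A", "B", 2), ("B", "C", 3), ("A", "C", 10)]
print(shortest_paths(edges)[("A", "C")])  # 5, via B, beating the direct edge
```

    This is exactly the kind of network property (reachability with an aggregate over paths) that plain non-recursive query languages cannot express, motivating aggregates in recursive rules.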