
    Computational and human-based methods for knowledge discovery over knowledge graphs

    The modern world has evolved alongside the large-scale exploitation of data and information. Every day, growing volumes of data from various sources and in various formats are stored, which makes managing and integrating them to discover new knowledge a challenge. The appropriate use of data in sectors of society such as education, healthcare, e-commerce, and industry supports decision-making in these areas. However, knowledge discovery becomes challenging because data may come from heterogeneous sources in which important information is hidden. Thus, new approaches that adapt to the challenges of knowledge discovery in such heterogeneous data environments are required. The semantic web and knowledge graphs (KGs) are becoming increasingly relevant on the road to knowledge discovery. This thesis tackles the problem of knowledge discovery over KGs built from heterogeneous data sources. We provide a neuro-symbolic artificial intelligence system that integrates symbolic and sub-symbolic frameworks to exploit the semantics encoded in a KG and its structure. The symbolic system relies on existing deductive database approaches to make explicit the implicit knowledge encoded in a KG. The proposed deductive database DSDS can derive new statements for ego networks given an abstract target prediction. Thus, DSDS minimizes data sparsity in KGs. In addition, a sub-symbolic system relies on knowledge graph embedding (KGE) models. KGE models are commonly applied in the KG completion task to represent entities in a KG in a low-dimensional vector space. However, KGE models are known to suffer from data sparsity, and a symbolic system helps to overcome this limitation. The proposed approach discovers knowledge given a target prediction in a KG and extracts unknown implicit information related to the target prediction. 
As a proof of concept, we have implemented the neuro-symbolic system on top of a KG for lung cancer to predict polypharmacy treatment effectiveness. The symbolic system implements a deductive system that deduces pharmacokinetic drug-drug interactions from a set of rules encoded as a Datalog program. Additionally, the sub-symbolic system predicts treatment effectiveness using a KGE model, which preserves the KG structure. An ablation study on the components of our approach is conducted, considering state-of-the-art KGE methods. The observed results provide evidence for the benefits of the neuro-symbolic integration: for an abstract target prediction, the neuro-symbolic system exhibits improved results. The improvement occurs because the symbolic system increases the prediction capacity of the sub-symbolic system. Moreover, the proposed neuro-symbolic artificial intelligence system is evaluated in the Industry 4.0 (I4.0) domain, demonstrating its effectiveness in determining relatedness among standards and analyzing their properties to detect unknown relations in the I4.0KG. The results achieved allow us to conclude that the proposed neuro-symbolic approach for an abstract target prediction improves the prediction capability of KGE models by minimizing data sparsity in KGs.
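
The deductive step described in this abstract can be illustrated with a minimal sketch in Python rather than Datalog. Everything in it is an assumption made for illustration: the predicate names (`inhibits`, `metabolizedBy`, `interactsWith`), the placeholder entities, and the single rule are hypothetical and are not taken from the thesis or from DSDS. The sketch only shows how a rule applied to a fixpoint makes implicit statements explicit in a set of KG triples.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def derive_interactions(facts: Set[Triple]) -> Set[Triple]:
    """Apply one hypothetical Datalog-style rule until no new facts appear.

    Rule (illustrative only, not from the thesis):
        interactsWith(A, B) :- inhibits(A, E), metabolizedBy(B, E), A != B.
    """
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        inhibits = [(s, o) for (s, p, o) in derived if p == "inhibits"]
        metabolized = [(s, o) for (s, p, o) in derived if p == "metabolizedBy"]
        for drug_a, enzyme_a in inhibits:
            for drug_b, enzyme_b in metabolized:
                if enzyme_a == enzyme_b and drug_a != drug_b:
                    new_fact = (drug_a, "interactsWith", drug_b)
                    if new_fact not in derived:
                        derived.add(new_fact)
                        changed = True
    return derived


# Placeholder facts; the entity names are hypothetical.
facts = {
    ("drugA", "inhibits", "enzymeX"),
    ("drugB", "metabolizedBy", "enzymeX"),
    ("drugC", "metabolizedBy", "enzymeY"),
}
closure = derive_interactions(facts)
new_facts = closure - facts
print(new_facts)
```

In a neuro-symbolic pipeline of the kind the abstract outlines, the enlarged triple set would then be handed to a KGE model, so the embedding step sees a denser graph than the original one.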

    Knowledge extraction from unstructured data

    Data availability is becoming more essential given the current growth of web-based data. The data available on the web are represented as unstructured, semi-structured, or structured data. To make web-based data available for Natural Language Processing or Data Mining tasks, the data need to be presented in a structured, machine-readable format. Thus, techniques for capturing knowledge from unstructured data sources are needed. Research communities address this problem with knowledge extraction methods, which capture knowledge in natural language text and map the extracted knowledge to existing knowledge represented in knowledge graphs (KGs). These knowledge extraction methods include Named-Entity Recognition, Named-Entity Disambiguation, Relation Recognition, and Relation Linking. This thesis addresses the problem of extracting knowledge from unstructured data and discovering patterns in the extracted knowledge. We devise a rule-based approach for entity and relation recognition and linking. The approach effectively maps entities and relations within a text to their resources in a target KG. Additionally, it overcomes the challenges of recognizing and linking entities and relations to a specific KG by employing catalogs of linguistic and domain-specific rules, which state the criteria for recognizing entities in a sentence of a particular language, and a deductive database that encodes knowledge in community-maintained KGs. Moreover, we define a neuro-symbolic approach for knowledge extraction in encyclopedic and specialized domains; it combines symbolic and sub-symbolic components to overcome the challenges of entity recognition and linking and the limited availability of training data while maintaining the accuracy of recognizing and linking entities. 
Additionally, we present a context-aware framework for unveiling semantically related posts in a corpus; it is a knowledge-driven framework that retrieves associated posts effectively. We cast the problem of unveiling semantically related posts in a corpus as the Vertex Coloring Problem. We evaluate the performance of our techniques on several benchmarks from various domains for knowledge extraction tasks. Furthermore, we apply these methods in real-world scenarios from national and international projects. The outcomes show that our techniques effectively extract knowledge encoded in unstructured data and discover patterns over the extracted knowledge presented as machine-readable data. More importantly, the evaluation results provide evidence of the effectiveness of combining the reasoning capacity of symbolic frameworks with the pattern recognition and classification power of sub-symbolic models.
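
The abstract does not say how posts are mapped onto a graph, so the following is only a sketch under a stated assumption: posts are vertices, and an edge joins two posts that are judged NOT to be related, so that posts sharing a color form a candidate group of related posts. The graph, the post identifiers, and the greedy heuristic are all hypothetical. Greedy coloring yields a valid coloring but not necessarily one with the fewest colors.

```python
from typing import Dict, Set


def greedy_coloring(graph: Dict[str, Set[str]]) -> Dict[str, int]:
    """Assign each vertex the smallest color not used by its neighbors."""
    colors: Dict[str, int] = {}
    # Visit vertices by decreasing degree; ties are broken by name so the
    # result is deterministic.
    for vertex in sorted(graph, key=lambda v: (-len(graph[v]), v)):
        used = {colors[n] for n in graph[vertex] if n in colors}
        color = 0
        while color in used:
            color += 1
        colors[vertex] = color
    return colors


# Hypothetical dissimilarity graph over four posts (assumption: an edge
# means the two posts are unrelated).
graph = {
    "p1": {"p3"},
    "p2": {"p3"},
    "p3": {"p1", "p2", "p4"},
    "p4": {"p3"},
}
colors = greedy_coloring(graph)

# Group posts by color; each group contains no pair joined by an edge.
groups: Dict[int, Set[str]] = {}
for post, color in colors.items():
    groups.setdefault(color, set()).add(post)
print(groups)
```

Under this assumption each color class is a set of posts with no pairwise "unrelated" edge, which is one way to read the coloring as a grouping of semantically related posts.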

    Trust, Accountability, and Autonomy in Knowledge Graph-based AI for Self-determination

    Knowledge Graphs (KGs) have emerged as fundamental platforms for powering intelligent decision-making and a wide range of Artificial Intelligence (AI) services across major corporations such as Google, Walmart, and Airbnb. KGs complement Machine Learning (ML) algorithms by providing data context and semantics, thereby enabling further inference and question-answering capabilities. The integration of KGs with neuronal learning (e.g., Large Language Models (LLMs)) is currently a topic of active research, commonly named neuro-symbolic AI. Despite the numerous benefits that can be accomplished with KG-based AI, its growing ubiquity within online services may result in the loss of self-determination for citizens, a fundamental societal issue. The more we rely on these technologies, which are often centralised, the less citizens will be able to determine their own destinies. To counter this threat, AI regulation, such as the European Union (EU) AI Act, is being proposed in certain regions. The regulation sets out what technologists need to do, leading to questions such as: How can the output of AI systems be trusted? What is needed to ensure that the data fuelling these artefacts and their inner workings are transparent? How can AI be made accountable for its decision-making? This paper conceptualises the foundational topics and research pillars to support KG-based AI for self-determination. Drawing upon this conceptual framework, challenges and opportunities for citizen self-determination are illustrated and analysed in a real-world scenario. As a result, we propose a research agenda aimed at accomplishing the recommended objectives.

    Building Blocks for IoT Analytics: Internet-of-Things Analytics

    Internet-of-Things (IoT) analytics is an integral element of most IoT applications, as it provides the means to extract knowledge, drive actuation services, and optimize decision making. IoT analytics will be a major contributor to IoT business value in the coming years, as it will enable organizations to process and fully leverage large amounts of IoT data, which are nowadays largely underutilized. The Building Blocks of IoT Analytics is devoted to presenting the main technology building blocks that comprise advanced IoT analytics systems. It introduces IoT analytics as a special case of BigData analytics and accordingly presents leading-edge technologies that can be deployed to confront the main challenges of IoT analytics applications. Special emphasis is placed on technologies for IoT streaming and semantic interoperability across diverse IoT streams. Furthermore, the role of cloud computing and BigData technologies in IoT analytics is presented, along with practical tools for implementing, deploying, and operating non-trivial IoT applications. Along with the main building blocks of IoT analytics systems and applications, the book presents a series of practical applications that illustrate the use of these technologies in pragmatic settings. Technical topics discussed in the book include: Cloud Computing and BigData for IoT analytics; Searching the Internet of Things; Development Tools for IoT Analytics Applications; IoT Analytics-as-a-Service; Semantic Modelling and Reasoning for IoT Analytics; IoT analytics for Smart Buildings; IoT analytics for Smart Cities; Operationalization of IoT analytics; Ethical aspects of IoT analytics. This book contains both research-oriented and applied articles on IoT analytics, including several articles reflecting work undertaken in recent European Commission funded projects under the FP7 and H2020 programmes. 
These articles present results of these projects on IoT analytics platforms and applications. Even though the articles have been contributed by different authors, they are arranged in a well-thought-out order that allows readers either to follow the progression of the book or to focus on specific topics, depending on their background and interest in IoT and IoT analytics technologies. The compilation of these articles in this edited volume has been largely motivated by the close collaboration of the co-authors in working groups and IoT events organized by the Internet-of-Things Research Cluster (IERC), which is currently a part of the EU's Alliance for Internet of Things Innovation (AIOTI).

    Privacy-Preserving Ontology Publishing: The Case of Quantified ABoxes w.r.t. a Static Cycle-Restricted EL TBox: Extended Version

    We review our recent work on how to compute optimal repairs, optimal compliant anonymizations, and optimal safe anonymizations of ABoxes containing possibly anonymized individuals. The results can be used both to remove erroneous consequences from a knowledge base and to hide secret information before publication of the knowledge base, while keeping as much as possible of the original information. Updated on August 27, 2021. This is an extended version of an article accepted at DL 2021.

    Desarrollo de un modelo ontológico para la integración de datos de una unidad de climatización en un edificio inteligente

    The growth of IoT technologies applied to all kinds of everyday situations and processes has increased the amount of data we generate and store. Grouping sensors into structures that can be managed efficiently, such as a smart building (Smart Building), and bringing this development into government agendas to create so-called smart cities (Smart Cities) has involved many public and private organizations. This surge of data has produced multiple paradigms and formats that depend on each project's promoter. At the same time, the current energy paradigm and the climate crisis have made energy efficiency and the reduction of CO2 emissions increasingly important. Official bodies such as the European Union have created committees and standards to address the problem. Energy analysis of a building's consumption shows that the largest share goes to air conditioning, in some cases exceeding 40% of the total. Poor management of the multiple systems involved generates energy costs and higher CO2 emissions, so the data a Smart Building can provide are of great interest for studies on efficient management. The aim of this Final Degree Project is to assess the adaptation of an IoT data processing environment based on tabular data to information processing technologies based on ontologies. Ontologies guarantee independence between the data and their format (syntactic representation) through a common semantics shared with a wide community. This allows easy compatibility and integration with external platforms or programs, which makes it possible, among other things, to interact with new elements outside the initial domain. 
Another advantage is that this technology can obtain new knowledge by applying inference over the data, which extends the system's capabilities compared with traditional processing of tabular data. The data source taken as the reference for the adaptation process belongs to AHU101 (Air Conditioning Unit No. 101) of the Alice Perry Engineering Building in Galway, Ireland. The ISO SAREF standard (the Smart Applications REFerence ontology), its variant for Smart Building (SAREF4BLDN), and other related standards such as OWL-Time will be presented and analyzed. The data will be modeled and described using the OWL standard (OWL2-DL variant), and their validity will be checked for running queries and basic descriptive statistics through graphical environments.
Departamento de Informática (Arquitectura y Tecnología de Computadores, Ciencias de la Computación e Inteligencia Artificial, Lenguajes y Sistemas Informáticos). Grado en Ingeniería Informática

    Automated Reasoning

    This volume, LNAI 13385, constitutes the refereed proceedings of the 11th International Joint Conference on Automated Reasoning, IJCAR 2022, held in Haifa, Israel, in August 2022. The 32 full research papers and 9 short papers presented together with two invited talks were carefully reviewed and selected from 85 submissions. The papers focus on the following topics: Satisfiability, SMT Solving, Arithmetic; Calculi and Orderings; Knowledge Representation and Justification; Choices, Invariance, Substitutions and Formalization; Modal Logics; Proof Systems and Proof Search; Evolution, Termination and Decision Problems. This is an open access book.