57 research outputs found

    Named Entity Resolution in Personal Knowledge Graphs

    Full text link
    Entity Resolution (ER) is the problem of determining when two entities refer to the same underlying entity. The problem has been studied for over 50 years, and most recently, has taken on new importance in an era of large, heterogeneous 'knowledge graphs' published on the Web and used widely in domains as wide ranging as social media, e-commerce and search. This chapter will discuss the specific problem of named ER in the context of personal knowledge graphs (PKGs). We begin with a formal definition of the problem, and the components necessary for doing high-quality and efficient ER. We also discuss some challenges that are expected to arise for Web-scale data. Next, we provide a brief literature review, with a special focus on how existing techniques can potentially apply to PKGs. We conclude the chapter by covering some applications, as well as promising directions for future research.Comment: To appear as a book chapter by the same name in an upcoming (Oct. 2023) book `Personal Knowledge Graphs (PKGs): Methodology, tools and applications' edited by Tiwari et a

    Explainable Artificial Intelligence (XAI) 2.0: A Manifesto of Open Challenges and Interdisciplinary Research Directions

    Full text link
    As systems based on opaque Artificial Intelligence (AI) continue to flourish in diverse real-world applications, understanding these black box models has become paramount. In response, Explainable AI (XAI) has emerged as a field of research with practical and ethical benefits across various domains. This paper not only highlights the advancements in XAI and its application in real-world scenarios but also addresses the ongoing challenges within XAI, emphasizing the need for broader perspectives and collaborative efforts. We bring together experts from diverse fields to identify open problems, striving to synchronize research agendas and accelerate XAI in practical applications. By fostering collaborative discussion and interdisciplinary cooperation, we aim to propel XAI forward, contributing to its continued success. Our goal is to put forward a comprehensive proposal for advancing XAI. To achieve this goal, we present a manifesto of 27 open problems categorized into nine categories. These challenges encapsulate the complexities and nuances of XAI and offer a road map for future research. For each problem, we provide promising research directions in the hope of harnessing the collective intelligence of interested stakeholders

    Heterogeneous data to knowledge graphs matching

    Get PDF
    Many applications rely on the existence of reusable data. The FAIR (Findability, Accessibility, Interoperability, and Reusability) principles identify detailed descriptions of data and metadata as the core ingredients for achieving reusability. However, creating descriptive data requires massive manual effort. One way to ensure that data is reusable is by integrating it into Knowledge Graphs (KGs). The semantic foundation of these graphs provides the necessary description for reuse. In the Open Research KG, they propose to model artifacts of scientific endeavors, including publications and their key messages. Datasets supporting these publications are essential carriers of scientific knowledge and should be included in KGs. We focus on biodiversity research as an example domain to develop and evaluate our approach. Biodiversity is the assortment of life on earth covering evolutionary, ecological, biological, and social forms. Understanding such a domain and its mechanisms is essential to preserving this vital foundation of human well-being. It is imperative to monitor the current state of biodiversity and its change over time and to understand its forces driving and preserving life in all its variety and richness. This need has resulted in numerous works being published in this field. For example, a large amount of tabular data (datasets), textual data (publications), and metadata (e.g., dataset description) have been generated. So, it is a data-rich domain with an exceptionally high need for data reuse. Managing and integrating these heterogeneous data of biodiversity research remains a big challenge. Our core research problem is how to enable the reusability of tabular data, which is one aspect of the FAIR data principles. In this thesis, we provide answer for this research problem

    Sicily and Crete between Byzantium and the Dar Al-Islam (Late 7th - Mid 10th Century): an archaeological contribution

    Get PDF
    Focusing on the Byzantine-Islamic transitions of Sicily and Crete, the aim of this thesis is to contribute to the archaeological debate surrounding the development of both islands between the late 7th and mid-10th century. Material sources have been the primary means of investigation, drawing especially on ceramic evidence, selected small finds, especially coins and lead seals, and relevant examples of built environment, which have encompassed a range of domestic, military, and religious contexts. Original arguments and data-collection have been produced through both revaluating the findings and conclusions of current secondary literature, and by drawing on first-hand studies of unpublished material sources and evidence documented during archive-based studies and field observations. Changing patterns in material culture and settlement organisation, and the modes of administrative and economic interactions between incoming Muslim rulers and pre-existing Byzantine communities inhabiting both islands have been the main fields of enquiry. Although taking a regional perspective, this thesis has been based on key case-study sites, of which Knossos and Heraklion, and Enna and its hinterland, are the principal ones. The Byzantine-Islamic transition of Sicily and Crete might appear as a peripheral topic to the eyes of scholars working in core territories of the Byzantine and Islamic empires. When considered within their actual geographical and cultural contexts, however, both islands stand at the virtual and spatial centre of the military and ideological confrontation between Byzantium and the Dar al-Islam. Placed at the crossroads between Constantinople, Cordoba, Mecca, and Baghdad, both islands acted as sociocultural and economic lynchpins between the western and eastern halves of the Mediterranean world, but also as maritime frontier-lands located at the fringe between the worlds of Islam and Byzantium

    Data Storage and Dissemination in Pervasive Edge Computing Environments

    Get PDF
    Nowadays, smart mobile devices generate huge amounts of data in all sorts of gatherings. Much of that data has localized and ephemeral interest, but can be of great use if shared among co-located devices. However, mobile devices often experience poor connectivity, leading to availability issues if application storage and logic are fully delegated to a remote cloud infrastructure. In turn, the edge computing paradigm pushes computations and storage beyond the data center, closer to end-user devices where data is generated and consumed. Hence, enabling the execution of certain components of edge-enabled systems directly and cooperatively on edge devices. This thesis focuses on the design and evaluation of resilient and efficient data storage and dissemination solutions for pervasive edge computing environments, operating with or without access to the network infrastructure. In line with this dichotomy, our goal can be divided into two specific scenarios. The first one is related to the absence of network infrastructure and the provision of a transient data storage and dissemination system for networks of co-located mobile devices. The second one relates with the existence of network infrastructure access and the corresponding edge computing capabilities. First, the thesis presents time-aware reactive storage (TARS), a reactive data storage and dissemination model with intrinsic time-awareness, that exploits synergies between the storage substrate and the publish/subscribe paradigm, and allows queries within a specific time scope. Next, it describes in more detail: i) Thyme, a data storage and dis- semination system for wireless edge environments, implementing TARS; ii) Parsley, a flexible and resilient group-based distributed hash table with preemptive peer relocation and a dynamic data sharding mechanism; and iii) Thyme GardenBed, a framework for data storage and dissemination across multi-region edge networks, that makes use of both device-to-device and edge interactions. The developed solutions present low overheads, while providing adequate response times for interactive usage and low energy consumption, proving to be practical in a variety of situations. They also display good load balancing and fault tolerance properties.Resumo Hoje em dia, os dispositivos móveis inteligentes geram grandes quantidades de dados em todos os tipos de aglomerações de pessoas. Muitos desses dados têm interesse loca- lizado e efêmero, mas podem ser de grande utilidade se partilhados entre dispositivos co-localizados. No entanto, os dispositivos móveis muitas vezes experienciam fraca co- nectividade, levando a problemas de disponibilidade se o armazenamento e a lógica das aplicações forem totalmente delegados numa infraestrutura remota na nuvem. Por sua vez, o paradigma de computação na periferia da rede leva as computações e o armazena- mento para além dos centros de dados, para mais perto dos dispositivos dos utilizadores finais onde os dados são gerados e consumidos. Assim, permitindo a execução de certos componentes de sistemas direta e cooperativamente em dispositivos na periferia da rede. Esta tese foca-se no desenho e avaliação de soluções resilientes e eficientes para arma- zenamento e disseminação de dados em ambientes pervasivos de computação na periferia da rede, operando com ou sem acesso à infraestrutura de rede. Em linha com esta dico- tomia, o nosso objetivo pode ser dividido em dois cenários específicos. O primeiro está relacionado com a ausência de infraestrutura de rede e o fornecimento de um sistema efêmero de armazenamento e disseminação de dados para redes de dispositivos móveis co-localizados. O segundo diz respeito à existência de acesso à infraestrutura de rede e aos recursos de computação na periferia da rede correspondentes. Primeiramente, a tese apresenta armazenamento reativo ciente do tempo (ARCT), um modelo reativo de armazenamento e disseminação de dados com percepção intrínseca do tempo, que explora sinergias entre o substrato de armazenamento e o paradigma pu- blicação/subscrição, e permite consultas num escopo de tempo específico. De seguida, descreve em mais detalhe: i) Thyme, um sistema de armazenamento e disseminação de dados para ambientes sem fios na periferia da rede, que implementa ARCT; ii) Pars- ley, uma tabela de dispersão distribuída flexível e resiliente baseada em grupos, com realocação preventiva de nós e um mecanismo de particionamento dinâmico de dados; e iii) Thyme GardenBed, um sistema para armazenamento e disseminação de dados em redes multi-regionais na periferia da rede, que faz uso de interações entre dispositivos e com a periferia da rede. As soluções desenvolvidas apresentam baixos custos, proporcionando tempos de res- posta adequados para uso interativo e baixo consumo de energia, demonstrando serem práticas nas mais diversas situações. Estas soluções também exibem boas propriedades de balanceamento de carga e tolerância a faltas

    An Integrated Framework for the Methodological Assurance of Security and Privacy in the Development and Operation of MultiCloud Applications

    Get PDF
    x, 169 p.This Thesis studies research questions about how to design multiCloud applications taking into account security and privacy requirements to protect the system from potential risks and about how to decide which security and privacy protections to include in the system. In addition, solutions are needed to overcome the difficulties in assuring security and privacy properties defined at design time still hold all along the system life-cycle, from development to operation.In this Thesis an innovative DevOps integrated methodology and framework are presented, which help to rationalise and systematise security and privacy analyses in multiCloud to enable an informed decision-process for risk-cost balanced selection of the protections of the system components and the protections to request from Cloud Service Providers used. The focus of the work is on the Development phase of the analysis and creation of multiCloud applications.The main contributions of this Thesis for multiCloud applications are four: i) The integrated DevOps methodology for security and privacy assurance; and its integrating parts: ii) a security and privacy requirements modelling language, iii) a continuous risk assessment methodology and its complementary risk-based optimisation of defences, and iv) a Security and Privacy Service Level AgreementComposition method.The integrated DevOps methodology and its integrating Development methods have been validated in the case study of a real multiCloud application in the eHealth domain. The validation confirmed the feasibility and benefits of the solution with regards to the rationalisation and systematisation of security and privacy assurance in multiCloud systems

    Linked democracy : foundations, tools, and applications

    Get PDF
    Chapter 1Introduction to Linked DataAbstractThis chapter presents Linked Data, a new form of distributed data on theweb which is especially suitable to be manipulated by machines and to shareknowledge. By adopting the linked data publication paradigm, anybody can publishdata on the web, relate it to data resources published by others and run artificialintelligence algorithms in a smooth manner. Open linked data resources maydemocratize the future access to knowledge by the mass of internet users, eitherdirectly or mediated through algorithms. Governments have enthusiasticallyadopted these ideas, which is in harmony with the broader open data movement
    corecore