
    LODE: Linking Digital Humanities Content to the Web of Data

    Numerous digital humanities projects maintain their data collections in the form of text, images, and metadata. While data may be stored in many formats, from plain text to XML to relational databases, the use of the Resource Description Framework (RDF) as a standardized representation has gained considerable traction during the last five years. Almost every digital humanities meeting now has at least one session on RDF and linked data. While most existing work on linked data has focused on improving algorithms for entity matching, the aim of the LinkedHumanities project is to build digital humanities tools that work "out of the box", enabling their use by humanities scholars, computer scientists, librarians, and information scientists alike. With this paper, we report on the Linked Open Data Enhancer (LODE) framework developed as part of the LinkedHumanities project. LODE supports non-technical users in enriching a local RDF repository with high-quality data from the Linked Open Data cloud; it links and enhances the local repository without compromising the quality of the data. In particular, LODE supports the user in the enhancement and linking process by providing intuitive user interfaces and by suggesting high-quality linking candidates using tailored matching algorithms. We hope that the LODE framework will prove useful to digital humanities scholars, complementing other digital humanities tools.
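    To make the linking step concrete, here is a minimal sketch of the kind of candidate suggestion such a tool automates: querying a Linked Open Data endpoint (DBpedia is used purely as an example) for resources whose label matches an entity from the local repository. The SPARQLWrapper library, the function name, and the query shape are assumptions for illustration, not LODE's actual matching algorithm or interface.

        # Minimal sketch: suggest LOD linking candidates for a local label.
        # Assumes the SPARQLWrapper Python library; not LODE's actual algorithm.
        from SPARQLWrapper import SPARQLWrapper, JSON

        def suggest_candidates(label, endpoint="https://dbpedia.org/sparql", limit=5):
            """Return (URI, label) pairs from the LOD cloud whose label matches."""
            sparql = SPARQLWrapper(endpoint)
            sparql.setQuery(f"""
                PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
                SELECT DISTINCT ?s ?l WHERE {{
                    ?s rdfs:label ?l .
                    FILTER (lang(?l) = "en" && CONTAINS(LCASE(?l), LCASE("{label}")))
                }} LIMIT {limit}
            """)
            sparql.setReturnFormat(JSON)
            bindings = sparql.query().convert()["results"]["bindings"]
            return [(b["s"]["value"], b["l"]["value"]) for b in bindings]

        # A candidate the user confirms would then be written back to the
        # local repository as an owl:sameAs triple.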

    Next steps in implementing Kaput's research programme

    We explore some key constructs and research themes initiated by Jim Kaput, and attempt to illuminate them further with reference to our own research. These 'design principles' focus on the evolution of digital representations since the early nineties, and we attempt to take forward our collective understanding of the cognitive and cultural affordances they offer. There are two main organising ideas for the paper. The first centres on Kaput's notion of outsourcing of processing power, and explores its implications for mathematical learning. We argue that a key component for design is to create visible, transparent views of outsourcing, a transparency without which there may be as many pitfalls as opportunities for mathematical learning. The second organising idea is that of communication, a key notion for Kaput, and the importance of designing for communication in ways that recognise the mutual influence of tools for communication and for mathematical expression.

    Context Aware Computing for The Internet of Things: A Survey

    As we move towards the Internet of Things (IoT), the number of sensors deployed around the world is growing at a rapid pace. Market research has shown significant growth in sensor deployments over the past decade and predicts that the growth rate will increase further. These sensors continuously generate enormous amounts of data. However, in order to add value to raw sensor data, we need to understand it. Collection, modelling, reasoning, and distribution of context in relation to sensor data play a critical role in this challenge. Context-aware computing has proven successful in understanding sensor data. In this paper, we survey context awareness from an IoT perspective. We first present the necessary background by introducing the IoT paradigm and context-aware fundamentals. We then provide an in-depth analysis of the context life cycle. We evaluate fifty projects, representing the majority of research and commercial solutions proposed in the field of context-aware computing over the last decade (2001-2011), against our own taxonomy. Finally, based on our evaluation, we highlight the lessons to be learnt from the past and some possible directions for future research. The survey addresses a broad range of techniques, methods, models, functionalities, systems, applications, and middleware solutions related to context awareness and the IoT. Our goal is not only to analyse, compare, and consolidate past research work but also to appreciate its findings and discuss their applicability to the IoT.
    Comment: IEEE Communications Surveys & Tutorials Journal, 201
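    To make the life-cycle phases concrete, the following minimal sketch runs a single raw reading through acquisition, modelling, reasoning, and distribution. All names, thresholds, and the sensor stub are illustrative assumptions, not code from any surveyed system.

        def acquire(sensor):
            # Acquisition: pull one raw value from a sensor.
            return {"sensor_id": sensor["id"], "value": sensor["read"]()}

        def model(raw):
            # Modelling: attach meaning (location, unit) to the raw number.
            return {"location": "room_1", "temperature_c": raw["value"]}

        def reason(context):
            # Reasoning: derive higher-level context from the modelled data.
            context["state"] = "hot" if context["temperature_c"] > 28 else "normal"
            return context

        def distribute(context, subscribers):
            # Distribution: push the derived context to interested consumers.
            for callback in subscribers:
                callback(context)

        subscribers = [lambda ctx: print(f"{ctx['location']}: {ctx['state']}")]
        sensor = {"id": "t1", "read": lambda: 30.5}
        distribute(reason(model(acquire(sensor))), subscribers)  # room_1: hot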

    Learning Expressive Linkage Rules for Entity Matching using Genetic Programming

    A central problem in data integration and data cleansing is to identify pairs of entities in data sets that describe the same real-world object. Many existing methods for matching entities rely on explicit linkage rules, which specify how two entities are compared for equivalence. Unfortunately, writing accurate linkage rules by hand is a non-trivial problem that requires detailed knowledge of the involved data sets. Another important issue is the efficient execution of linkage rules. In this thesis, we propose a set of novel methods that cover the complete entity matching workflow, from the generation of linkage rules using genetic programming algorithms to their efficient execution on distributed systems. First, we propose a supervised learning algorithm that is capable of generating linkage rules from a gold standard consisting of a set of entity pairs that have been labeled as duplicates or non-duplicates. We show that the introduced algorithm outperforms previously proposed entity matching approaches, including the state-of-the-art genetic programming approach by de Carvalho et al., and is capable of learning linkage rules that achieve an accuracy similar to that of a human-written rule for the same problem. In order to also cover use cases for which no gold standard is available, we propose a complementary active learning algorithm that generates a gold standard interactively by asking the user to confirm or decline the equivalence of a small number of entity pairs. In the experimental evaluation, labeling at most 50 link candidates was sufficient to match the performance achieved by the supervised GenLink algorithm on the entire gold standard. Finally, we propose an efficient execution workflow that can be run on a cluster of multiple machines. The execution workflow employs a novel multidimensional indexing method that allows the efficient execution of learned linkage rules by significantly reducing the number of required comparisons.
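    To illustrate what is being learned, the following minimal sketch represents a linkage rule as a weighted combination of per-attribute similarity comparisons and scores it against a labeled gold standard, the fitness a genetic programming search would maximise. GenLink's actual rule trees and genetic operators are considerably richer; the attribute names, weights, and threshold here are illustrative assumptions.

        # Minimal sketch of a linkage rule and its fitness; not GenLink itself.
        from difflib import SequenceMatcher

        def string_sim(a, b):
            return SequenceMatcher(None, a.lower(), b.lower()).ratio()

        def linkage_rule(e1, e2, weights, threshold=0.8):
            # Aggregate per-attribute similarities with learned weights.
            score = sum(w * string_sim(e1[attr], e2[attr])
                        for attr, w in weights.items())
            return score / sum(weights.values()) >= threshold

        def fitness(weights, gold_standard):
            # Fraction of labeled pairs the rule classifies correctly, i.e.
            # the quantity a genetic programming search would maximise.
            correct = sum(linkage_rule(e1, e2, weights) == is_duplicate
                          for e1, e2, is_duplicate in gold_standard)
            return correct / len(gold_standard)

        gold = [
            ({"title": "Roma", "year": "1972"}, {"title": "Rome", "year": "1972"}, True),
            ({"title": "Roma", "year": "1972"}, {"title": "Alien", "year": "1979"}, False),
        ]
        print(fitness({"title": 0.7, "year": 0.3}, gold))  # 1.0: both pairs correct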

    Uncertainty-sensitive reasoning for inferring sameAs facts in linked data

    Discovering whether or not two URIs described in Linked Data, in the same or in different RDF datasets, refer to the same real-world entity is crucial for building applications that exploit the cross-referencing of open data. A major challenge in data interlinking is to design tools that effectively deal with incomplete and noisy data and exploit uncertain knowledge. In this paper, we model data interlinking as a reasoning problem with uncertainty. We introduce a probabilistic framework for modelling and reasoning over uncertain RDF facts and rules that is based on the semantics of probabilistic Datalog. We have designed an algorithm, ProbFR, based on this framework. Experiments on real-world datasets have shown the usefulness and effectiveness of our approach for data linkage and disambiguation.
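    The following minimal sketch conveys the flavour of uncertain sameAs inference in a probabilistic-Datalog style: facts carry probabilities, a rule propagates the product of its premises' probabilities scaled by the rule's confidence, and alternative derivations of the same fact combine via noisy-or. The predicates, numbers, and combination scheme are illustrative assumptions, not ProbFR's actual semantics or algorithm.

        # Minimal sketch of probabilistic sameAs derivation; not ProbFR itself.
        facts = {
            ("sameName", "db:x", "wd:y"): 0.9,   # labels match closely
            ("sameBirth", "db:x", "wd:y"): 0.7,  # birth dates agree
        }

        # Illustrative rule: sameName(a,b) & sameBirth(a,b) -> sameAs(a,b)
        RULE_CONF = 0.95

        def derive_sameas(a, b):
            # One derivation: product of premise probabilities, scaled by
            # the rule's confidence.
            p_name = facts.get(("sameName", a, b), 0.0)
            p_birth = facts.get(("sameBirth", a, b), 0.0)
            return RULE_CONF * p_name * p_birth

        def noisy_or(probs):
            # Combine independent alternative derivations of the same fact.
            p = 1.0
            for q in probs:
                p *= 1.0 - q
            return 1.0 - p

        p = noisy_or([derive_sameas("db:x", "wd:y")])
        print(f"P(sameAs(db:x, wd:y)) = {p:.3f}")  # 0.95 * 0.9 * 0.7 ~ 0.599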