3,343 research outputs found

    A Formal Context Representation Framework for Network-Enabled Cognition

    No full text
    Network-accessible resources are inherently contextual with respect to the specific situations (e.g., location and default assumptions) in which they are used. Therefore, the explicit conceptualization and representation of contexts is required to address a number of problems in Network- Enabled Cognition (NEC). We propose a context representation framework to address the computational specification of contexts. Our focus is on developing a formal model of context for the unambiguous and effective delivery of data and knowledge, in particular, for enabling forms of automated inference that address contextual differences between agents in a distributed network environment. We identify several components for the conceptualization of contexts within the context representation framework. These include jurisdictions (which can be used to interpret contextual data), semantic assumptions (which highlight the meaning of data), provenance information and inter-context relationships. Finally, we demonstrate the application of the context representation framework in a collaborative military coalition planning scenario. We show how the framework can be used to support the representation of plan-relevant contextual information

    A General Framework for Representing, Reasoning and Querying with Annotated Semantic Web Data

    Full text link
    We describe a generic framework for representing and reasoning with annotated Semantic Web data, a task becoming more important with the recent increased amount of inconsistent and non-reliable meta-data on the web. We formalise the annotated language, the corresponding deductive system and address the query answering problem. Previous contributions on specific RDF annotation domains are encompassed by our unified reasoning formalism as we show by instantiating it on (i) temporal, (ii) fuzzy, and (iii) provenance annotations. Moreover, we provide a generic method for combining multiple annotation domains allowing to represent, e.g. temporally-annotated fuzzy RDF. Furthermore, we address the development of a query language -- AnQL -- that is inspired by SPARQL, including several features of SPARQL 1.1 (subqueries, aggregates, assignment, solution modifiers) along with the formal definitions of their semantics

    Text categorization and similarity analysis: similarity measure, literature review

    Get PDF
    Document classification and provenance has become an important area of computer science as the amount of digital information is growing significantly. Organisations are storing documents on computers rather than in paper form. Software is now required that will show the similarities between documents (i.e. document classification) and to point out duplicates and possibly the history of each document (i.e. provenance). Poor organisation is common and leads to situations like above. There exists a number of software solutions in this area designed to make document organisation as simple as possible. I'm doing my project with Pingar who are a company based in Auckland who aim to help organise the growing amount of unstructured digital data. This reports analyses the existing literature in this area with the aim to determine what already exists and how my project will be different from existing solutions

    Collaboratively Patching Linked Data

    Full text link
    Today's Web of Data is noisy. Linked Data often needs extensive preprocessing to enable efficient use of heterogeneous resources. While consistent and valid data provides the key to efficient data processing and aggregation we are facing two main challenges: (1st) Identification of erroneous facts and tracking their origins in dynamically connected datasets is a difficult task, and (2nd) efforts in the curation of deficient facts in Linked Data are exchanged rather rarely. Since erroneous data often is duplicated and (re-)distributed by mashup applications it is not only the responsibility of a few original publishers to keep their data tidy, but progresses to be a mission for all distributers and consumers of Linked Data too. We present a new approach to expose and to reuse patches on erroneous data to enhance and to add quality information to the Web of Data. The feasibility of our approach is demonstrated by example of a collaborative game that patches statements in DBpedia data and provides notifications for relevant changes.Comment: 2nd International Workshop on Usage Analysis and the Web of Data (USEWOD2012) in the 21st International World Wide Web Conference (WWW2012), Lyon, France, April 17th, 201

    LODE: Linking Digital Humanities Content to the Web of Data

    Full text link
    Numerous digital humanities projects maintain their data collections in the form of text, images, and metadata. While data may be stored in many formats, from plain text to XML to relational databases, the use of the resource description framework (RDF) as a standardized representation has gained considerable traction during the last five years. Almost every digital humanities meeting has at least one session concerned with the topic of digital humanities, RDF, and linked data. While most existing work in linked data has focused on improving algorithms for entity matching, the aim of the LinkedHumanities project is to build digital humanities tools that work "out of the box," enabling their use by humanities scholars, computer scientists, librarians, and information scientists alike. With this paper, we report on the Linked Open Data Enhancer (LODE) framework developed as part of the LinkedHumanities project. With LODE we support non-technical users to enrich a local RDF repository with high-quality data from the Linked Open Data cloud. LODE links and enhances the local RDF repository without compromising the quality of the data. In particular, LODE supports the user in the enhancement and linking process by providing intuitive user-interfaces and by suggesting high-quality linking candidates using tailored matching algorithms. We hope that the LODE framework will be useful to digital humanities scholars complementing other digital humanities tools

    Towards Building a Knowledge Base of Monetary Transactions from a News Collection

    Full text link
    We address the problem of extracting structured representations of economic events from a large corpus of news articles, using a combination of natural language processing and machine learning techniques. The developed techniques allow for semi-automatic population of a financial knowledge base, which, in turn, may be used to support a range of data mining and exploration tasks. The key challenge we face in this domain is that the same event is often reported multiple times, with varying correctness of details. We address this challenge by first collecting all information pertinent to a given event from the entire corpus, then considering all possible representations of the event, and finally, using a supervised learning method, to rank these representations by the associated confidence scores. A main innovative element of our approach is that it jointly extracts and stores all attributes of the event as a single representation (quintuple). Using a purpose-built test set we demonstrate that our supervised learning approach can achieve 25% improvement in F1-score over baseline methods that consider the earliest, the latest or the most frequent reporting of the event.Comment: Proceedings of the 17th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL '17), 201

    A-posteriori provenance-enabled linking of publications and datasets via crowdsourcing

    No full text
    This paper aims to share with the digital library community different opportunities to leverage crowdsourcing for a-posteriori capturing of dataset citation graphs. We describe a practical approach, which exploits one possible crowdsourcing technique to collect these graphs from domain experts and proposes their publication as Linked Data using the W3C PROV standard. Based on our findings from a study we ran during the USEWOD 2014 workshop, we propose a semi-automatic approach that generates metadata by leveraging information extraction as an additional step to crowdsourcing, to generate high-quality data citation graphs. Furthermore, we consider the design implications on our crowdsourcing approach when non-expert participants are involved in the process<br/
    corecore