103 research outputs found
Digital Humanities on the Semantic Web : accessing Historical and Musical Linked Data
Key fields in the humanities, such as history, art and language, are central to a major transformation that is changing scholarly practice in these fields: the so-called Digital Humanities (DH). A fundamental question in DH is how humanities datasets can be represented digitally, in such a way that machines can process them, understand their meaning, facilitate their inquiry, and exchange them on the Web. In this paper, we survey current efforts within the Semantic Web and Linked Data, a family of Web-compatible knowledge representation formalisms and standards, to represent DH objects in quantitative history and symbolic music. We also argue that the technological gap between the Semantic Web and Linked Data, and DH data owners is currently too wide for effective access and consumption of these semantically enabled humanities data. To this end, we propose grlc, a thin middleware that leverages currently existing queries on the Web (expressed in, e.g., SPARQL) to transparently build standard Web APIs that facilitate access to any Linked Data
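The idea of exposing a stored query as a plain Web API route can be sketched as follows. This is a toy illustration, not grlc's implementation: the `QUERIES` registry, the route name, the example endpoint URL and the `route_to_request_url` helper are all hypothetical. The only standard piece is the SPARQL Protocol convention of passing the query text in a `query` parameter of a GET request.

```python
from urllib.parse import urlencode

# Hypothetical registry: one stored SPARQL query per API route name.
QUERIES = {
    "pieces": "SELECT ?piece WHERE { ?piece a <http://example.org/Piece> } LIMIT 10",
}


def route_to_request_url(route, endpoint):
    """Translate a route name into a SPARQL Protocol GET request URL.

    The API consumer only needs to know the route name; the query text
    stays hidden behind it.
    """
    query = QUERIES[route]
    return endpoint + "?" + urlencode({"query": query})


url = route_to_request_url("pieces", "http://example.org/sparql")
```

A caller would then fetch `url` with any HTTP client, without ever writing SPARQL.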
List.MID: A MIDI-Based Benchmark for Evaluating RDF Lists
Linked lists represent a countable number of ordered values, and are among the most important abstract data types in computer science. With the advent of RDF as a highly expressive knowledge representation language for the Web, various implementations for RDF lists have been proposed. Yet, there is no benchmark so far dedicated to evaluating the performance of triple stores and SPARQL query engines on dealing with ordered linked data. Moreover, essential tasks for evaluating RDF lists, like generating datasets containing RDF lists of various sizes, or generating the same RDF list using different modelling choices, are cumbersome and unprincipled. In this paper, we propose List.MID, a systematic benchmark for evaluating systems serving RDF lists. List.MID consists of a dataset generator, which creates RDF list data in various models and of different sizes; and a set of SPARQL queries. The RDF list data is coherently generated from a large, community-curated base collection of Web MIDI files, rich in lists of musical events of arbitrary length. We describe the List.MID benchmark and discuss its impact and adoption, reusability, design, and availability
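Generating the same list under different modelling choices can be sketched as below. This is a minimal illustration, not the List.MID generator: the function names, the cell-naming scheme and the `ex:` event identifiers are assumptions. The two models shown are the standard RDF collection (`rdf:first` / `rdf:rest` / `rdf:nil`) and the standard RDF container (`rdf:Seq` with `rdf:_1`, `rdf:_2`, ...).

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def as_rdf_list(head, items):
    """Model items as an rdf:List: a chain of cells linked by rdf:rest."""
    triples = []
    # Hypothetical naming scheme for the intermediate list cells.
    cells = [head] + [f"{head}_cell{i}" for i in range(1, len(items))]
    for i, item in enumerate(items):
        # The last cell points to rdf:nil, which terminates the list.
        nxt = cells[i + 1] if i + 1 < len(items) else RDF + "nil"
        triples.append((cells[i], RDF + "first", item))
        triples.append((cells[i], RDF + "rest", nxt))
    return triples


def as_rdf_seq(head, items):
    """Model items as an rdf:Seq: one membership property per position."""
    triples = [(head, RDF + "type", RDF + "Seq")]
    for i, item in enumerate(items, start=1):
        triples.append((head, f"{RDF}_{i}", item))
    return triples


events = ["ex:event1", "ex:event2", "ex:event3"]  # hypothetical MIDI events
list_triples = as_rdf_list("ex:track", events)
seq_triples = as_rdf_seq("ex:track", events)
```

The same three events cost two triples per element in the first model and one triple per element (plus a type statement) in the second, which is the kind of difference a benchmark over list models has to account for.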
Development of modules for the GNU PDF project
This Final Degree Project (FDP) is a collaboration with a Free Software project: GNU PDF. The goal of the GNU PDF project is to develop and provide a free, high-quality, complete and portable set of libraries and programs to manage the PDF file format and associated technologies. These libraries place a strong focus on conformance with PDF standards and specifications, which is not covered by any of the existing free software PDF libraries. Currently, the main activity in the GNU PDF project is the development of the GNU PDF Library. This library provides functions to read and write PDF documents conforming to standardized specifications. The main goal of this FDP is the development of modules for the GNU PDF Library. As described below, development currently targets the tasks involved in lexical, syntactic and semantic analysis of PDF files. These processes begin in the object layer of the library. Additionally, some tasks implementing basic features are still pending. These features belong to the base layer. This FDP makes contributions to both layers
Modelling and Querying Lists in RDF. A Pragmatic Study
Many Linked Data datasets model elements in their domains in the form of lists: a countable number of ordered resources. When publishing these lists in RDF, an important concern is making them easy to consume. Therefore, a well-known recommendation is to find an existing list modelling solution, and reuse it. However, a specific domain model can be implemented in different ways and vocabularies may provide alternative solutions. In this paper, we argue that a wrong decision could have a significant impact in terms of performance and, ultimately, the availability of the data. We take the case of RDF Lists and make the hypothesis that the efficiency of retrieving sequential linked data depends primarily on how they are modelled (triple-store invariance hypothesis). To demonstrate this, we survey different solutions for modelling sequences in RDF, and propose a pragmatic approach for assessing their impact on data availability. Finally, we derive good (and bad) practices on how to publish lists as linked open data. By doing this, we sketch the foundations of an empirical, task-oriented methodology for benchmarking linked data modelling solutions
The Semantic Web MIDI Tape: An Interface for Interlinking MIDI and Context Metadata
The Linked Data paradigm has been used to publish a large number of musical datasets and ontologies on the Semantic Web, such as MusicBrainz, AcousticBrainz, and the Music Ontology. Recently, the MIDI Linked Data Cloud has been added to these datasets, representing more than 300,000 pieces in MIDI format as Linked Data, opening up the possibility for linking fine-grained symbolic music representations to existing music metadata databases. Despite the dataset making MIDI resources available in Web data standard formats such as RDF and SPARQL, the important issue of finding meaningful links between these MIDI resources and relevant contextual metadata in other datasets remains. A fundamental barrier for the provision and generation of such links is the difficulty that users have at adding new MIDI performance data and metadata to the platform. In this paper, we propose the Semantic Web MIDI Tape, a set of tools and associated interface for interacting with the MIDI Linked Data Cloud by enabling users to record, enrich, and retrieve MIDI performance data and related metadata in native Web data standards. The goal of such interactions is to find meaningful links between published MIDI resources and their relevant contextual metadata. We evaluate the Semantic Web MIDI Tape in various use cases involving user-contributed content, MIDI similarity querying, and entity recognition methods, and discuss their potential for finding links between MIDI resources and metadata
Service providers accountability
The goal of this paper is to guide the reader through some obscure parts of the regulation and legislation related to technology. Although we are not experts on Internet security, we try to explain the difficulties that lawyers should be aware of when regulating rights and limits on the net. Some real cases related to service providers (ISPs and others) are described and complemented with the technological context of each case
Sequential Linked Data: the State of Affairs
Sequences are among the most important data structures in computer science. In the Semantic Web, however, little attention has been given to Sequential Linked Data. In previous work, we have discussed the data models that Knowledge Graphs commonly use for representing sequences and showed how these models have an impact on query performance and that this impact is invariant to triplestore implementations. However, the specific list operations that the management of Sequential Linked Data requires beyond the simple retrieval of an entire list or a range of its elements (e.g. to add or remove elements from a list), and their impact on the various list data models, remain unclear.
Closing this knowledge gap would be a significant step towards the realization of a Semantic Web list Application Programming Interface (API) that standardizes list manipulation and generalizes beyond specific data models.
To address these challenges, we build on our previous work in understanding the effects of various sequential data models for Knowledge Graphs, extending our benchmark and proposing a set of read-write Semantic Web list operations in SPARQL, with insert, update and delete support. To do so, we identify five classic list-based computer science sequential data structures (linked list, doubly linked list, stack, queue, and array), from which we derive nine atomic read-write operations for Semantic Web lists. We propose a SPARQL implementation of these operations with five typical RDF data models and compare their performance by executing them against six increasing dataset sizes and four different triplestores. In light of our results, we discuss the feasibility of our devised API and reflect on the state of affairs of Sequential Linked Data
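What a read-write list operation does to the underlying triples can be sketched with an in-memory set of triples. This is an illustrative sketch only: it does not reproduce the nine operations or the SPARQL implementations proposed in the paper, and the `push`, `pop` and `to_python_list` helpers and the `ex:` identifiers are hypothetical. It shows stack-style operations on one data model, the standard `rdf:first` / `rdf:rest` linked list.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FIRST, REST, NIL = RDF + "first", RDF + "rest", RDF + "nil"


def to_python_list(triples, head):
    """Read operation: walk the rdf:rest chain from head and collect the items."""
    first = {s: o for s, p, o in triples if p == FIRST}
    rest = {s: o for s, p, o in triples if p == REST}
    items, node = [], head
    while node != NIL:
        items.append(first[node])
        node = rest[node]
    return items


def push(triples, head, item, new_cell):
    """Write operation: insert two triples; the new cell becomes the head."""
    triples.add((new_cell, FIRST, item))
    triples.add((new_cell, REST, head))
    return new_cell


def pop(triples, head):
    """Write operation: delete the head cell's two triples; return (item, new head)."""
    item = next(o for s, p, o in triples if s == head and p == FIRST)
    new_head = next(o for s, p, o in triples if s == head and p == REST)
    triples.discard((head, FIRST, item))
    triples.discard((head, REST, new_head))
    return item, new_head


store = set()
head = push(store, NIL, "ex:a", "ex:cell1")
head = push(store, head, "ex:b", "ex:cell2")
after_push = to_python_list(store, head)
popped, head = pop(store, head)
```

In this model both operations touch a constant number of triples at the head of the list, whereas appending at the tail would first require walking the whole chain. Trade-offs of this kind are why the cost of an operation depends on the chosen data model.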