SMART-KG: Hybrid Shipping for SPARQL Querying on the Web
While Linked Data (LD) provides standards for publishing (RDF) and querying (SPARQL) Knowledge Graphs (KGs) on the Web, serving, accessing and processing such open, decentralized KGs is often practically impossible, as query timeouts on publicly available SPARQL endpoints show. Alternative solutions such as Triple Pattern Fragments (TPF) attempt to tackle the problem of availability by pushing query processing workload to the client side, but suffer from unnecessary transfer of irrelevant data on complex queries with large intermediate results. In this paper we present smart-KG, a novel approach to share the load between servers and clients, while significantly reducing data transfer volume, by combining TPF with shipping compressed KG partitions. Our evaluations show that smart-KG outperforms state-of-the-art client-side solutions and increases server-side availability towards more cost-effective and balanced hosting of open and decentralized KGs.
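To make the hybrid shipping idea concrete, here is a minimal client-side sketch (an illustration only, not the smart-KG implementation): for each triple pattern, the client either downloads a pre-materialized predicate partition and joins locally, or falls back to TPF for selective patterns. The catalogue, URLs, and threshold are hypothetical; rdflib and requests are assumed.

```python
# Minimal sketch of hybrid shipping (an illustration, not the smart-KG code).
# Assumptions: the server publishes a catalogue mapping predicates to
# downloadable partitions plus triple-count estimates, and also exposes a
# TPF interface. All URLs and the threshold below are hypothetical.
import requests
from rdflib import Graph

TPF_ENDPOINT = "http://example.org/kg/fragments"    # hypothetical TPF server
SHIP_THRESHOLD = 10_000  # ship a partition once a pattern is this unselective

# Hypothetical catalogue: predicate -> (estimated triples, partition URL)
CATALOGUE = {
    "http://xmlns.com/foaf/0.1/knows":
        (2_500_000, "http://example.org/kg/partitions/foaf-knows.nt"),
}

def evaluate_pattern(s, p, o):
    """Route one triple pattern to partition shipping or to TPF paging."""
    count, url = CATALOGUE.get(p, (0, None))
    unselective = s is None and o is None           # e.g. (?s, <p>, ?o)
    g = Graph()
    if url and unselective and count >= SHIP_THRESHOLD:
        # Download the whole predicate partition once; joins then run locally,
        # so large intermediate results never cross the wire.
        g.parse(data=requests.get(url).text, format="nt")
        return g
    # Selective pattern: ask the TPF server instead (first page only here;
    # a real client would follow the fragment's paging links).
    params = {"subject": s or "", "predicate": p or "", "object": o or ""}
    resp = requests.get(TPF_ENDPOINT, params=params,
                        headers={"Accept": "application/n-triples"})
    g.parse(data=resp.text, format="nt")
    return g
```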
BEAR: Benchmarking the Efficiency of RDF Archiving
There is an emerging demand for techniques addressing the problem of efficiently archiving and (temporally) querying different versions of evolving semantic Web data. While systems for archiving and/or temporal querying are still in their early days, we consider this a good time to discuss benchmarks for evaluating the storage space efficiency of archives, the retrieval functionality they serve, and the performance of various retrieval operations. To this end, we provide a blueprint on benchmarking archives of semantic data by defining a concise set of operators that cover the major aspects of querying and interacting with such archives. Next, we introduce BEAR, which instantiates this blueprint to serve a concrete set of queries on the basis of real-world evolving data. Finally, we perform an empirical evaluation of current archiving techniques that is meant to serve as a first baseline for future developments on querying archives of evolving RDF data.
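The retrieval operators such a blueprint covers can be sketched compactly. The following is a toy model under simplified assumptions, not BEAR itself: versions are plain in-memory rdflib graphs, and the three operators shown follow the query types the benchmark discusses.

```python
# Toy model of the archive retrieval operators (not BEAR itself): version
# materialisation (VM), delta materialisation (DM) and version queries (V).
# Versions are plain rdflib graphs; real archives use compact storage.
from rdflib import Graph, URIRef

EX = "http://example.org/"
v0, v1 = Graph(), Graph()
v0.add((URIRef(EX + "alice"), URIRef(EX + "worksAt"), URIRef(EX + "acme")))
v1.add((URIRef(EX + "alice"), URIRef(EX + "worksAt"), URIRef(EX + "globex")))
archive = [v0, v1]                       # version i lives at archive[i]

def vm(pattern, i):
    """Version materialisation: evaluate a pattern on one version."""
    return set(archive[i].triples(pattern))

def dm(pattern, i, j):
    """Delta materialisation: matches added or deleted between versions."""
    a, b = vm(pattern, i), vm(pattern, j)
    return {"added": b - a, "deleted": a - b}

def v(pattern):
    """Version query: for each match, the versions in which it holds."""
    hits = {}
    for i in range(len(archive)):
        for t in vm(pattern, i):
            hits.setdefault(t, []).append(i)
    return hits

pattern = (URIRef(EX + "alice"), URIRef(EX + "worksAt"), None)
print(dm(pattern, 0, 1))    # shows the job change between the two versions
```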
Evaluating Query and Storage Strategies for RDF Archives
There is an emerging demand for efficiently archiving and (temporally) querying different versions of evolving semantic Web data. As novel archiving systems are starting to address this challenge, foundations and standards for benchmarking RDF archives are needed to evaluate their storage space efficiency and the performance of different retrieval operations. To this end, we provide theoretical foundations on the design of data and queries to evaluate emerging RDF archiving systems. Then, we instantiate these foundations along a concrete set of queries on the basis of a real-world evolving dataset. Finally, we perform an empirical evaluation of various current archiving techniques and querying strategies on this data that is meant to serve as a baseline for future developments on querying archives of evolving RDF data.
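As a complement to the query-side sketch above, here is a toy contrast of two storage strategies that evaluations of this kind typically compare: independent copies (every version stored in full) versus a change-based archive (one snapshot plus deltas). This is an illustrative sketch under simplified assumptions, not any system's actual storage layout.

```python
# Illustrative contrast of two archive storage strategies (simplified, not
# any system's real layout): independent copies (IC) store every version in
# full; a change-based archive (CB) stores one snapshot plus per-version
# deltas. Triples are simplified to tuples of strings.
V0 = {("alice", "worksAt", "acme"), ("bob", "worksAt", "acme")}
V1 = {("alice", "worksAt", "globex"), ("bob", "worksAt", "acme")}

# IC: fast access to any version, highest storage cost.
ic_store = [V0, V1]

# CB: compact, but materialising version i means replaying i deltas.
cb_store = {"snapshot": V0,
            "deltas": [(V1 - V0, V0 - V1)]}     # (added, deleted) for V1

def cb_materialise(i):
    """Rebuild version i from the snapshot by replaying deltas."""
    g = set(cb_store["snapshot"])
    for added, deleted in cb_store["deltas"][:i]:
        g = (g - deleted) | added
    return g

assert cb_materialise(1) == ic_store[1]   # both strategies agree on V1
```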
HDT crypt: Compression and Encryption of RDF Datasets
The publication and interchange of RDF datasets online has experienced significant growth in recent years, promoted by different but complementary efforts, such as Linked Open Data, the Web of Things and RDF stream processing systems. However, the current Linked Data infrastructure does not cater for the storage and exchange of sensitive or private data. On the one hand, data publishers need means to limit access to confidential data (e.g. health, financial, personal, or other sensitive data). On the other hand, the infrastructure needs to compress RDF graphs in a manner that minimises the amount of data that is both stored and transferred over the wire. In this paper, we demonstrate how HDT, a compressed serialization format for RDF, can be extended to support encryption. We propose a number of different graph partitioning strategies and discuss the benefits and tradeoffs of each approach.
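The general partition-then-encrypt idea can be sketched as follows. This is not the actual HDT crypt format: gzip-compressed N-Triples stands in for HDT, the partitioning strategy shown (one partition per predicate) is just one possible choice, and rdflib (version 6 or later) plus the cryptography package are assumed.

```python
# Sketch of the partition-then-encrypt idea (not the actual HDT crypt
# format): split a graph into partitions, serialise and compress each,
# then encrypt every partition under its own key so access can be granted
# per partition. gzip + N-Triples stands in for HDT compression here.
# Requires rdflib (>= 6, where serialize returns str) and `cryptography`.
import gzip
from cryptography.fernet import Fernet
from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/")
g = Graph()
g.add((EX.alice, EX.diagnosis, Literal("confidential")))   # sensitive
g.add((EX.alice, EX.name, Literal("Alice")))               # less sensitive

def partition_by_predicate(graph):
    """One partition per predicate -- one of several possible strategies."""
    parts = {}
    for s, p, o in graph:
        parts.setdefault(p, Graph()).add((s, p, o))
    return parts

encrypted, keys = {}, {}
for pred, part in partition_by_predicate(g).items():
    keys[pred] = Fernet.generate_key()          # separate key per partition
    payload = gzip.compress(part.serialize(format="nt").encode())
    encrypted[pred] = Fernet(keys[pred]).encrypt(payload)

# A consumer holding only the key for ex:name decrypts just that partition:
pred = EX.name
plain = gzip.decompress(Fernet(keys[pred]).decrypt(encrypted[pred]))
restored = Graph().parse(data=plain.decode(), format="nt")
print(restored.serialize(format="nt"))
```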
Privacy-aware Linked Widgets
The European General Data Protection Regulation (GDPR) brings new challenges for companies, who must demonstrate that their systems and business processes comply with usage constraints specified by data subjects. However, due to the lack of standards, tools, and best practices, many organizations struggle to adapt their infrastructure and processes to ensure and demonstrate that all data processing is in compliance with users' given consent. The SPECIAL EU H2020 project has developed vocabularies that can formally describe data subjects' given consent, as well as methods that use this description to automatically determine whether processing of the data according to a given policy is compliant with that consent. Whereas this makes it possible to determine whether processing was compliant or not, integration of the approach into existing line-of-business applications and ex-ante compliance checking remains an open challenge. In this short paper, we demonstrate how the SPECIAL consent and compliance framework can be integrated into Linked Widgets, a mashup platform, in order to support privacy-aware ad-hoc integration of personal data. The resulting environment makes it possible to create data integration and processing workflows out of components that inherently respect usage policies of the data that is being processed and are able to demonstrate compliance. We provide an overview of the necessary metadata and orchestration towards a privacy-aware linked data mashup platform that automatically respects subjects' given consent. The evaluation results show the potential of our approach for ex-ante usage policy compliance checking within the Linked Widgets platform and beyond.
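The core compliance test can be illustrated with a deliberately tiny sketch: a processing request is compliant if every dimension of it is subsumed by the corresponding dimension of the consent. The taxonomy and policies below are invented for illustration; SPECIAL itself encodes policies in OWL and delegates the subsumption test to a reasoner.

```python
# Tiny illustration of ex-ante compliance checking: a processing request is
# compliant if every dimension of it (data category, purpose, ...) is
# subsumed by what the subject consented to. The taxonomy and policies are
# invented; SPECIAL encodes policies in OWL and uses a reasoner instead.
PARENT = {                      # child -> parent in a toy category taxonomy
    "Location": "PersonalData", "Contact": "PersonalData",
    "Marketing": "AnyPurpose", "Analytics": "AnyPurpose",
}

def subsumed_by(child, ancestor):
    """True if `ancestor` is reachable from `child` in the taxonomy."""
    while child is not None:
        if child == ancestor:
            return True
        child = PARENT.get(child)
    return False

def compliant(request, consent):
    """Every dimension of the request must be subsumed by the consent."""
    return all(subsumed_by(request[d], consent[d]) for d in consent)

consent = {"data": "Location", "purpose": "AnyPurpose"}
ok_req  = {"data": "Location", "purpose": "Analytics"}
bad_req = {"data": "Contact",  "purpose": "Analytics"}
print(compliant(ok_req, consent))    # True  -- covered by the given consent
print(compliant(bad_req, consent))   # False -- Contact data was not consented
```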
A More Decentralized Vision for Linked Data
We claim that ten years into Linked Data there are still many unresolved challenges towards arriving at a truly machine-readable and decentralized Web of data. With a focus on the biomedical domain, currently one of the most promising adopters of Linked Data, we highlight and exemplify key technical and non-technical challenges to the success of Linked Data, and we outline potential solution strategies.
A More Decentralized Vision for Linked Data
In this deliberately provocative position paper, we claim that ten years into Linked Data there are still (too?) many unresolved challenges towards arriving at a truly machine-readable and decentralized Web of data. We take a deeper look at the biomedical domain (currently one of the most promising "adopters" of Linked Data, if we believe the ever-present "LOD cloud" diagram). Herein, we try to highlight and exemplify key technical and non-technical challenges to the success of LOD, and we outline potential solution strategies. We hope that this paper will serve as a discussion basis for a fresh start towards more actionable, truly decentralized Linked Data, and as a call to the community to join forces.
Enabling Web-scale data integration in biomedicine through Linked Open Data
The biomedical data landscape is fragmented across several isolated, heterogeneous data and knowledge sources on the Web, which use varying formats, syntaxes, schemas, and entity notations. Biomedical researchers face severe logistical and technical challenges to query, integrate, analyze, and visualize data from multiple diverse sources in the context of available biomedical knowledge. Semantic Web technologies and Linked Data principles may aid toward Web-scale semantic processing and data integration in biomedicine. The biomedical research community has been one of the earliest adopters of these technologies and principles to publish data and knowledge on the Web as linked graphs and ontologies, hence creating the Life Sciences Linked Open Data (LSLOD) cloud. In this paper, we provide our perspective on some opportunities proffered by the use of LSLOD to integrate biomedical data and knowledge in three domains: (1) pharmacology, (2) cancer research, and (3) infectious diseases. We will discuss some of the major challenges that hinder the widespread use and consumption of LSLOD by the biomedical research community. Finally, we provide a few technical solutions and insights that can address these challenges. Eventually, LSLOD can enable the development of scalable, intelligent infrastructures that support artificial intelligence methods for augmenting human intelligence to achieve better clinical outcomes for patients, to enhance the quality of biomedical research, and to improve our understanding of living systems.
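To give a flavour of what such Web-scale integration looks like in practice, here is a hedged example of a federated SPARQL 1.1 query joining two sources via SERVICE. The endpoint URLs and vocabulary terms are placeholders rather than live LSLOD services; the SPARQLWrapper package is assumed.

```python
# A flavour of LSLOD-style integration: one SPARQL 1.1 query that joins data
# across two endpoints via SERVICE (federation). The endpoint URLs and the
# vocabulary are placeholders, not live services; requires SPARQLWrapper.
from SPARQLWrapper import SPARQLWrapper, JSON

query = """
PREFIX ex: <http://example.org/vocab/>
SELECT ?drug ?target ?trial WHERE {
  ?drug ex:hasTarget ?target .                   # local source: pharmacology
  SERVICE <http://example.org/trials/sparql> {   # remote source: trials
    ?trial ex:studiesDrug ?drug .
  }
}
LIMIT 10
"""

sparql = SPARQLWrapper("http://example.org/pharma/sparql")  # placeholder
sparql.setQuery(query)
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["drug"]["value"], row["trial"]["value"])
```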
Data Privacy Vocabularies and Controls: Semantic Web for Transparency and Privacy
Managing privacy and understanding the handling of personal data has turned into a fundamental right, at least for Europeans, since May 25th, 2018, with the coming into force of the General Data Protection Regulation (GDPR). Yet, whereas many different tools by different vendors promise companies to guarantee their compliance with the GDPR in terms of consent management and keeping track of the personal data they handle in their processes, interoperability between such tools as well as uniform user-facing interfaces will be needed to enable true transparency, user-configurable and -manageable privacy policies, and data portability (as also implicitly promised by the GDPR). We argue that such interoperability can be enabled by agreed-upon vocabularies and Linked Data.
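As a hedged illustration of that argument, the snippet below builds a machine-readable consent record in RDF with rdflib, so that tools from different vendors could exchange it. The vocabulary is a simplified stand-in; the namespace and property names are assumptions, not the official W3C DPV terms.

```python
# Illustration of the argument: a machine-readable consent record in RDF
# that tools from different vendors could exchange. The namespace and
# property names are simplified stand-ins, not the official W3C DPV terms.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, XSD

DPV = Namespace("http://example.org/privacy-vocab#")   # placeholder
EX = Namespace("http://example.org/")

g = Graph()
consent = EX.consent42
g.add((consent, RDF.type, DPV.Consent))
g.add((consent, DPV.dataSubject, EX.alice))
g.add((consent, DPV.personalDataCategory, DPV.Location))
g.add((consent, DPV.purpose, DPV.ServicePersonalisation))
g.add((consent, DPV.expiry, Literal("2026-01-01", datatype=XSD.date)))

print(g.serialize(format="turtle"))   # any vocabulary-aware tool can read this
```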
Propelling the Potential of Enterprise Linked Data in Austria. Roadmap and Report
In times of digital transformation and considering the potential of the data-driven economy, it is crucial not only that data is made available and that data sources can be trusted, but also that data integrity can be guaranteed, that necessary privacy and security mechanisms are in place, and that data and access comply with policies and legislation. In many cases, complex and interdisciplinary questions cannot be answered by a single dataset, and it is thus necessary to combine data from multiple disparate sources. However, because most data today is locked up in isolated silos, it cannot be used to its fullest potential.
The core challenge for most organisations and enterprises with regard to data exchange and integration is to be able to combine data from internal and external data sources in a manner that supports both day-to-day operations and innovation. Linked Data is a promising data publishing and integration paradigm that builds upon standard web technologies. It supports the publishing of structured data in a semantically explicit and interlinked manner such that it can be easily connected, and consequently becomes more interoperable and useful.
The PROPEL project (Propelling the Potential of Enterprise Linked Data in Austria) surveyed technological challenges, entrepreneurial opportunities, and open research questions on the use of Linked Data in a business context, and developed a roadmap and a set of recommendations for policy makers, industry, and the research community. Shifting away from a predominantly academic perspective and an exclusive focus on open data, the project looked at Linked Data as an emerging disruptive technology that enables efficient enterprise data management in the rising data economy. Current market forces provide many opportunities, but also present several data and information management challenges. Given that Linked Data enables advanced analytics and decision-making, it is particularly suitable for addressing today's data and information management challenges. In our research, we identified a variety of highly promising use cases for Linked Data in an enterprise context. Examples of promising application domains include "customization and customer relationship management", "automatic and dynamic content production, adaption and display", "data search, information retrieval and knowledge discovery", as well as "data and information exchange and integration". The analysis also revealed broad potential across a large spectrum of industries whose structural and technological characteristics align well with Linked Data characteristics and principles: energy, retail, finance and insurance, government, health, transport and logistics, telecommunications, media, tourism, engineering, and research and development rank among the most promising industries for the adoption of Linked Data principles.
In addition to approaching the subject from an industry perspective, we also examined the topics and trends emerging from the research community in the field of Linked Data and the Semantic Web. Although our analysis revolved around a vibrant and active community composed of academia and leading companies involved in semantic technologies, we found that industry needs and research discussions are somewhat misaligned. Whereas foundation technologies such as knowledge representation, data creation/publishing/sharing, data management, and system engineering are highly represented in scientific papers, specific topics such as recommendations, or cross-cutting topics such as machine learning or privacy and security, are only marginally present. Topics such as big/large data and the Internet of Things are (still) on an upward trajectory in terms of attention. In contrast, topics that are very relevant for industry, such as application-oriented topics or those that relate to security, privacy and robustness, are not attracting much attention. When it comes to standardisation efforts, we identified a clear need for a more in-depth analysis of the effectiveness of existing standards, the degree of coverage they provide with respect to the foundations they belong to, and the suitability of alternative standards that do not fall under the core Semantic Web umbrella.
Taking into consideration market forces, the sector analysis of Linked Data potential, the demand-side analysis, and the current technological status, it is clear that Linked Data has a lot of potential for enterprises and can act as a key driver of technological, organizational, and economic change. However, in order to ensure a solid foundation for Enterprise Linked Data, there is a need for: greater awareness of the potential of Linked Data in enterprises, lowering of entrance barriers via education and training, better alignment between industry demands and research activities, and greater support for technology transfer from universities to companies.
The PROPEL roadmap recommends concrete measures to propel the adoption of Linked Data in Austrian enterprises. These measures are structured around five fields of activities: "awareness and education", "technological innovation, research gaps, standardisation", "policy and legal", and "funding". Key short-term recommendations include the clustering of existing activities in order to raise visibility on an international level, the funding of key topics that are underrepresented by the community, and the setup of joint projects. In the medium term, we recommend strengthening existing academic and private education efforts via certification, and establishing flagship projects based on national use cases that can serve as blueprints for transnational initiatives. This requires not only financial support, but also infrastructure support, such as data and services to build solutions on top of. In the long term, we recommend cooperation with international funding schemes to establish and foster a European-level agenda, and the setup of centres of excellence.
- …