SMART-KG: Hybrid Shipping for SPARQL Querying on the Web

Author: Amr Azzam
Fernandez Garcia Javier David
Maribel Acosta
Polleres Axel
Publication venue: Department für Informationsverarbeitung und Prozessmanagement
Publication date: 16/01/2020

While Linked Data (LD) provides standards for publishing (RDF) and querying (SPARQL) Knowledge Graphs (KGs) on the Web, serving, accessing and processing such open, decentralized KGs is often practically impossible, as query timeouts on publicly available SPARQL endpoints show. Alternative solutions such as Triple Pattern Fragments (TPF) attempt to tackle the problem of availability by pushing query processing workload to the client side, but suffer from unnecessary transfer of irrelevant data on complex queries with large intermediate results. In this paper we present smart-KG, a novel approach to share the load between servers and clients, while significantly reducing data transfer volume, by combining TPF with shipping compressed KG partitions. Our evaluations show that smart-KG outperforms state-of-the-art client-side solutions and increases server-side availability towards more cost-effective and balanced hosting of open and decentralized KGs.

Series: Working Papers on Information Systems, Information Business and Operation

Elektronische Publikationen der Wirtschaftsuniversität Wien

BEAR: Benchmarking the Efficiency of RDF Archiving

Author: Fernandez Garcia Javier David
Polleres Axel
Umbrich Jürgen
Publication venue: Department für Informationsverarbeitung und Prozessmanagement, WU Vienna University of Economics and Business
Publication date: 01/01/2015

There is an emerging demand for techniques addressing the problem of efficiently archiving and (temporal) querying different versions of evolving semantic Web data. While systems for archiving and/or temporal querying are still in their early days, we consider this a good time to discuss benchmarks for evaluating storage space efficiency for archives, the retrieval functionality they serve, and the performance of various retrieval operations. To this end, we provide a blueprint on benchmarking archives of semantic data by defining a concise set of operators that cover the major aspects of querying of and interacting with such archives. Next, we introduce BEAR, which instantiates this blueprint to serve a concrete set of queries on the basis of real-world evolving data. Finally, we perform an empirical evaluation of current archiving techniques that is meant to serve as a first baseline of future developments on querying archives of evolving RDF data. (authors' abstract)

Series: Working Papers on Information Systems, Information Business and Operation

Elektronische Publikationen der Wirtschaftsuniversität Wien

Evaluating Query and Storage Strategies for RDF Archives

Author: Fernandez Garcia Javier David
Knuth Magnus
Polleres Axel
Umbrich Jürgen
Publication venue: IOS Press
Publication date: 01/01/2018

There is an emerging demand for efficiently archiving and (temporal) querying different versions of evolving semantic Web data. As novel archiving systems are starting to address this challenge, foundations/standards for benchmarking RDF archives are needed to evaluate their storage space efficiency and the performance of different retrieval operations. To this end, we provide theoretical foundations on the design of data and queries to evaluate emerging RDF archiving systems. Then, we instantiate these foundations along a concrete set of queries on the basis of a real-world evolving dataset. Finally, we perform an empirical evaluation of various current archiving techniques and querying strategies on this data that is meant to serve as a baseline of future developments on querying archives of evolving RDF data.

Elektronische Publikationen der Wirtschaftsuniversität Wien

HDT crypt: Compression and Encryption of RDF Datasets

Author: Fernandez Garcia Javier David
Kirrane Sabrina
Polleres Axel
Steyskal Simon
Publication venue: IOS Press
Publication date: 01/01/2018

The publication and interchange of RDF datasets online has experienced significant growth in recent years, promoted by different but complementary efforts, such as Linked Open Data, the Web of Things and RDF stream processing systems. However, the current Linked Data infrastructure does not cater for the storage and exchange of sensitive or private data. On the one hand, data publishers need means to limit access to confidential data (e.g. health, financial, personal, or other sensitive data). On the other hand, the infrastructure needs to compress RDF graphs in a manner that minimises the amount of data that is both stored and transferred over the wire. In this paper, we demonstrate how HDT - a compressed serialization format for RDF - can be extended to support encryption. We propose a number of different graph partitioning strategies and discuss the benefits and tradeoffs of each approach.

Elektronische Publikationen der Wirtschaftsuniversität Wien

Privacy-aware Linked Widgets

Author: Aryan Peb Ruswono
Azzam Amr
Ekaputra Fajar J.
Fernandez Garcia Javier David
Kiesling Elmar
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2019

The European General Data Protection Regulation (GDPR) brings new challenges for companies, which must demonstrate that their systems and business processes comply with usage constraints specified by data subjects. However, due to the lack of standards, tools, and best practices, many organizations struggle to adapt their infrastructure and processes to ensure and demonstrate that all data processing is in compliance with users' given consent. The SPECIAL EU H2020 project has developed vocabularies that can formally describe data subjects' given consent as well as methods that use this description to automatically determine whether processing of the data according to a given policy is compliant with the given consent. Whereas this makes it possible to determine whether processing was compliant or not, integration of the approach into existing line-of-business applications and ex-ante compliance checking remain an open challenge. In this short paper, we demonstrate how the SPECIAL consent and compliance framework can be integrated into Linked Widgets, a mashup platform, in order to support privacy-aware ad-hoc integration of personal data. The resulting environment makes it possible to create data integration and processing workflows out of components that inherently respect usage policies of the data that is being processed and are able to demonstrate compliance. We provide an overview of the necessary metadata and orchestration towards a privacy-aware linked data mashup platform that automatically respects subjects' given consents. The evaluation results show the potential of our approach for ex-ante usage policy compliance checking within the Linked Widgets Platform and beyond.

Elektronische Publikationen der Wirtschaftsuniversität Wien

A More Decentralized Vision for Linked Data

Author: Fernandez Garcia Javier David
Kamdar Maulik R.
Musen Mark A.
Polleres Axel
Tudorache Tania
Publication venue: CEUR Workshop Proceedings
Publication date: 01/01/2018

We claim that ten years into Linked Data there are still many unresolved challenges towards arriving at a truly machine-readable and decentralized Web of data. With a focus on the biomedical domain, currently one of the most promising adopters of Linked Data, we highlight and exemplify key technical and non-technical challenges to the success of Linked Data, and we outline potential solution strategies.

Elektronische Publikationen der Wirtschaftsuniversität Wien

A More Decentralized Vision for Linked Data

Author: Fernandez Garcia Javier David
Kamdar Maulik R.
Musen Mark A.
Polleres Axel
Tudorache Tania
Publication venue: Department für Informationsverarbeitung und Prozessmanagement, WU Vienna University of Economics and Business
Publication date: 25/06/2018

In this deliberately provocative position paper, we claim that ten years into Linked Data there are still (too?) many unresolved challenges towards arriving at a truly machine-readable and decentralized Web of data. We take a deeper look at the biomedical domain - currently, one of the most promising "adopters" of Linked Data - if we believe the ever-present "LOD cloud" diagram. Herein, we try to highlight and exemplify key technical and non-technical challenges to the success of LOD, and we outline potential solution strategies. We hope that this paper will serve as a discussion basis for a fresh start towards more actionable, truly decentralized Linked Data, and as a call to the community to join forces.

Series: Working Papers on Information Systems, Information Business and Operation

Elektronische Publikationen der Wirtschaftsuniversität Wien

Enabling Web-scale data integration in biomedicine through Linked Open Data

Author: Fernandez Garcia Javier David
Kamdar Maulik R.
Musen Mark A.
Polleres Axel
Tudorache Tania
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2019

The biomedical data landscape is fragmented, with several isolated, heterogeneous data and knowledge sources on the Web that use varying formats, syntaxes, schemas, and entity notations. Biomedical researchers face severe logistical and technical challenges to query, integrate, analyze, and visualize data from multiple diverse sources in the context of available biomedical knowledge. Semantic Web technologies and Linked Data principles may aid toward Web-scale semantic processing and data integration in biomedicine. The biomedical research community has been one of the earliest adopters of these technologies and principles to publish data and knowledge on the Web as linked graphs and ontologies, hence creating the Life Sciences Linked Open Data (LSLOD) cloud. In this paper, we provide our perspective on some opportunities proffered by the use of LSLOD to integrate biomedical data and knowledge in three domains: (1) pharmacology, (2) cancer research, and (3) infectious diseases. We will discuss some of the major challenges that hinder the widespread use and consumption of LSLOD by the biomedical research community. Finally, we provide a few technical solutions and insights that can address these challenges. Eventually, LSLOD can enable the development of scalable, intelligent infrastructures that support artificial intelligence methods for augmenting human intelligence to achieve better clinical outcomes for patients, to enhance the quality of biomedical research, and to improve our understanding of living systems.

Elektronische Publikationen der Wirtschaftsuniversität Wien

Data Privacy Vocabularies and Controls: Semantic Web for Transparency and Privacy

Author: Bonatti Piero A.
Bos Bert
Decker Stefan
Fernandez Garcia Javier David
Kirrane Sabrina
Peristeras Vassilios
Polleres Axel
Wenning Rigo
Publication venue: CEUR Workshop Proceedings
Publication date: 01/01/2018

Managing Privacy and understanding the handling of personal data has turned into a fundamental right - at least for Europeans - since May 25th with the coming into force of the General Data Protection Regulation. Yet, whereas many different tools by different vendors promise companies to guarantee their compliance with GDPR in terms of consent management and keeping track of the personal data they handle in their processes, interoperability between such tools as well as uniform user-facing interfaces will be needed to enable true transparency, user-configurable and -manageable privacy policies and data portability (as also implicitly promised by GDPR). We argue that such interoperability can be enabled by agreed upon vocabularies and Linked Data.

Elektronische Publikationen der Wirtschaftsuniversität Wien

Propelling the Potential of Enterprise Linked Data in Austria. Roadmap and Report

Author: Fernandez Garcia Javier David
Kiesling Elmar
Kirrane Sabrina
Mizerski Nika
Neuschmid Julia
Polleres Axel
Sabou Marta
Thurner Thomas
Wetz Peter
Publication venue: edition mono/monochrom
Publication date: 01/01/2016

In times of digital transformation and considering the potential of the data-driven economy, it is crucial that data is not only made available, data sources can be trusted, but also data integrity can be guaranteed, necessary privacy and security mechanisms are in place, and data and access comply with policies and legislation. In many cases, complex and interdisciplinary questions cannot be answered by a single dataset and thus it is necessary to combine data from multiple disparate sources. However, because most data today is locked up in isolated silos, data cannot be used to its fullest potential. The core challenge for most organisations and enterprises in regards to data exchange and integration is to be able to combine data from internal and external data sources in a manner that supports both day to day operations and innovation. Linked Data is a promising data publishing and integration paradigm that builds upon standard web technologies. It supports the publishing of structured data in a semantically explicit and interlinked manner such that it can be easily connected, and consequently becomes more interoperable and useful. The PROPEL project - Propelling the Potential of Enterprise Linked Data in Austria - surveyed technological challenges, entrepreneurial opportunities, and open research questions on the use of Linked Data in a business context and developed a roadmap and a set of recommendations for policy makers, industry, and the research community. Shifting away from a predominantly academic perspective and an exclusive focus on open data, the project looked at Linked Data as an emerging disruptive technology that enables efficient enterprise data management in the rising data economy. Current market forces provide many opportunities, but also present several data and information management challenges. 
Given that Linked Data enables advanced analytics and decision-making, it is particularly suitable for addressing today's data and information management challenges. In our research, we identified a variety of highly promising use cases for Linked Data in an enterprise context. Examples of promising application domains include "customization and customer relationship management", "automatic and dynamic content production, adaption and display", "data search, information retrieval and knowledge discovery", as well as "data and information exchange and integration". The analysis also revealed broad potential across a large spectrum of industries whose structural and technological characteristics align well with Linked Data characteristics and principles: energy, retail, finance and insurance, government, health, transport and logistics, telecommunications, media, tourism, engineering, and research and development rank among the most promising industries for the adoption of Linked Data principles. In addition to approaching the subject from an industry perspective, we also examined the topics and trends emerging from the research community in the field of Linked Data and the Semantic Web. Although our analysis revolved around a vibrant and active community composed of academia and leading companies involved in semantic technologies, we found that industry needs and research discussions are somewhat misaligned. Whereas some foundation technologies such as knowledge representation and data creation/publishing/sharing, data management and system engineering are highly represented in scientific papers, specific topics such as recommendations, or cross-topics such as machine learning or privacy and security are marginally present. Topics such as big/large data and the internet of things are (still) on an upward trajectory in terms of attention. 
In contrast, topics that are very relevant for industry, such as application-oriented topics or those that relate to security, privacy and robustness, are not attracting much attention. When it comes to standardisation efforts, we identified a clear need for a more in-depth analysis into the effectiveness of existing standards, the degree of coverage they provide with respect to the foundations they belong to, and the suitability of alternative standards that do not fall under the core Semantic Web umbrella. Taking into consideration market forces, sector analysis of Linked Data potential, demand side analysis and the current technological status, it is clear that Linked Data has a lot of potential for enterprises and can act as a key driver of technological, organizational, and economic change. However, in order to ensure a solid foundation for Enterprise Linked Data, there is a need for: greater awareness surrounding the potential of Linked Data in enterprises, lowering of entrance barriers via education and training, better alignment between industry demands and research activities, and greater support for technology transfer from universities to companies. The PROPEL roadmap recommends concrete measures in order to propel the adoption of Linked Data in Austrian enterprises. These measures are structured around five fields of activities: "awareness and education", "technological innovation, research gaps, standardisation", "policy and legal", and "funding". Key short-term recommendations include the clustering of existing activities in order to raise visibility on an international level, the funding of key topics that are underrepresented by the community, and the setup of joint projects. In the medium term, we recommend the strengthening of existing academic and private education efforts via certification and the establishment of flagship projects that are based on national use cases that can serve as blueprints for transnational initiatives.
This requires not only financial support, but also infrastructure support, such as data and services to build solutions on top. In the long term, we recommend cooperation with international funding schemes to establish and foster a European level agenda, and the setup of centres of excellence.

Elektronische Publikationen der Wirtschaftsuniversität Wien