Design Challenges for GDPR RegTech
The Accountability Principle of the GDPR requires that an organisation can
demonstrate compliance with the regulations. A survey of GDPR compliance
software solutions shows significant gaps in their ability to demonstrate
compliance. In contrast, RegTech has recently brought great success to
financial compliance, resulting in reduced risk, cost savings and enhanced
financial regulatory compliance. It is shown that many GDPR solutions lack
interoperability features such as standard APIs, metadata or reports, and that
they are not supported by published methodologies or by evidence of their
validity or even utility. A proof-of-concept prototype was explored using a
regulator-based self-assessment checklist to establish if RegTech best practice
could improve the demonstration of GDPR compliance. The application of a
RegTech approach provides opportunities for demonstrable and validated GDPR
compliance, in addition to the risk reductions and cost savings that RegTech
can deliver. This paper demonstrates that a RegTech approach to GDPR compliance
can help an organisation meet its accountability obligations.
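As a concrete illustration, here is a minimal sketch of what scoring such a regulator-based self-assessment checklist might look like; the checklist items, weights and report format are hypothetical, not taken from the paper:

```python
# Hypothetical sketch: scoring a regulator-style GDPR self-assessment
# checklist and exporting a machine-readable report, in the spirit of
# the RegTech prototype described above.

import json
from dataclasses import dataclass

@dataclass
class ChecklistItem:
    question: str    # self-assessment question (illustrative)
    weight: float    # relative importance (illustrative)
    satisfied: bool  # organisation's answer, with evidence on file

def compliance_report(items: list[ChecklistItem]) -> str:
    """Aggregate answers into a demonstrable, exportable report."""
    total = sum(i.weight for i in items)
    score = sum(i.weight for i in items if i.satisfied) / total
    report = {
        "score": round(score, 2),
        "gaps": [i.question for i in items if not i.satisfied],
    }
    # A standard output format is one of the interoperability features
    # (APIs, metadata, reports) the surveyed solutions were found to lack.
    return json.dumps(report, indent=2)

items = [
    ChecklistItem("Is a record of processing activities maintained?", 2.0, True),
    ChecklistItem("Are data subject access requests tracked?", 1.0, False),
]
print(compliance_report(items))
```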
Towards an automatic data value analysis method for relational databases
Data is becoming one of the world's most valuable resources, and it has been suggested that those who own the data will own the future. Yet despite data being an important asset, data owners struggle to assess its value. Some recent pioneering works have raised awareness of the need to measure data value, and have put forward simple but engaging survey-based methods to help with first-level data assessment in an organisation. However, these methods are manual and depend on the costly input of domain experts. In this paper, we propose to extend the manual survey-based approaches with additional metrics and dimensions derived from the evolving literature on data value dimensions and tailored specifically to our case study. We also developed an automatic, metric-based data value assessment approach that (i) automatically quantifies the business value of data in Relational Databases (RDB), and (ii) provides a scoring method that facilitates the ranking and extraction of the most valuable RDB tables. We evaluate our proposed approach on a real-world relational database from a small online retailer (MyVolts) and show in our experimental study that the data value assessments made by our automated system match those produced by the domain-expert approach.
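To make the metric-based scoring concrete, here is a minimal sketch of ranking tables by a weighted combination of value dimensions; the dimensions, weights and figures are illustrative assumptions, not the paper's actual metrics:

```python
# Illustrative sketch: score relational tables on simple, automatically
# measurable value dimensions and rank them, most valuable first.

def table_value(row_count: int, monthly_queries: int, completeness: float) -> float:
    """Combine hypothetical dimensions into a single score in [0, 1]."""
    volume = min(row_count / 1_000_000, 1.0)    # data volume, capped
    usage = min(monthly_queries / 10_000, 1.0)  # how often the table is read
    return 0.3 * volume + 0.4 * usage + 0.3 * completeness

tables = {
    "orders":    table_value(250_000, 9_000, 0.98),
    "customers": table_value(40_000, 7_500, 0.91),
    "audit_log": table_value(900_000, 120, 0.60),
}
for name, score in sorted(tables.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.2f}")
```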
An intelligent linked data quality dashboard
This paper describes a new intelligent, data-driven dashboard for linked data quality assessment. The development goal was to assist data quality engineers in interpreting the data quality problems found when evaluating a dataset using a metrics-based data quality assessment. This required construction of a graph linking the problematic things identified in the data, the assessment metrics and the source data. This context and the supporting user interfaces help the user to understand data quality problems. An analysis widget also helped the user identify the root cause of multiple problems. This supported the user in identifying and prioritising the problems that need to be fixed in order to improve data quality. The dashboard was shown to be useful for users cleaning data. A user evaluation was performed with both expert and novice data quality engineers.
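A minimal sketch of the kind of root-cause analysis such a dashboard supports: each problem is linked to the metric that flagged it and to the source element involved, so recurring sources surface first. All metric and resource names are illustrative:

```python
# Illustrative problem graph: each quality problem links the metric that
# flagged it, the affected subject, and the source element responsible.

from collections import Counter

problems = [
    {"metric": "undefined-property", "subject": "ex:Book1", "source": "ex:hasAuthr"},
    {"metric": "undefined-property", "subject": "ex:Book2", "source": "ex:hasAuthr"},
    {"metric": "malformed-datatype", "subject": "ex:Book2", "source": "ex:pubDate"},
]

# Counting problems per source element points at root causes: fixing the
# top entry (a misspelled property) resolves several problems at once.
by_source = Counter(p["source"] for p in problems)
for source, count in by_source.most_common():
    print(f"{source}: {count} problem(s)")
```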
DELTA-R: a change detection approach for RDF datasets
This paper presents the DELTA-R approach that detects and
classifies the changes between two versions of a linked dataset. It contributes
to the state of the art, first, by proposing a more granular classification of
resource-level changes and, second, by automatically selecting the
appropriate resource properties to identify the same resources in different
versions of a linked dataset with different URIs and similar representation.
The paper also presents the DELTA-R change model to represent the
changes detected by the DELTA-R approach. This model bridges the gap
between resource-centric and triple-centric views of changes in linked
datasets. As a result, a single change detection mechanism will be able to
support use cases such as interlink maintenance and dataset or replica
synchronization. Additionally, the paper describes an experiment conducted
to examine the accuracy of the DELTA-R approach in detecting the changes
between two versions of a linked dataset. The results indicate that the
DELTA-R approach outperforms the state-of-the-art approaches in accuracy
by up to 4%. It is demonstrated that the proposed, more granular
classification of changes helped to identify up to 1,529 additional updated
resources compared to X. By means of a case study, we demonstrate the
support of the DELTA-R approach and change model for an interlink
maintenance use case. The results show that 100% of the broken interlinks
between DBpedia person snapshot 3.7 and Freebase were repaired.
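A minimal sketch of triple-based change detection between two versions of an RDF dataset, far simpler than DELTA-R itself (which adds a finer-grained classification and property-based matching of resources whose URIs differ); the file names are placeholders:

```python
# Simplified change detection: diff the triple sets of two versions and
# classify resources as added, removed or updated.

from rdflib import Graph

def classify_changes(old: Graph, new: Graph) -> dict:
    added, removed = set(new) - set(old), set(old) - set(new)
    old_subjects = {s for s, _, _ in old}
    new_subjects = {s for s, _, _ in new}
    return {
        "added":   new_subjects - old_subjects,   # resources only in v2
        "removed": old_subjects - new_subjects,   # resources only in v1
        "updated": {s for s, _, _ in added | removed}
                   & old_subjects & new_subjects, # resources whose triples changed
    }

v1, v2 = Graph(), Graph()
v1.parse("dataset_v1.ttl", format="turtle")  # placeholder file names
v2.parse("dataset_v2.ttl", format="turtle")
for kind, resources in classify_changes(v1, v2).items():
    print(kind, len(resources))
```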
Saffron: a data value assessment tool for quantifying the value of data assets
Data has become an indispensable commodity and it is the
basis for many products and services. It has become increasingly important to understand the value of this data in order to be able to exploit it
and reap the full benefits. Yet, many businesses and entities are simply
hoarding data without understanding its true potential. We here present
Saffron, a Data Value Assessment Tool that enables the quantification of
the value of data assets based on a number of different data value dimensions. Based on the Data Value Vocabulary (DaVe), Saffron enables the
extensible representation of the calculated value of data assets, whilst
also catering for the subjective and contextual nature of data value. The
tool exploits semantic technologies in order to provide traceable explanations of the calculated data value. Saffron therefore provides the first
step towards the efficient and effective exploitation of data assets.
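A minimal sketch of how a computed value measurement could be represented as RDF so that it remains traceable and extensible; the namespace and property names here are assumptions for illustration, not the actual DaVe vocabulary:

```python
# Illustrative RDF representation of one data value measurement,
# including the explanation that makes the score traceable.

from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, XSD

DV = Namespace("http://example.org/datavalue#")  # hypothetical namespace
g = Graph()
m = URIRef("http://example.org/measurements/1")

g.add((m, RDF.type, DV.ValueMeasurement))
g.add((m, DV.dimension, DV.Usage))  # which value dimension was measured
g.add((m, DV.asset, URIRef("http://example.org/assets/orders")))
g.add((m, DV.score, Literal(0.82, datatype=XSD.decimal)))
g.add((m, DV.explanation,
       Literal("9,000 queries in the last 30 days, capped at 10,000")))

print(g.serialize(format="turtle"))
```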
Understanding information professionals: a survey on the quality of Linked Data sources for digital libraries
In this paper we provide an in-depth analysis of a survey
of Information Professionals' (IPs') experiences with Linked Data
quality. We discuss and highlight shortcomings in Linked Data sources,
drawing on a survey of the quality issues IPs encounter when using such
sources for their daily tasks, such as metadata creation.
Semantic data ingestion for intelligent, value-driven big data analytics
In this position paper we describe a conceptual
model for intelligent Big Data analytics based on both semantic
and machine learning AI techniques (called AI ensembles). These
processes are linked to business outcomes by explicitly modelling
data value and using semantic technologies as the underlying
mode of communication between the diverse processes and
organisations creating AI ensembles. Furthermore, we show
how data governance can direct and enhance these ensembles
by providing recommendations and insights that ensure the generated output
produces the highest possible value for the organisation.
Assessing the quality of geospatial linked data – experiences from Ordnance Survey Ireland (OSi)
Ordnance Survey Ireland (OSi) is Ireland’s national mapping agency
that is responsible for the digitisation of the island’s infrastructure in terms of
mapping. Generating data from various sensors (e.g. spatial sensors), OSi
builds its knowledge in the Prime2 framework, a subset of which is transformed
into geo-Linked Data. In this paper we discuss how the quality of the generated
semantic data fares against datasets in the LOD cloud. We set up Luzzu, a
scalable Linked Data quality assessment framework, in the OSi pipeline to
continuously assess the produced data in order to tackle any quality problems
prior to publishing.
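For illustration, here is a Python analogue of one simple metric such a pipeline might gate on before publishing; Luzzu itself is a Java framework, and the metric, file name and threshold below are assumptions:

```python
# Illustrative quality gate: fail the publishing step when too few
# resources carry an rdfs:label.

from rdflib import Graph
from rdflib.namespace import RDFS

def labelled_resource_ratio(g: Graph) -> float:
    """Share of subjects that carry an rdfs:label."""
    subjects = {s for s, _, _ in g}
    labelled = {s for s, _, _ in g.triples((None, RDFS.label, None))}
    return len(labelled & subjects) / len(subjects) if subjects else 1.0

g = Graph()
g.parse("osi_snapshot.ttl", format="turtle")  # placeholder export file
ratio = labelled_resource_ratio(g)
if ratio < 0.95:  # hypothetical publishing threshold
    print(f"quality gate failed: only {ratio:.0%} of resources labelled")
```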
Milan: automatic generation of R2RML mappings
Milan automatically generates R2RML mappings between a
source relational database and a target ontology, using a novel multi-level
algorithm. It addresses the real-world inter-model semantic gap by resolving
naming conflicts and structural and semantic heterogeneity, thus enabling
high-fidelity mapping generation for realistic databases. Despite the
importance of mappings for interoperability between relational databases and
ontologies, their creation is a labour- and expertise-intensive task, and the
current state of the art has achieved only limited automation. The paper
describes an experimental evaluation of Milan against state-of-the-art systems
using the RODI benchmarking tool, which shows that Milan outperforms
all systems in all categories.
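To illustrate the kind of artefact Milan produces, here is a minimal sketch that matches column names to ontology properties after simple name normalisation and emits an R2RML fragment; the normalisation step and all table, column and property names are illustrative, not Milan's actual multi-level algorithm:

```python
# Illustrative R2RML generation: resolve naming conflicts by folding
# case and separators, then emit a mapping for the matched columns.

def normalise(name: str) -> str:
    """Fold case and separators so EMP_NAME matches empName."""
    return name.lower().replace("_", "")

def match_columns(columns: list[str], properties: list[str]) -> dict[str, str]:
    """Pair columns with ontology properties whose normalised names agree."""
    by_norm = {normalise(p): p for p in properties}
    return {c: by_norm[normalise(c)] for c in columns if normalise(c) in by_norm}

def r2rml_mapping(table: str, cls: str, columns: dict[str, str]) -> str:
    poms = " ;\n".join(
        f'    rr:predicateObjectMap [ rr:predicate ex:{prop} ; '
        f'rr:objectMap [ rr:column "{col}" ] ]'
        for col, prop in columns.items()
    )
    return (
        '@prefix rr: <http://www.w3.org/ns/r2rml#> .\n'
        '@prefix ex: <http://example.org/onto#> .\n\n'
        f'<#Map_{table}> rr:logicalTable [ rr:tableName "{table}" ] ;\n'
        f'    rr:subjectMap [ rr:template "http://example.org/{table}/{{id}}" ; '
        f'rr:class ex:{cls} ] ;\n'
        f'{poms} .\n'
    )

cols = match_columns(["EMP_NAME", "DEPT_ID"], ["empName", "deptID", "salary"])
print(r2rml_mapping("EMP", "Employee", cols))
```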
A distributed intelligent network based on CORBA and SCTP
The telecommunications services marketplace is undergoing radical change due to the rapid convergence and evolution of telecommunications and computing technologies. Traditionally, telecommunications service providers' ability to deliver network services has been through Intelligent Network (IN) platforms. The IN may be characterised as envisioning centralised processing of distributed service requests from a limited number of quasi-proprietary nodes with inflexible connections to the network management system and third-party networks. The nodes are inter-linked by the operator's highly reliable but expensive SS7 network. To leverage this technology as the core of new multimedia services, several key technical challenges must be overcome. These include: integration of the IN with new technologies for service delivery, enhanced integration with network management services, enabling third-party service providers, and reducing operating costs by using more general-purpose computing and networking equipment. In this thesis we present a general architecture that defines the framework and techniques required to realise an open, flexible, middleware (CORBA)-based distributed intelligent network (DIN). This extensible architecture naturally encapsulates the full range of traditional service network technologies, for example IN (fixed network), GSM-MAP and CAMEL. Fundamental to this architecture are mechanisms for inter-working with the existing IN infrastructure, to enable gradual migration within a domain and inter-working between IN and DIN domains. The DIN architecture complements current research on third-party service provision, service management and integration with Internet-based servers. Given the dependence of such a distributed service platform on the transport network that links computational nodes, this thesis also includes a detailed study of the emergent IP-based telecommunications transport protocol of choice, the Stream Control Transmission Protocol (SCTP). In order to comply with the rigorous performance constraints of this domain, prototyping, simulation and analytic modelling of the DIN based on SCTP have been carried out. This includes the first detailed analysis of the operation of SCTP congestion controls under a variety of network conditions, leading to a number of suggested improvements in the operation of the protocol. Finally, we describe a new analytic framework for dimensioning networks with competing multi-homed SCTP flows in a DIN. This framework can be used for any multi-homed SCTP network, e.g. one transporting SIP or HTTP.
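As a rough illustration of the congestion behaviour analysed in the thesis, here is a toy model of SCTP-style window growth (slow start, then congestion avoidance, halving on loss); the constants and loss pattern are illustrative, not results from the thesis:

```python
# Toy SCTP-style congestion window evolution per round-trip time (RTT):
# exponential growth in slow start, linear growth in congestion
# avoidance, multiplicative decrease on a loss event.

MTU = 1500                   # path MTU in bytes (illustrative)
INITIAL_SSTHRESH = 64 * MTU  # illustrative slow-start threshold

def evolve_cwnd(rtts: int, loss_at: set[int]) -> list[int]:
    cwnd, ssthresh, trace = 4 * MTU, INITIAL_SSTHRESH, []
    for rtt in range(rtts):
        if rtt in loss_at:         # loss: halve the window, keep a floor
            ssthresh = max(cwnd // 2, 4 * MTU)
            cwnd = ssthresh
        elif cwnd < ssthresh:      # slow start: roughly doubles per RTT
            cwnd *= 2
        else:                      # congestion avoidance: ~1 MTU per RTT
            cwnd += MTU
        trace.append(cwnd)
    return trace

trace = evolve_cwnd(rtts=12, loss_at={6})
print([c // MTU for c in trace])  # window in MTUs per RTT
```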