1,795 research outputs found
On the Evaluation of RDF Distribution Algorithms Implemented over Apache Spark
Querying very large RDF data sets in an efficient manner requires a
sophisticated distribution strategy. Several innovative solutions have recently
been proposed for optimizing data distribution with predefined query workloads.
This paper presents an in-depth analysis and experimental comparison of five
representative and complementary distribution approaches. For achieving fair
experimental results, we are using Apache Spark as a common parallel computing
framework by rewriting the concerned algorithms using the Spark API. Spark
provides guarantees in terms of fault tolerance, high availability and
scalability which are essential in such systems. Our different implementations
aim to highlight the fundamental implementation-independent characteristics of
each approach in terms of data preparation, load balancing, data replication
and to some extent to query answering cost and performance. The presented
measures are obtained by testing each system on one synthetic and one
real-world data set over query workloads with differing characteristics and
different partitioning constraints.Comment: 16 pages, 3 figure
Fairness Testing: Testing Software for Discrimination
This paper defines software fairness and discrimination and develops a
testing-based method for measuring if and how much software discriminates,
focusing on causality in discriminatory behavior. Evidence of software
discrimination has been found in modern software systems that recommend
criminal sentences, grant access to financial products, and determine who is
allowed to participate in promotions. Our approach, Themis, generates efficient
test suites to measure discrimination. Given a schema describing valid system
inputs, Themis generates discrimination tests automatically and does not
require an oracle. We evaluate Themis on 20 software systems, 12 of which come
from prior work with explicit focus on avoiding discrimination. We find that
(1) Themis is effective at discovering software discrimination, (2)
state-of-the-art techniques for removing discrimination from algorithms fail in
many situations, at times discriminating against as much as 98% of an input
subdomain, (3) Themis optimizations are effective at producing efficient test
suites for measuring discrimination, and (4) Themis is more efficient on
systems that exhibit more discrimination. We thus demonstrate that fairness
testing is a critical aspect of the software development cycle in domains with
possible discrimination and provide initial tools for measuring software
discrimination.Comment: Sainyam Galhotra, Yuriy Brun, and Alexandra Meliou. 2017. Fairness
Testing: Testing Software for Discrimination. In Proceedings of 2017 11th
Joint Meeting of the European Software Engineering Conference and the ACM
SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE),
Paderborn, Germany, September 4-8, 2017 (ESEC/FSE'17).
https://doi.org/10.1145/3106237.3106277, ESEC/FSE, 201
Modeling views in the layered view model for XML using UML
In data engineering, view formalisms are used to provide flexibility to users and user applications by allowing them to extract and elaborate data from the stored data sources. Conversely, since the introduction of Extensible Markup Language (XML), it is fast emerging as the dominant standard for storing, describing, and interchanging data among various web and heterogeneous data sources. In combination with XML Schema, XML provides rich facilities for defining and constraining user-defined data semantics and properties, a feature that is unique to XML. In this context, it is interesting to investigate traditional database features, such as view models and view design techniques for XML. However, traditional view formalisms are strongly coupled to the data language and its syntax, thus it proves to be a difficult task to support views in the case of semi-structured data models. Therefore, in this paper we propose a Layered View Model (LVM) for XML with conceptual and schemata extensions. Here our work is three-fold; first we propose an approach to separate the implementation and conceptual aspects of the views that provides a clear separation of concerns, thus, allowing analysis and design of views to be separated from their implementation. Secondly, we define representations to express and construct these views at the conceptual level. Thirdly, we define a view transformation methodology for XML views in the LVM, which carries out automated transformation to a view schema and a view query expression in an appropriate query language. Also, to validate and apply the LVM concepts, methods and transformations developed, we propose a view-driven application development framework with the flexibility to develop web and database applications for XML, at varying levels of abstraction
- …