GeoYCSB: A Benchmark Framework for the Performance and Scalability Evaluation of Geospatial NoSQL Databases
The proliferation of geospatial applications has tremendously increased the variety, velocity, and volume of spatial data that data stores have to manage. Traditional relational databases reveal limitations in handling such big geospatial data, mainly due to their rigid schema requirements and limited scalability. Numerous NoSQL databases have emerged and actively serve as alternative data stores for big spatial data. This study presents a framework, called GeoYCSB, developed for benchmarking NoSQL databases with geospatial workloads. To develop GeoYCSB, we extend YCSB, a de facto benchmark framework for NoSQL systems, by integrating into its design architecture the new components necessary to support geospatial workloads. GeoYCSB supports both microbenchmarks and macrobenchmarks and facilitates the use of real datasets in both. It is extensible to evaluate any NoSQL database that supports spatial queries, using geospatial workloads performed on datasets of any geometric complexity. We use GeoYCSB to benchmark two leading document stores, MongoDB and Couchbase, and present the experimental results and analysis. Finally, we demonstrate the extensibility of GeoYCSB by including a new dataset consisting of complex geometries and using it to benchmark a system with a wide variety of geospatial queries: Apache Accumulo, a wide-column store, with the GeoMesa framework applied on top.
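To make the workload concrete, here is a minimal sketch (in Python with pymongo, not GeoYCSB's own YCSB-based code) of the kind of spatial range query such a benchmark might issue against MongoDB; the database and collection names and the polygon coordinates are illustrative assumptions.

```python
# Minimal sketch (illustrative, not GeoYCSB itself) of a spatial range query
# a geospatial workload might issue against MongoDB. Assumes GeoJSON documents
# with a "location" field; database/collection names are hypothetical.
from pymongo import MongoClient, GEOSPHERE

client = MongoClient("mongodb://localhost:27017")
places = client["benchmark"]["places"]
places.create_index([("location", GEOSPHERE)])  # 2dsphere index enables $geoWithin

# Query: all documents whose geometry lies inside a bounding polygon.
polygon = {
    "type": "Polygon",
    "coordinates": [[
        [-97.0, 32.5], [-96.5, 32.5], [-96.5, 33.0],
        [-97.0, 33.0], [-97.0, 32.5],
    ]],
}
for doc in places.find({"location": {"$geoWithin": {"$geometry": polygon}}}):
    print(doc["_id"])
```

In a benchmark setting, the same query shape would be issued repeatedly with varied polygons to measure throughput and latency.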
An investigation into the cultural and legal factors influencing the differential prosecution rate for female genital mutilation in England and France
Female Genital Mutilation (FGM) is a problem that both England and France face. Both countries agree that FGM is a criminal offence and that it constitutes child abuse. Accordingly, each nation has taken its own distinct measures in law and policy against the practice. These approaches have produced significantly divergent outcomes, particularly in the prosecution rates of offenders, with France leading in that regard.
This thesis seeks to understand why criminal justice outcomes differ so significantly between the two nations, despite many parallels between the historical and contemporary contexts of these two Western European neighbours. In order to do this, it explores the overarching, systemic forces at play within both paradigms, which the author has termed “the Medium”. Furthermore, given that FGM within both France and England is a product of migrant communities having transported cultural practices into their new context, particular attention is paid to approaches to multiculturalism as a key aspect of the Medium for the purposes of this study. Alongside this examination of the Medium, however, the study also explores the role of individual activism and the agency of particular campaigners, termed “the Human Catalyst”. It addresses the complex interplay between the Medium and the Human Catalyst as a means of understanding their combined influence on the divergent outcomes in respect of prosecuting FGM.
Approximate Computing Survey, Part I: Terminology and Software & Hardware Approximation Techniques
The rapid growth of demanding applications in domains applying multimedia processing and machine learning has marked a new era for edge and cloud computing. These applications involve massive data and compute-intensive tasks, and thus typical computing paradigms in embedded systems and data centers are stressed to meet the worldwide demand for high performance. Concurrently, the landscape of the semiconductor field over the last 15 years has established power as a first-class design concern. As a result, the computing-systems community is forced to find alternative design approaches that facilitate high-performance and/or power-efficient computing. Among the examined solutions, Approximate Computing has attracted ever-increasing interest, with research works applying approximations across the entire traditional computing stack, i.e., at the software, hardware, and architectural levels. Over the last decade, a plethora of approximation techniques has emerged in software (programs, frameworks, compilers, runtimes, languages), hardware (circuits, accelerators), and architectures (processors, memories). The current article is Part I of our comprehensive survey on Approximate Computing: it reviews its motivation, terminology, and principles, and it classifies and presents the technical details of state-of-the-art software and hardware approximation techniques.
Comment: Under review at ACM Computing Surveys
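As a concrete taste of one software-level technique such surveys classify, here is a toy Python sketch of loop perforation; the function names and stride values are illustrative, not drawn from the paper.

```python
# Toy illustration of loop perforation, a classic software approximation:
# visit only every `stride`-th element and accept the resulting error in
# exchange for roughly a `stride`-fold reduction in work. Illustrative only.
def exact_mean(xs):
    return sum(xs) / len(xs)

def perforated_mean(xs, stride=4):
    sample = xs[::stride]          # skip (stride - 1) of every stride iterations
    return sum(sample) / len(sample)

data = [float(i % 100) for i in range(1_000_000)]
print(exact_mean(data))            # exact result
print(perforated_mean(data, 8))    # approximate result at ~1/8 of the cost
```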
A conceptual framework for developing dashboards for big mobility data
Dashboards are an increasingly popular form of data visualization. Large, complex, and dynamic mobility data present a number of challenges in dashboard design. The overall aim of dashboard design is to improve information communication and decision making, though big mobility data in particular require that privacy be considered alongside size and complexity. Taking these issues into account, a gap remains between wrangling mobility data and developing meaningful dashboard output. There is therefore a need for a framework that bridges this gap to support the mobility dashboard development and design process. In this paper we outline a conceptual framework for mobility data dashboards that guides the development process while considering mobility data structure, volume, complexity, varied application contexts, and privacy constraints. We illustrate the proposed framework’s components and process using example mobility dashboards with varied inputs, end-users, and objectives. Overall, the framework offers a basis for developers to understand how informational displays of big mobility data are determined by end-user needs as well as by the types of data selection, transformation, and display available to particular mobility datasets.
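As one illustration of the transformation step between wrangling and display, the following Python sketch bins raw trajectory points into coarse grid cells and suppresses sparse cells before they would reach a dashboard; the cell size and suppression threshold are illustrative assumptions, not values from the paper.

```python
# Illustrative transformation step for a mobility dashboard: bin raw
# (lat, lon) points into coarse grid cells and suppress cells with fewer
# than k observations before display. Cell size and k are assumptions.
from collections import Counter

def grid_cell(lat, lon, cell=0.01):
    # Snap a coordinate to a coarse grid (0.01 degrees is roughly 1 km
    # of latitude).
    return (round(lat / cell) * cell, round(lon / cell) * cell)

def aggregate_for_display(points, k=5):
    counts = Counter(grid_cell(lat, lon) for lat, lon in points)
    # Dropping sparse cells keeps individual trips from being singled
    # out in the rendered heat map.
    return {cell: n for cell, n in counts.items() if n >= k}

points = [(52.3700 + i * 1e-4, 4.8950) for i in range(20)]
print(aggregate_for_display(points))
```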
Verifiable Differential Privacy
Differential Privacy (DP) is often presented as a strong privacy-enhancing technology with broad applicability and advocated as a de facto standard for releasing aggregate statistics on sensitive data. However, in many embodiments, DP introduces a new attack surface: a malicious entity entrusted with releasing statistics could manipulate the results and use the randomness of DP as a convenient smokescreen to mask its nefariousness. Since revealing the random noise would obviate the purpose of introducing it, the miscreant may have a perfect alibi. To close this loophole, we introduce the idea of Verifiable Differential Privacy, which requires the publishing entity to output a zero-knowledge proof that convinces an efficient verifier that the output is both DP and reliable. Such a definition might seem unachievable, as a verifier must validate that DP randomness was generated faithfully without learning anything about the randomness itself. We resolve this paradox by carefully mixing private and public randomness to compute verifiable DP counting queries with theoretical guarantees and show that it is also practical for real-world deployment. We also demonstrate that computational assumptions are necessary by showing a separation between information-theoretic DP and computational DP under our definition of verifiability.
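For orientation, the sketch below shows a plain (non-verifiable) epsilon-DP counting query in Python using the Laplace mechanism; the paper's actual contribution, a zero-knowledge proof that such noise was sampled honestly from a mix of private and public randomness, is not reproduced here, and the epsilon value and predicate are illustrative.

```python
# Plain (non-verifiable) epsilon-DP counting query via the Laplace mechanism.
# The zero-knowledge proof of honest noise generation that the paper adds
# is NOT sketched here; epsilon and the predicate are illustrative.
import math
import random

def laplace(scale):
    # Inverse-CDF sampling of Laplace(0, scale).
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def dp_count(records, predicate, epsilon=1.0):
    true_count = sum(1 for r in records if predicate(r))
    # A counting query has sensitivity 1: adding or removing one record
    # changes the count by at most 1, so Laplace(1/epsilon) noise suffices.
    return true_count + laplace(1.0 / epsilon)

ages = [23, 37, 41, 29, 65, 52]
print(dp_count(ages, lambda a: a >= 40, epsilon=0.5))
```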
Biosimilars in Europe
This reprint examines regulatory, pricing and reimbursement issues related to the market access and uptake of off-patent biologics, biosimilars, next-generation biologics and competing innovative medicines in European countries.
Secure Computations in Opportunistic Networks: An Edgelet Demonstration with a Medical Use-Case
In this demonstration paper, we leverage the current convergence between Trusted Execution Environments and Opportunistic Networks to perform secure and privacy-preserving computations on personal devices. We call this convergence Edgelet computing and show that it can drastically change the way distributed processing over personal data is conceived. We demonstrate the pertinence of the approach through a real medical use-case being deployed in the field.
The Metaverse: Survey, Trends, Novel Pipeline Ecosystem & Future Directions
The Metaverse offers a second world beyond reality, where boundaries are non-existent and possibilities are endless through engagement and immersive experiences using virtual reality (VR) technology. Many disciplines can benefit from the advancement of the Metaverse when accurately developed, including the fields of technology, gaming, education, art, and culture. Nevertheless, developing the Metaverse environment to its full potential is an ambiguous task that needs proper guidance and directions. Existing surveys on the Metaverse focus only on a specific aspect and discipline of the Metaverse and lack a holistic view of the entire process. To this end, a more holistic, multi-disciplinary, in-depth, and academic and industry-oriented review is required to provide a thorough study of the Metaverse development pipeline. To address these issues, we present in this survey a novel multi-layered pipeline ecosystem composed of (1) the Metaverse computing, networking, communications, and hardware infrastructure, (2) environment digitization, and (3) user interactions. For every layer, we discuss the components that detail the steps of its development. For each of these components, we also examine the impact of a set of enabling technologies and empowering domains (e.g., Artificial Intelligence, Security & Privacy, Blockchain, Business, Ethics, and Social) on its advancement. In addition, we explain the importance of these technologies in supporting decentralization, interoperability, user experiences, interactions, and monetization. Our study highlights the existing challenges for each component, followed by research directions and potential solutions. To the best of our knowledge, this survey is the most comprehensive to date and allows users, scholars, and entrepreneurs to gain an in-depth understanding of the Metaverse ecosystem and find their opportunities and potential for contribution.
PreFair: Privately Generating Justifiably Fair Synthetic Data
When a database is protected by Differential Privacy (DP), its usability is limited in scope. In this scenario, generating a synthetic version of the data that mimics the properties of the private data allows users to perform any operation on the synthetic data while maintaining the privacy of the original data. Therefore, multiple works have been devoted to devising systems for DP synthetic data generation. However, such systems may preserve or even magnify properties of the data that make it unfair, rendering the synthetic data unfit for use. In this work, we present PreFair, a system that allows for DP fair synthetic data generation. PreFair extends state-of-the-art DP data generation mechanisms by incorporating a causal fairness criterion that ensures fair synthetic data. We adapt the notion of justifiable fairness to fit the synthetic data generation scenario. We further study the problem of generating DP fair synthetic data, showing its intractability and designing algorithms that are optimal under certain assumptions. We also provide an extensive experimental evaluation, showing that PreFair generates synthetic data that is significantly fairer than the data generated by leading DP data generation mechanisms, while remaining faithful to the private data.
Comment: 15 pages, 11 figures
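To fix ideas about the baseline such systems build on, here is a minimal Python sketch of DP synthetic data generation from a single noisy marginal: measure a Laplace-noised histogram, clamp and renormalize it, and sample synthetic records. PreFair's causal fairness criterion and algorithms are not reproduced, and all names and parameters are illustrative.

```python
# Minimal sketch of the DP synthetic-data baseline: measure a Laplace-noised
# histogram (sensitivity 1), clamp negatives, and sample synthetic records
# from the resulting weights. PreFair's causal fairness machinery is not
# reproduced; all names and parameters are illustrative.
import math
import random

def laplace(scale):
    # Inverse-CDF sampling of Laplace(0, scale).
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def dp_synthetic(records, domain, epsilon=1.0, n_out=1000):
    # Each record falls in exactly one bin, so the histogram has
    # sensitivity 1 and Laplace(1/epsilon) noise gives epsilon-DP.
    noisy = {v: sum(1 for r in records if r == v) + laplace(1.0 / epsilon)
             for v in domain}
    weights = [max(noisy[v], 0.0) for v in domain]  # clamp negative counts
    return random.choices(list(domain), weights=weights, k=n_out)

private = ["A"] * 70 + ["B"] * 25 + ["C"] * 5
synthetic = dp_synthetic(private, domain=["A", "B", "C"], epsilon=0.5)
print(synthetic[:10])
```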