1,653 research outputs found
Modular P2P-Based Approach for RDF Data Storage and Retrieval
One of the key elements of the Semantic Web is the Resource Description Framework (RDF). Efficient storage and retrieval of RDF data in large-scale settings is still challenging, and existing solutions are monolithic and thus not very flexible from a software engineering point of view. In this paper, we propose a modular system, based on the scalable Content-Addressable Network (CAN), which makes it possible to store and retrieve RDF data in large-scale settings. We identified and isolated the key components forming such a system in our design architecture. We have evaluated our system using the Grid'5000 testbed over 300 peers on 75 machines, and the outcome of these micro-benchmarks shows interesting results in terms of scalability and concurrent queries.
SmartORC: smart orchestration of resources in the compute continuum
The promise of the compute continuum is to present applications with a flexible and transparent view of the resources in the Internet of Things–Edge–Cloud ecosystem. However, such a promise requires tackling complex challenges to maximize the benefits of both the cloud and the edge. Challenges include managing a highly distributed platform, matching services and resources, harnessing resource heterogeneity, and adapting the deployment of services to the changes in resources and applications. In this study, we present SmartORC, a comprehensive set of components designed to provide a complete framework for managing resources and applications in the Compute Continuum. Along with the description of all the SmartORC subcomponents, we have also provided the results of an evaluation aimed at showcasing the framework's capability
Strange bedfellows? Keyword and conceptual search unite to make sense of relevant ESI in electronic discovery
In the brief history of electronic discovery, the latter part of the twentieth century witnessed the
demise of paper by a digital hero that emancipated the content of paper documents with OCR
and TIFF. This technology added a third dimension to the realm of 2D paper document review
and production that led to a sea change in discovery methods. By many accounts what we have
before us is a three-stage evolution from paper to digital to clustering in order to overcome the
problems of volume and complexity of ESI. The intent of this position paper is to describe the
development of the digital hero and methodology that is emancipating the content and context of
ESI – conceptual search that spans file formats, languages and technique, and includes keyword
search on a common, shared index
Large Scale Hierarchical K-Means Based Image Retrieval With MapReduce
Image retrieval remains one of the most heavily researched areas in Computer Vision. Image retrieval methods have been used in autonomous vehicle localization research, object recognition applications, and commercially in projects such as Google Glass. Current methods for image retrieval become problematic when implemented on image datasets that can easily reach billions of images. In order to process these growing datasets, we distribute the necessary computation for image retrieval among a cluster of machines using Apache Hadoop. While there are many techniques for image retrieval, we focus on systems that use Hierarchical K-Means Trees. Successful image retrieval systems based on Hierarchical K-Means Trees have been built using the tree as a Visual Vocabulary to build an Inverted File Index and implementing a Bag of Words retrieval approach, or by building the tree as a Full Representation of every image in the database and implementing a K-Nearest Neighbor voting scheme for retrieval. Both approaches involve different levels of approximation, and each has strengths and weaknesses that must be weighed in accordance with the needs of the application. Both approaches are implemented with MapReduce, for the first time, and compared in terms of image retrieval precision, index creation run-time, and image retrieval throughput. Experiments that include up to 2 million images running on 20 virtual machines are shown
A Modular Design for Geo-Distributed Querying: Work in Progress Report
Most distributed storage systems provide limited abilities for querying data by attributes other than their primary keys. Supporting efficient search on secondary attributes is challenging, as applications pose varying requirements to query processing systems, and no single system design can be suitable for all needs. In this paper, we show how to overcome these challenges in order to extend distributed data stores to support queries on secondary attributes. We propose a modular architecture that is flexible and allows query processing systems to make trade-offs according to different use case requirements. We describe adaptive mechanisms that make use of this flexibility to enable query processing systems to dynamically adjust to query and write operation workloads.
1st INCF Workshop on Sustainability of Neuroscience Databases
The goal of the workshop was to discuss issues related to the sustainability of neuroscience databases, identify problems and propose solutions, and formulate recommendations to the INCF. The report summarizes the discussions of invited participants from the neuroinformatics community as well as from other disciplines where sustainability issues have already been approached. The recommendations for the INCF involve rating, ranking, and supporting database sustainability
Privacy-preserving efficient searchable encryption
Data storage and computation outsourcing to third-party managed data centers,
in environments such as Cloud Computing, is increasingly being adopted
by individuals, organizations, and governments. However, as cloud-based outsourcing
models expand to society-critical data and services, the lack of effective
and independent control over security and privacy conditions in such settings
presents significant challenges.
An interesting solution to these issues is to perform computations on encrypted
data, directly in the outsourcing servers. Such an approach benefits
from not requiring major data transfers and decryptions, increasing performance
and scalability of operations. Searching operations, an important application
case when cloud-backed repositories increase in number and size, are good examples
where security, efficiency, and precision are relevant requisites. Yet existing
proposals for searching encrypted data are still limited from multiple perspectives,
including usability, query expressiveness, and client-side performance and
scalability.
This thesis focuses on the design and evaluation of mechanisms for searching
encrypted data with improved efficiency, scalability, and usability. There are
two particular concerns addressed in the thesis: on one hand, the thesis aims at
supporting multiple media formats, especially text, images, and multimodal data
(i.e. data with multiple media formats simultaneously); on the other hand the
thesis addresses client-side overhead, and how it can be minimized in order to
support client applications executing in both high-performance desktop devices
and resource-constrained mobile devices.
From the research performed to address these issues, three core contributions
were developed and are presented in the thesis: (i) CloudCryptoSearch, a middleware
system for storing and searching text documents with privacy guarantees,
while supporting multiple modes of deployment (user device, local proxy, or
computational cloud) and exploring different tradeoffs between security,
usability, and performance; (ii) a novel framework for efficiently searching
encrypted images
based on IES-CBIR, an Image Encryption Scheme with Content-Based Image
Retrieval properties that we also propose and evaluate; (iii) MIE, a Multimodal
Indexable Encryption distributed middleware that allows storing, sharing, and
searching encrypted multimodal data while minimizing client-side overhead and
supporting both desktop and mobile devices
Digital Image Access & Retrieval
The 33rd Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March of 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. Papers covered three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation, with the bulk of the conference focusing on indexing and retrieval.