570 research outputs found
Secure Remote Storage of Logs with Search Capabilities
Dissertação de Mestrado em Engenharia InformáticaAlong side with the use of cloud-based services, infrastructure and storage, the use of application logs
in business critical applications is a standard practice nowadays. Such application logs must be stored
in an accessible manner in order to used whenever needed. The debugging of these applications is a
common situation where such access is required. Frequently, part of the information contained in logs
records is sensitive.
This work proposes a new approach of storing critical logs in a cloud-based storage recurring to
searchable encryption, inverted indexing and hash chaining techniques to achieve, in a unified way, the
needed privacy, integrity and authenticity while maintaining server side searching capabilities by the logs
owner.
The designed search algorithm enables conjunctive keywords queries plus a fine-grained search
supported by field searching and nested queries, which are essential in the referred use case. To the
best of our knowledge, the proposed solution is also the first to introduce a query language that enables
complex conjunctive keywords and a fine-grained search backed by field searching and sub queries.A gerac¸ ˜ao de logs em aplicac¸ ˜oes e a sua posterior consulta s˜ao fulcrais para o funcionamento de qualquer
neg´ocio ou empresa. Estes logs podem ser usados para eventuais ac¸ ˜oes de auditoria, uma vez
que estabelecem uma baseline das operac¸ ˜oes realizadas. Servem igualmente o prop´ osito de identificar
erros, facilitar ac¸ ˜oes de debugging e diagnosticar bottlennecks de performance. Tipicamente, a maioria
da informac¸ ˜ao contida nesses logs ´e considerada sens´ıvel.
Quando estes logs s˜ao armazenados in-house, as considerac¸ ˜oes relacionadas com anonimizac¸ ˜ao,
confidencialidade e integridade s˜ao geralmente descartadas. Contudo, com o advento das plataformas
cloud e a transic¸ ˜ao quer das aplicac¸ ˜oes quer dos seus logs para estes ecossistemas, processos de
logging remotos, seguros e confidenciais surgem como um novo desafio. Adicionalmente, regulac¸ ˜ao
como a RGPD, imp˜oe que as instituic¸ ˜oes e empresas garantam o armazenamento seguro dos dados.
A forma mais comum de garantir a confidencialidade consiste na utilizac¸ ˜ao de t ´ecnicas criptogr ´aficas
para cifrar a totalidade dos dados anteriormente `a sua transfer ˆencia para o servidor remoto. Caso sejam
necess´ arias capacidades de pesquisa, a abordagem mais simples ´e a transfer ˆencia de todos os dados
cifrados para o lado do cliente, que proceder´a `a sua decifra e pesquisa sobre os dados decifrados.
Embora esta abordagem garanta a confidencialidade e privacidade dos dados, rapidamente se torna
impratic ´avel com o crescimento normal dos registos de log. Adicionalmente, esta abordagem n˜ao faz
uso do potencial total que a cloud tem para oferecer.
Com base nesta tem´ atica, esta tese prop˜oe o desenvolvimento de uma soluc¸ ˜ao de armazenamento
de logs operacionais de forma confidencial, integra e autˆ entica, fazendo uso das capacidades de armazenamento
e computac¸ ˜ao das plataformas cloud. Adicionalmente, a possibilidade de pesquisa sobre
os dados ´e mantida. Essa pesquisa ´e realizada server-side diretamente sobre os dados cifrados e sem
acesso em momento algum a dados n˜ao cifrados por parte do servidor..
SoK: Cryptographically Protected Database Search
Protected database search systems cryptographically isolate the roles of
reading from, writing to, and administering the database. This separation
limits unnecessary administrator access and protects data in the case of system
breaches. Since protected search was introduced in 2000, the area has grown
rapidly; systems are offered by academia, start-ups, and established companies.
However, there is no best protected search system or set of techniques.
Design of such systems is a balancing act between security, functionality,
performance, and usability. This challenge is made more difficult by ongoing
database specialization, as some users will want the functionality of SQL,
NoSQL, or NewSQL databases. This database evolution will continue, and the
protected search community should be able to quickly provide functionality
consistent with newly invented databases.
At the same time, the community must accurately and clearly characterize the
tradeoffs between different approaches. To address these challenges, we provide
the following contributions:
1) An identification of the important primitive operations across database
paradigms. We find there are a small number of base operations that can be used
and combined to support a large number of database paradigms.
2) An evaluation of the current state of protected search systems in
implementing these base operations. This evaluation describes the main
approaches and tradeoffs for each base operation. Furthermore, it puts
protected search in the context of unprotected search, identifying key gaps in
functionality.
3) An analysis of attacks against protected search for different base
queries.
4) A roadmap and tools for transforming a protected search system into a
protected database, including an open-source performance evaluation platform
and initial user opinions of protected search.Comment: 20 pages, to appear to IEEE Security and Privac
Rich Queries on Encrypted Data: Beyond Exact Matches
We extend the searchable symmetric encryption (SSE) protocol of [Cash et al., Crypto\u2713] adding support for range, substring, wildcard, and phrase queries, in addition to the Boolean queries supported in the original protocol. Our techniques apply to the basic single-client scenario underlying the common SSE setting as well as to the more complex Multi-Client and Outsourced Symmetric PIR extensions of [Jarecki et al., CCS\u2713]. We provide performance information based on our prototype implementation, showing the practicality and scalability of our techniques to very large databases, thus extending the performance results of [Cash et al., NDSS\u2714] to these rich and comprehensive query types
Distributed Search in Semantic Web Service Discovery
This thesis presents a framework for semantic Web Service discovery using descriptive (non-functional) service characteristics in a large-scale, multi-domain setting. The framework uses Web Ontology Language for Services (OWL-S) to design a template for describing non-functional service parameters in a way that facilitates service discovery, and presents a layered scheme for organizing ontologies used in service description. This service description scheme serves as a core for desigining the four main functions of a service directory: a template-based user interface, semantic query expansion algorithms, a two-level indexing scheme that combines Bloom filters with a Distributed Hash Table, and a distributed approach for storing service description. The service directory is, in turn, implemented as an extension of the Open Service Discovery Architecture. The search algorithms presented in this thesis are designed to maximize precision and completeness of service discovery, while the distributed design of the directory allows individual administrative domains to retain a high degree of independence and maintain access control to information about their services
Scalable and approximate privacy-preserving record linkage
Record linkage, the task of linking multiple databases with the aim to identify records
that refer to the same entity, is occurring increasingly in many application areas.
Generally, unique entity identifiers are not available in all the databases to be linked.
Therefore, record linkage requires the use of personal identifying attributes, such as
names and addresses, to identify matching records that need to be reconciled to the
same entity. Often, it is not permissible to exchange personal identifying data across
different organizations due to privacy and confidentiality concerns or regulations.
This has led to the novel research area of privacy-preserving record linkage (PPRL).
PPRL addresses the problem of how to link different databases to identify records
that correspond to the same real-world entities, without revealing the identities of
these entities or any private or confidential information to any party involved in the process, or to any external party, such as a researcher. The three key challenges that a PPRL solution in a real-world context needs to address are (1) scalability to largedatabases by efficiently conducting linkage; (2) achieving high quality of linkage through the use of approximate (string) matching and effective classification of the compared record pairs into matches (i.e. pairs of records that refer to the same entity) and non-matches (i.e. pairs of records that refer to different entities); and (3) provision
of sufficient privacy guarantees such that the interested parties only learn the actual
values of certain attributes of the records that were classified as matches, and the
process is secure with regard to any internal or external adversary.
In this thesis, we present extensive research in PPRL, where we have addressed
several gaps and problems identified in existing PPRL approaches. First, we begin
the thesis with a review of the literature and we propose a taxonomy of PPRL to characterize existing techniques. This allows us to identify gaps and research directions.
In the remainder of the thesis, we address several of the identified shortcomings.
One main shortcoming we address is a framework for empirical and comparative
evaluation of different PPRL solutions, which has not been studied in the literature
so far. Second, we propose several novel algorithms for scalable and approximate
PPRL by addressing the three main challenges of PPRL. We propose efficient private
blocking techniques, for both three-party and two-party scenarios, based on sorted
neighborhood clustering to address the scalability challenge. Following, we propose
two efficient two-party techniques for private matching and classification to address the linkage quality challenge in terms of approximate matching and effective classification. Privacy is addressed in these approaches using efficient data perturbation techniques including k-anonymous mapping, reference values, and Bloom filters.
Finally, the thesis reports on an extensive comparative evaluation of our proposed
solutions with several other state-of-the-art techniques on real-world datasets, which
shows that our solutions outperform others in terms of all three key challenges
Privacy-preserving efficient searchable encryption
Data storage and computation outsourcing to third-party managed data centers,
in environments such as Cloud Computing, is increasingly being adopted
by individuals, organizations, and governments. However, as cloud-based outsourcing
models expand to society-critical data and services, the lack of effective
and independent control over security and privacy conditions in such settings
presents significant challenges.
An interesting solution to these issues is to perform computations on encrypted
data, directly in the outsourcing servers. Such an approach benefits
from not requiring major data transfers and decryptions, increasing performance
and scalability of operations. Searching operations, an important application
case when cloud-backed repositories increase in number and size, are good examples
where security, efficiency, and precision are relevant requisites. Yet existing
proposals for searching encrypted data are still limited from multiple perspectives,
including usability, query expressiveness, and client-side performance and
scalability.
This thesis focuses on the design and evaluation of mechanisms for searching
encrypted data with improved efficiency, scalability, and usability. There are
two particular concerns addressed in the thesis: on one hand, the thesis aims at
supporting multiple media formats, especially text, images, and multimodal data
(i.e. data with multiple media formats simultaneously); on the other hand the
thesis addresses client-side overhead, and how it can be minimized in order to
support client applications executing in both high-performance desktop devices
and resource-constrained mobile devices.
From the research performed to address these issues, three core contributions
were developed and are presented in the thesis: (i) CloudCryptoSearch, a middleware
system for storing and searching text documents with privacy guarantees,
while supporting multiple modes of deployment (user device, local proxy, or computational cloud) and exploring different tradeoffs between security, usability, and performance; (ii) a novel framework for efficiently searching encrypted images
based on IES-CBIR, an Image Encryption Scheme with Content-Based Image
Retrieval properties that we also propose and evaluate; (iii) MIE, a Multimodal
Indexable Encryption distributed middleware that allows storing, sharing, and
searching encrypted multimodal data while minimizing client-side overhead and
supporting both desktop and mobile devices
- …