164 research outputs found
Efficient and secure document similarity search cloud utilizing mapreduce
Document similarity has important real life applications such as finding duplicate web sites and identifying plagiarism. While the basic techniques such as k-similarity algorithms have been long known, overwhelming amount of data, being collected such as in big data setting, calls for novel algorithms to find highly similar documents in reasonably short amount of time. In particular, pairwise comparison of documents sharing a common feature, necessitates prohibitively high storage and computation power. The wide spread availability of cloud computing provides users easy access to high storage and processing power. Furthermore, outsourcing their data to the cloud guarantees reliability and availability for their data while privacy and security concerns are not always properly addressed. This leads to the problem of protecting the privacy of sensitive data against adversaries including the cloud operator. Generally, traditional document similarity algorithms tend to compare all the documents in a data set sharing same terms (words) with query document. In our work, we propose a new filtering technique that works on plaintext data, which decreases the number of comparisons between the query set and the search set to find highly similar documents. The technique, referred as ZOLIP algorithm, is efficient and scalable, but does not provide security. We also design and implement three secure similarity search algorithms for text documents, namely Secure Sketch Search, Secure Minhash Search and Secure ZOLIP. The first algorithm utilizes locality sensitive hashing techniques and cosine similarity. While the second algorithm uses the Minhash Algorithm, the last one uses the encrypted ZOLIP Signature, which is the secure version of the ZOLIP algorithm. We utilize the Hadoop distributed file system and the MapReduce parallel programming model to scale our techniques to big data setting. Our experimental results on real data show that some of the proposed methods perform better than the previous work in the literature in terms of the number of joins, and therefore, speed
A Framework for Uncertain Cloud Data Security and Recovery Based on Hybrid Multi-User Medical Decision Learning Patterns
Machine learning has been supporting real-time cloud based medical computing systems. However, most of the computing servers are independent of data security and recovery scheme in multiple virtual machines due to high computing cost and time. Also, this cloud based medical applications require static security parameters for cloud data security. Cloud based medical applications require multiple servers to store medical records or machine learning patterns for decision making. Due to high Uncertain computational memory and time, these cloud systems require an efficient data security framework to provide strong data access control among the multiple users. In this work, a hybrid cloud data security framework is developed to improve the data security on the large machine learning patterns in real-time cloud computing environment. This work is implemented in two phases’ i.e. data replication phase and multi-user data access security phase. Initially, machine decision patterns are replicated among the multiple servers for Uncertain data recovering phase. In the multi-access cloud data security framework, a hybrid multi-access key based data encryption and decryption model is implemented on the large machine learning medical patterns for data recovery and security process. Experimental results proved that the present two-phase data recovering, and security framework has better computational efficiency than the conventional approaches on large medical decision patterns
On the security of NoSQL cloud database services
Processing a vast volume of data generated by web, mobile and Internet-enabled devices, necessitates a scalable and flexible data management system. Database-as-a-Service (DBaaS) is a new cloud computing paradigm, promising a cost-effective and scalable, fully-managed database functionality meeting the requirements of online data processing. Although DBaaS offers many benefits it also introduces new threats and vulnerabilities. While many traditional data processing threats remain, DBaaS introduces new challenges such as confidentiality violation and information leakage in the presence of privileged malicious insiders and adds new dimension to the data security. We address the problem of building a secure DBaaS for a public cloud infrastructure where, the Cloud Service Provider (CSP) is not completely trusted by the data owner. We present a high level description of several architectures combining modern cryptographic primitives for achieving this goal. A novel searchable security scheme is proposed to leverage secure query processing in presence of a malicious cloud insider without disclosing sensitive information. A holistic database security scheme comprised of data confidentiality and information leakage prevention is proposed in this dissertation. The main contributions of our work are: (i) A searchable security scheme for non-relational databases of the cloud DBaaS; (ii) Leakage minimization in the untrusted cloud. The analysis of experiments that employ a set of established cryptographic techniques to protect databases and minimize information leakage, proves that the performance of the proposed solution is bounded by communication cost rather than by the cryptographic computational effort
TREDIS – A Trusted Full-Fledged SGX-Enabled REDIS Solution
Currently, offloading storage and processing capacity to cloud servers is a growing
trend among web-enabled services managing big datasets. This happens because high
storage capacity and powerful processors are expensive, whilst cloud services provide
cheaper, ongoing, elastic, and reliable solutions. The problem with this cloud-based out sourced solutions are that they are highly accessible through the Internet, which is good,
but therefore can be considerably exposed to attacks, out of users’ control. By exploring
subtle vulnerabilities present in cloud-enabled applications, management functions, op erating systems and hypervisors, an attacker may compromise the supported systems,
thus compromising the privacy of sensitive user data hosted and managed in it. These
attacks can be motivated by malicious purposes such as espionage, blackmail, identity
theft, or harassment. A solution to this problem is processing data without exposing it to
untrusted components, such as vulnerable OS components, which might be compromised
by an attacker.
In this thesis, we do a research on existent technologies capable of enabling appli cations to trusted environments, in order to adopt such approaches to our solution as a
way to help deploy unmodified applications on top of Intel-SGX, with overheads com parable to applications designed to use this kind of technology, and also conducting an
experimental evaluation to better understand how they impact our system. Thus, we
present TREDIS - a Trusted Full-Fledged REDIS Key-Value Store solution, implemented
as a full-fledged solution to be offered as a Trusted Cloud-enabled Platform as a Service,
which includes the possibility to support a secure REDIS-cluster architecture supported
by docker-virtualized services running in SGX-enabled instances, with operations run ning on always-encrypted in-memory datasets.A transição de suporte de aplicações com armazenamento e processamento em servidores
cloud é uma tendência que tem vindo a aumentar, principalmente quando se precisam
de gerir grandes conjuntos de dados. Comparativamente a soluções com licenciamento
privado, as soluções de computação e armazenamento de dados em nuvens de serviços
são capazes de oferecer opções mais baratas, de alta disponibilidade, elásticas e relativa mente confiáveis. Estas soluções fornecidas por terceiros são facilmente acessíveis através
da Internet, sendo operadas em regime de outsourcing da sua operação, o que é bom, mas
que por isso ficam consideravelmente expostos a ataques e fora do controle dos utiliza dores em relação às reais condições de confiabilidade, segurança e privacidade de dados.
Ao explorar subtilmente vulnerabilidades presentes nas aplicações, funções de sistemas
operativos (SOs), bibliotecas de virtualização de serviços de SOs ou hipervisores, um ata cante pode comprometer os sistemas e quebrar a privacidade de dados sensíveis. Estes
ataques podem ser motivados por fins maliciosos como espionagem, chantagem, roubo
de identidade ou assédio e podem ser desencadeados por intrusões (a partir de atacantes
externos) ou por ações maliciosas ou incorretas de atacantes internos (podendo estes atuar
com privilégios de administradores de sistemas). Uma solução para este problema passa
por armazenar e processar a informação sem que existam exposições face a componentes
não confiáveis.
Nesta dissertação estudamos e avaliamos experimentalmente diversas tecnologias que
permitem a execução de aplicações com isolamento em ambientes de execução confiá vel suportados em hardware Intel-SGX, de modo a perceber melhor como funcionam e
como adaptá-las à nossa solução. Para isso, realizámos uma avaliação focada na utilização
dessas tecnologias com virtualização em contentores isolados executando em hardware
confiável, que usámos na concepção da nossa solução. Posto isto, apresentamos a nossa
solução TREDIS - um sistema Key-Value Store confiável baseado em tecnologia REDIS,
com garantias de integridade da execução e de privacidade de dados, concebida para
ser usada como uma "Plataforma como Serviço"para gestão e armazenamento resiliente
de dados na nuvem. Isto inclui a possibilidade de suportar uma arquitetura segura com
garantias de resiliência semelhantes à arquitetura de replicação em cluster na solução
original REDIS, mas em que os motores de execução de nós e a proteção de memória
do cluster é baseado em contentores docker isolados e virtualizados em instâncias SGX, sendo os dados mantidos sempre cifrados em memória
Literature Survey of Big Data
Mention the topic of big data, and a person is bound to experience information overload. Indeed, it is so complex with so many terms and details that people want to run away from it. When used right, big data (BD) will help people access data they need in in real time and help managers make better decisions. The purpose of this paper is to evaluate methods, procedures, and architectures for the storage and retrieval of all Federal Aviation Administration (FAA) research, engineering, and development (RE&D) data sets, to leverage on the technology innovation and advancement opportunities in the field of BD analytics. The paper also discusses all relevant Executive Orders (EOs), laws, and Office of Management and Budget (OMB) memorandums that were written to address what federal agencies under the OMB\u2019s jurisdiction must do to comply with various aspects of BD
New Security Definitions, Constructions and Applications of Proxy Re-Encryption
La externalización de la gestión de la información es una práctica cada vez más común, siendo la computación en la nube (en inglés, cloud computing) el paradigma más representativo. Sin embargo, este enfoque genera también preocupación con respecto a la seguridad y privacidad debido a la inherente pérdida del control sobre los datos. Las soluciones tradicionales, principalmente basadas en la aplicación de políticas y estrategias de control de acceso, solo reducen el problema a una cuestión de confianza, que puede romperse fácilmente por los proveedores de servicio, tanto de forma accidental como intencionada. Por lo tanto, proteger la información externalizada, y al mismo tiempo, reducir la confianza que es necesario establecer con los proveedores de servicio, se convierte en un objetivo inmediato. Las soluciones basadas en criptografía son un mecanismo crucial de cara a este fin.
Esta tesis está dedicada al estudio de un criptosistema llamado recifrado delegado (en inglés, proxy re-encryption), que constituye una solución práctica a este problema, tanto desde el punto de vista funcional como de eficiencia. El recifrado delegado es un tipo de cifrado de clave pública que permite delegar en una entidad la capacidad de transformar textos cifrados de una clave pública a otra, sin que pueda obtener ninguna información sobre el mensaje subyacente. Desde un punto de vista funcional, el recifrado delegado puede verse como un medio de delegación segura de acceso a información cifrada, por lo que representa un candidato natural para construir mecanismos de control de acceso criptográficos. Aparte de esto, este tipo de cifrado es, en sí mismo, de gran interés teórico, ya que sus definiciones de seguridad deben balancear al mismo tiempo la seguridad de los textos cifrados con la posibilidad de transformarlos mediante el recifrado, lo que supone una estimulante dicotomía.
Las contribuciones de esta tesis siguen un enfoque transversal, ya que van desde las propias definiciones de seguridad del recifrado delegado, hasta los detalles específicos de potenciales aplicaciones, pasando por construcciones concretas
- …