164 research outputs found

    Efficient and secure document similarity search cloud utilizing mapreduce

    Get PDF
    Document similarity has important real life applications such as finding duplicate web sites and identifying plagiarism. While the basic techniques such as k-similarity algorithms have been long known, overwhelming amount of data, being collected such as in big data setting, calls for novel algorithms to find highly similar documents in reasonably short amount of time. In particular, pairwise comparison of documents sharing a common feature, necessitates prohibitively high storage and computation power. The wide spread availability of cloud computing provides users easy access to high storage and processing power. Furthermore, outsourcing their data to the cloud guarantees reliability and availability for their data while privacy and security concerns are not always properly addressed. This leads to the problem of protecting the privacy of sensitive data against adversaries including the cloud operator. Generally, traditional document similarity algorithms tend to compare all the documents in a data set sharing same terms (words) with query document. In our work, we propose a new filtering technique that works on plaintext data, which decreases the number of comparisons between the query set and the search set to find highly similar documents. The technique, referred as ZOLIP algorithm, is efficient and scalable, but does not provide security. We also design and implement three secure similarity search algorithms for text documents, namely Secure Sketch Search, Secure Minhash Search and Secure ZOLIP. The first algorithm utilizes locality sensitive hashing techniques and cosine similarity. While the second algorithm uses the Minhash Algorithm, the last one uses the encrypted ZOLIP Signature, which is the secure version of the ZOLIP algorithm. We utilize the Hadoop distributed file system and the MapReduce parallel programming model to scale our techniques to big data setting. Our experimental results on real data show that some of the proposed methods perform better than the previous work in the literature in terms of the number of joins, and therefore, speed

    A Framework for Uncertain Cloud Data Security and Recovery Based on Hybrid Multi-User Medical Decision Learning Patterns

    Get PDF
    Machine learning has been supporting real-time cloud based medical computing systems. However, most of the computing servers are independent of data security and recovery scheme in multiple virtual machines due to high computing cost and time. Also, this cloud based medical applications require static security parameters for cloud data security. Cloud based medical applications require multiple servers to store medical records or machine learning patterns for decision making. Due to high Uncertain computational memory and time, these cloud systems require an efficient data security framework to provide strong data access control among the multiple users. In this work, a hybrid cloud data security framework is developed to improve the data security on the large machine learning patterns in real-time cloud computing environment. This work is implemented in two phases’ i.e. data replication phase and multi-user data access security phase. Initially, machine decision patterns are replicated among the multiple servers for Uncertain data recovering phase. In the multi-access cloud data security framework, a hybrid multi-access key based data encryption and decryption model is implemented on the large machine learning medical patterns for data recovery and security process. Experimental results proved that the present two-phase data recovering, and security framework has better computational efficiency than the conventional approaches on large medical decision patterns

    On the security of NoSQL cloud database services

    Get PDF
    Processing a vast volume of data generated by web, mobile and Internet-enabled devices, necessitates a scalable and flexible data management system. Database-as-a-Service (DBaaS) is a new cloud computing paradigm, promising a cost-effective and scalable, fully-managed database functionality meeting the requirements of online data processing. Although DBaaS offers many benefits it also introduces new threats and vulnerabilities. While many traditional data processing threats remain, DBaaS introduces new challenges such as confidentiality violation and information leakage in the presence of privileged malicious insiders and adds new dimension to the data security. We address the problem of building a secure DBaaS for a public cloud infrastructure where, the Cloud Service Provider (CSP) is not completely trusted by the data owner. We present a high level description of several architectures combining modern cryptographic primitives for achieving this goal. A novel searchable security scheme is proposed to leverage secure query processing in presence of a malicious cloud insider without disclosing sensitive information. A holistic database security scheme comprised of data confidentiality and information leakage prevention is proposed in this dissertation. The main contributions of our work are: (i) A searchable security scheme for non-relational databases of the cloud DBaaS; (ii) Leakage minimization in the untrusted cloud. The analysis of experiments that employ a set of established cryptographic techniques to protect databases and minimize information leakage, proves that the performance of the proposed solution is bounded by communication cost rather than by the cryptographic computational effort

    TREDIS – A Trusted Full-Fledged SGX-Enabled REDIS Solution

    Get PDF
    Currently, offloading storage and processing capacity to cloud servers is a growing trend among web-enabled services managing big datasets. This happens because high storage capacity and powerful processors are expensive, whilst cloud services provide cheaper, ongoing, elastic, and reliable solutions. The problem with this cloud-based out sourced solutions are that they are highly accessible through the Internet, which is good, but therefore can be considerably exposed to attacks, out of users’ control. By exploring subtle vulnerabilities present in cloud-enabled applications, management functions, op erating systems and hypervisors, an attacker may compromise the supported systems, thus compromising the privacy of sensitive user data hosted and managed in it. These attacks can be motivated by malicious purposes such as espionage, blackmail, identity theft, or harassment. A solution to this problem is processing data without exposing it to untrusted components, such as vulnerable OS components, which might be compromised by an attacker. In this thesis, we do a research on existent technologies capable of enabling appli cations to trusted environments, in order to adopt such approaches to our solution as a way to help deploy unmodified applications on top of Intel-SGX, with overheads com parable to applications designed to use this kind of technology, and also conducting an experimental evaluation to better understand how they impact our system. Thus, we present TREDIS - a Trusted Full-Fledged REDIS Key-Value Store solution, implemented as a full-fledged solution to be offered as a Trusted Cloud-enabled Platform as a Service, which includes the possibility to support a secure REDIS-cluster architecture supported by docker-virtualized services running in SGX-enabled instances, with operations run ning on always-encrypted in-memory datasets.A transição de suporte de aplicações com armazenamento e processamento em servidores cloud é uma tendência que tem vindo a aumentar, principalmente quando se precisam de gerir grandes conjuntos de dados. Comparativamente a soluções com licenciamento privado, as soluções de computação e armazenamento de dados em nuvens de serviços são capazes de oferecer opções mais baratas, de alta disponibilidade, elásticas e relativa mente confiáveis. Estas soluções fornecidas por terceiros são facilmente acessíveis através da Internet, sendo operadas em regime de outsourcing da sua operação, o que é bom, mas que por isso ficam consideravelmente expostos a ataques e fora do controle dos utiliza dores em relação às reais condições de confiabilidade, segurança e privacidade de dados. Ao explorar subtilmente vulnerabilidades presentes nas aplicações, funções de sistemas operativos (SOs), bibliotecas de virtualização de serviços de SOs ou hipervisores, um ata cante pode comprometer os sistemas e quebrar a privacidade de dados sensíveis. Estes ataques podem ser motivados por fins maliciosos como espionagem, chantagem, roubo de identidade ou assédio e podem ser desencadeados por intrusões (a partir de atacantes externos) ou por ações maliciosas ou incorretas de atacantes internos (podendo estes atuar com privilégios de administradores de sistemas). Uma solução para este problema passa por armazenar e processar a informação sem que existam exposições face a componentes não confiáveis. Nesta dissertação estudamos e avaliamos experimentalmente diversas tecnologias que permitem a execução de aplicações com isolamento em ambientes de execução confiá vel suportados em hardware Intel-SGX, de modo a perceber melhor como funcionam e como adaptá-las à nossa solução. Para isso, realizámos uma avaliação focada na utilização dessas tecnologias com virtualização em contentores isolados executando em hardware confiável, que usámos na concepção da nossa solução. Posto isto, apresentamos a nossa solução TREDIS - um sistema Key-Value Store confiável baseado em tecnologia REDIS, com garantias de integridade da execução e de privacidade de dados, concebida para ser usada como uma "Plataforma como Serviço"para gestão e armazenamento resiliente de dados na nuvem. Isto inclui a possibilidade de suportar uma arquitetura segura com garantias de resiliência semelhantes à arquitetura de replicação em cluster na solução original REDIS, mas em que os motores de execução de nós e a proteção de memória do cluster é baseado em contentores docker isolados e virtualizados em instâncias SGX, sendo os dados mantidos sempre cifrados em memória

    Literature Survey of Big Data

    Get PDF
    Mention the topic of big data, and a person is bound to experience information overload. Indeed, it is so complex with so many terms and details that people want to run away from it. When used right, big data (BD) will help people access data they need in in real time and help managers make better decisions. The purpose of this paper is to evaluate methods, procedures, and architectures for the storage and retrieval of all Federal Aviation Administration (FAA) research, engineering, and development (RE&D) data sets, to leverage on the technology innovation and advancement opportunities in the field of BD analytics. The paper also discusses all relevant Executive Orders (EOs), laws, and Office of Management and Budget (OMB) memorandums that were written to address what federal agencies under the OMB\u2019s jurisdiction must do to comply with various aspects of BD

    Privacy-preserving Platforms for Computation on Hybrid Clouds

    Get PDF
    Ph.DDOCTOR OF PHILOSOPH

    New Security Definitions, Constructions and Applications of Proxy Re-Encryption

    Get PDF
    La externalización de la gestión de la información es una práctica cada vez más común, siendo la computación en la nube (en inglés, cloud computing) el paradigma más representativo. Sin embargo, este enfoque genera también preocupación con respecto a la seguridad y privacidad debido a la inherente pérdida del control sobre los datos. Las soluciones tradicionales, principalmente basadas en la aplicación de políticas y estrategias de control de acceso, solo reducen el problema a una cuestión de confianza, que puede romperse fácilmente por los proveedores de servicio, tanto de forma accidental como intencionada. Por lo tanto, proteger la información externalizada, y al mismo tiempo, reducir la confianza que es necesario establecer con los proveedores de servicio, se convierte en un objetivo inmediato. Las soluciones basadas en criptografía son un mecanismo crucial de cara a este fin. Esta tesis está dedicada al estudio de un criptosistema llamado recifrado delegado (en inglés, proxy re-encryption), que constituye una solución práctica a este problema, tanto desde el punto de vista funcional como de eficiencia. El recifrado delegado es un tipo de cifrado de clave pública que permite delegar en una entidad la capacidad de transformar textos cifrados de una clave pública a otra, sin que pueda obtener ninguna información sobre el mensaje subyacente. Desde un punto de vista funcional, el recifrado delegado puede verse como un medio de delegación segura de acceso a información cifrada, por lo que representa un candidato natural para construir mecanismos de control de acceso criptográficos. Aparte de esto, este tipo de cifrado es, en sí mismo, de gran interés teórico, ya que sus definiciones de seguridad deben balancear al mismo tiempo la seguridad de los textos cifrados con la posibilidad de transformarlos mediante el recifrado, lo que supone una estimulante dicotomía. Las contribuciones de esta tesis siguen un enfoque transversal, ya que van desde las propias definiciones de seguridad del recifrado delegado, hasta los detalles específicos de potenciales aplicaciones, pasando por construcciones concretas

    Secure Database Outsourcing to the Cloud : Side-Channels, Counter-Measures and Trusted Execution

    Get PDF
    corecore