328 research outputs found

    Nested Invariance Pooling and RBM Hashing for Image Instance Retrieval

    The goal of this work is the computation of very compact binary hashes for image instance retrieval. Our approach makes two novel contributions. The first is Nested Invariance Pooling (NIP), a method inspired by i-theory, a mathematical theory for computing group-invariant transformations with feed-forward neural networks. NIP produces compact and well-performing descriptors from visual representations extracted with convolutional neural networks. We specifically incorporate scale, translation and rotation invariances, but the scheme can be extended to arbitrary sets of transformations. We also show that using moments of increasing order throughout the nesting is important. The NIP descriptors are then hashed to the target code size (32-256 bits) with a Restricted Boltzmann Machine using a novel batch-level regularization scheme specifically designed for hashing (RBMH). A thorough empirical comparison with the state of the art shows that the results obtained with both the NIP descriptors and the NIP+RBMH hashes are consistently outstanding across a wide range of datasets.
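    As a rough illustration of how such nested pooling over transformation groups might look, the following Python/NumPy sketch pools a CNN descriptor over rotations with a first-order moment and then over scales with a second-order moment. The feature extractor `extract_features` and the use of scikit-image transforms are assumptions made here for illustration; this is not the authors' implementation, and it omits translation pooling and the RBM hashing stage.

```python
# Hedged sketch of nested invariance pooling over rotation and scale groups.
# Assumes: `extract_features(img) -> np.ndarray` is a pretrained CNN descriptor
# and `img` is an RGB image array; scikit-image transforms stand in for the groups.
import numpy as np
from skimage.transform import rotate, rescale

def pool(descriptors, order):
    """Pool a stack of descriptors with a moment of the given order."""
    d = np.stack(descriptors)
    if order == 1:
        return d.mean(axis=0)
    return (np.abs(d) ** order).mean(axis=0) ** (1.0 / order)

def nip_descriptor(img, extract_features,
                   angles=(0, 90, 180, 270), scales=(0.75, 1.0, 1.25)):
    # Innermost group: rotations, pooled with a first-order moment (average).
    per_scale = []
    for s in scales:
        scaled = rescale(img, s, channel_axis=-1, anti_aliasing=True)
        rot_feats = [extract_features(rotate(scaled, a)) for a in angles]
        per_scale.append(pool(rot_feats, order=1))
    # Next group in the nesting: scales, pooled with a higher-order moment.
    return pool(per_scale, order=2)
```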

    Chameleon: A Secure Cloud-Enabled and Queryable System with Elastic Properties

    Two themes have become increasingly important in our technological society. The first is the recurrent use of cloud-based solutions that provide infrastructure, computation platforms and storage as services. The second is the use of large application logs for analytics and operational monitoring in critical systems. Auditing activities, debugging of applications and inspection of events generated by errors or potentially unexpected operations, including alerts raised by intrusion detection systems, are common situations in which extensive logs must be analyzed and easy access is required. More often than not, part of the generated logs can be deemed sensitive, requiring a privacy-enhancing and queryable solution. In this dissertation, our main goal is to propose a novel approach for storing encrypted critical data in an elastic and scalable cloud-based storage, focusing on handling JSON-based ciphered documents. To this end, we use Searchable and Homomorphic Encryption methods to allow operations on the ciphered documents. Additionally, our solution keeps the user nearly oblivious to the system's internals, providing transparency during use. The end goal is a unified middleware system capable of providing improved usability, privacy and rich querying over the data, while maintaining server-side auditable logs that can be searched by the log owner or authorized users, with integrity and authenticity proofs. Our proposed solution, named Chameleon, provides rich querying facilities on ciphered data, including conjunctive keyword, ordering correlation and boolean queries, while supporting field searching and nested aggregations. These operations allow our solution to provide data analytics over ciphered JSON documents, using Elasticsearch as the storage and search engine.
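    The abstract does not spell out the underlying constructions, but the generic searchable-encryption building block it alludes to can be sketched as follows: the client derives opaque keyword tokens with a keyed PRF, so the server can match queries against an encrypted index without learning keywords or document contents. This is a minimal, assumed illustration (using HMAC and Fernet from the `cryptography` package), not Chameleon's scheme, and it ignores leakage, ordering correlation and homomorphic aggregation.

```python
# Illustrative-only sketch of keyword search over encrypted JSON documents.
import hmac, hashlib, os, json
from cryptography.fernet import Fernet

index_key = os.urandom(32)          # PRF key for search tokens (kept by the client)
doc_key = Fernet.generate_key()     # symmetric key for document payloads
fernet = Fernet(doc_key)

def token(keyword: str) -> str:
    """Opaque search token derived with a keyed PRF (HMAC-SHA256)."""
    return hmac.new(index_key, keyword.encode(), hashlib.sha256).hexdigest()

def encrypt_document(doc: dict, keywords: list[str]) -> dict:
    """Store the ciphered JSON payload together with its keyword tokens."""
    return {
        "payload": fernet.encrypt(json.dumps(doc).encode()).decode(),
        "tokens": [token(k) for k in keywords],
    }

def server_matches(stored_docs, search_token: str):
    """The server matches tokens without ever seeing plaintext keywords."""
    return [d for d in stored_docs if search_token in d["tokens"]]

docs = [encrypt_document({"event": "login failed", "user": "alice"}, ["login", "alice"])]
hits = server_matches(docs, token("login"))
print(json.loads(fernet.decrypt(hits[0]["payload"].encode())))
```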

    PCD

    Thesis (M.Eng.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010, by Alessandro Chiesa. The security of systems can often be expressed as ensuring that some property is maintained at every step of a distributed computation conducted by untrusted parties. Special cases include integrity of programs running on untrusted platforms, various forms of confidentiality and side-channel resilience, and domain-specific invariants. We propose a new approach, proof-carrying data (PCD), which sidesteps the threat of faults and leakage by reasoning about properties of a computation's output data, regardless of the process that produced it. In PCD, the system designer prescribes the desired properties of a computation's outputs. Corresponding proofs are attached to every message flowing through the system and are mutually verified by the system's components. Each such proof attests that the message's data and all of its history comply with the prescribed properties. We construct a general protocol compiler that generates, propagates, and verifies such proofs of compliance, while preserving the dynamics and efficiency of the original computation. Our main technical tool is the cryptographic construction of short non-interactive arguments (computationally sound proofs) for statements whose truth depends on "hearsay evidence": previous arguments about other statements. To this end, we attain a particularly strong proof-of-knowledge property. We realize the above, under standard cryptographic assumptions, in a model where the prover has black-box access to some simple functionality, essentially a signature card.
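    The message-level mechanics can be pictured with a small schematic. In the sketch below, `prove` and `verify` are trusted stand-ins for the succinct non-interactive arguments used in PCD, and the compliance predicate is a toy example invented here; the point is only the data flow in which every message carries a proof about its data and its entire history.

```python
# Schematic of the proof-carrying data message flow (not a cryptographic construction).
from dataclasses import dataclass

def compliant(output: int, inputs: list[int]) -> bool:
    """Example compliance predicate prescribed by the system designer:
    each output equals the sum of its inputs and stays non-negative."""
    if not inputs:                  # fresh system input: only the local check applies
        return output >= 0
    return output == sum(inputs) and output >= 0

@dataclass
class Message:
    data: int
    proof: object  # attests that `data` and all of its history are compliant

def prove(data, inputs):            # stand-in for argument generation
    return ("ok", data, tuple(m.data for m in inputs))

def verify(msg: Message) -> bool:   # stand-in for argument verification
    tag, data, input_data = msg.proof
    return tag == "ok" and data == msg.data and compliant(data, list(input_data))

def node(inputs: list[Message]) -> Message:
    # Each component verifies the proofs on its inputs before computing.
    assert all(verify(m) for m in inputs), "reject non-compliant history"
    out = sum(m.data for m in inputs)
    return Message(out, prove(out, inputs))

source = Message(3, prove(3, []))
final = node([source, Message(4, prove(4, []))])
print(final.data, verify(final))    # 7 True
```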

    Finding important entities in graphs

    Graphs are established as one of the most prominent means of data representation. They are composed of simple entities -- nodes and edges -- and reflect the relationships between them. Their impact extends to a broad variety of domains, e.g., biology, sociology and the Web. In these settings, much of the data's value can be captured by a simple question: how can we evaluate the importance of these entities? The aim of this dissertation is to explore novel importance measures that are meaningful and can be computed efficiently on large datasets. First, we focus on spanning edge centrality, an edge importance measure recently introduced to evaluate phylogenetic trees. We propose very efficient methods that approximate this measure in near-linear time and apply them to large graphs with millions of nodes. We demonstrate that this centrality measure is a useful tool for the analysis of networks outside its original application domain. Next, we turn to importance measures for nodes and propose the absorbing random walk centrality. This measure evaluates a group of nodes in a graph according to how central they are with respect to a set of query nodes. Specifically, given a query set and a candidate group of nodes, we start random walks from the queries and measure their length until they reach one of the candidates. The most central group of nodes collectively minimizes the expected length of these random walks. We prove several computational properties of this measure and provide an algorithm whose solutions offer an approximation guarantee. Additionally, we develop efficient heuristics that allow us to use this importance measure on large datasets. Finally, we consider graphs in which each node is assigned a set of attributes. We define an important connected subgraph to be one for which the total weight of its edges is small, while the number of attributes covered by its nodes is large. To select such an important subgraph, we develop an efficient approximation algorithm based on the primal-dual schema.
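    As a rough sketch of the absorbing random walk centrality just described, the following Monte Carlo estimate (not the dissertation's algorithm) starts walks at the query nodes and averages the number of steps until a candidate node absorbs them; a smaller estimate indicates a more central candidate group. The graph and node choices are placeholders.

```python
# Monte Carlo estimate of the expected absorption length for a candidate group.
import random
import networkx as nx

def absorbing_walk_centrality(G, queries, candidates, walks=2000, max_steps=10**5):
    candidates = set(candidates)
    total = 0
    for _ in range(walks):
        node = random.choice(list(queries))      # start each walk at a query node
        steps = 0
        while node not in candidates and steps < max_steps:
            node = random.choice(list(G.neighbors(node)))
            steps += 1
        total += steps
    return total / walks                          # lower is more central w.r.t. the queries

G = nx.karate_club_graph()
print(absorbing_walk_centrality(G, queries=[15, 23], candidates={0, 33}))
```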

    Redesigning the Structure and Access Paths of Databases for Effective and Efficient Query Processing

    Many database users are not familiar with formal query languages, the concept of a schema, or the exact content of their database. Thus, it is challenging for these users to formulate their information needs over semi-structured and structured databases. To address this problem, researchers have proposed usable query interfaces over which users can formulate their information needs without knowing formal query languages, the schema or the exact content of the database. Although these interfaces increase the usability of databases, they inherently suffer from low effectiveness and efficiency. The recent growth in databases' content size and schema complexity only exacerbates this problem. In this dissertation, we present a set of approaches to redesign the components of database management systems to improve the effectiveness and efficiency of query processing. We present theoretical and empirical results on the impact of database size and schema complexity on the effectiveness of keyword query search. Based on these results, we propose a system that answers keyword queries more effectively. Furthermore, we present an online learning method that improves the response time of query processing over large databases.

    Advanced Cryptographic Techniques for Protecting Log Data

    This thesis examines cryptographic techniques providing security for computer log files. It focuses on ensuring authenticity and integrity, i.e. the properties of having been created by a specific entity and being unmodified. Confidentiality, the property of being unknown to unauthorized entities, is also considered, but with less emphasis. Computer log files are recordings of actions performed and events encountered in computer systems. As the complexity of computer systems grows steadily, it becomes increasingly difficult to predict how a given system will behave under certain conditions, or to retrospectively reconstruct and explain which events and conditions led to a specific behavior. Computer log files help to mitigate the problem of retracing a system's behavior retrospectively by providing a (usually chronological) view of events and actions encountered in a system. Authenticity and integrity of computer log files are widely recognized security requirements, see e.g. [Latham, ed., "Department of Defense Trusted Computer System Evaluation Criteria", 1985, p. 10], [Kent and Souppaya, "Guide to Computer Security Log Management", NIST Special Publication 800-92, 2006, Section 2.3.2], [Guttman and Roback, "An Introduction to Computer Security: The NIST Handbook", superseded NIST Special Publication 800-12, 1995, Section 18.3.1], [Nieles et al., "An Introduction to Information Security", NIST Special Publication 800-12, 2017, Section 9.3], [Common Criteria Editorial Board, ed., "Common Criteria for Information Technology Security Evaluation", Part 2, Section 8.6]. Two commonly cited ways to ensure integrity of log files are to store log data on so-called write-once-read-many-times (WORM) drives and to immediately print log records on a continuous-feed printer. This guarantees that log data cannot be retroactively modified by an attacker without physical access to the storage medium. However, such special-purpose hardware may not always be a viable option for the application at hand, for example because it may be too costly. In such cases, the integrity and authenticity of log records must be ensured via other means, e.g. with cryptographic techniques. Although these techniques cannot prevent the modification of log data, they can offer strong guarantees that modifications will be detectable, while being implementable in software. Furthermore, cryptography can be used to achieve public verifiability of log files, which may be needed in applications that have strong transparency requirements. Cryptographic techniques can even be used in addition to hardware solutions, providing protection against attackers who do have physical access to the logging hardware, such as insiders. Cryptographic schemes for protecting stored log data need to be resilient against attackers who obtain control over the computer storing the log data. If this computer operates in a standalone fashion, it is an absolute requirement for the cryptographic schemes to offer security even in the event of a key compromise. As this is impossible with standard cryptographic tools, cryptographic solutions for protecting log data typically make use of forward-secure schemes, guaranteeing that changes to log data recorded in the past can be detected. Such schemes use a sequence of authentication keys instead of a single one, where previous keys cannot be computed efficiently from later ones.
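    A minimal sketch of this forward-secure, key-evolving idea (not one of the schemes constructed in the thesis) looks as follows: each entry is authenticated under the current key, which is then replaced by a one-way hash of itself, so compromising the machine later does not reveal earlier keys and earlier entries cannot be silently altered.

```python
# Hedged sketch of forward-secure log authentication via key evolution.
import hashlib, hmac, os

key = os.urandom(32)                 # epoch-0 key, known to (or derivable by) the verifier
log = []

def append(entry: bytes):
    global key
    tag = hmac.new(key, entry, hashlib.sha256).hexdigest()
    log.append((entry, tag))
    key = hashlib.sha256(b"evolve" + key).digest()   # one-way key update; old key is erased

append(b"2024-05-01 12:00 user alice logged in")
append(b"2024-05-01 12:05 config file changed")
# A verifier holding the epoch-0 key replays the same key evolution to check every tag.
```

    Note that this basic chain by itself does not detect truncation of the most recent entries, which is precisely the gap addressed by the truncation-resistance technique discussed below.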
This thesis considers the following requirements for, and desirable features of, cryptographic logging schemes: 1) security, i.e. the ability to reliably detect violations of integrity and authenticity, including detection of log truncations, 2) efficiency regarding both computational and storage overhead, 3) robustness, i.e. the ability to verify unmodified log entries even if others have been illicitly changed, and 4) verifiability of excerpts, including checking an excerpt for omissions. The goals of this thesis are to devise new techniques for the construction of cryptographic schemes that provide security for computer log files, to give concrete constructions of such schemes, to develop new models that can accurately capture the security guarantees offered by the new schemes, as well as to examine the security of previously published schemes. This thesis demands that cryptographic schemes for securely storing log data must be able to detect if log entries have been deleted from a log file. A special case of deletion is log truncation, where a continuous subsequence of log records from the end of the log file is deleted. Obtaining truncation resistance, i.e. the ability to detect truncations, is one of the major difficulties when designing cryptographic logging schemes. This thesis alleviates this problem by introducing a novel technique to detect log truncations without the help of third parties or designated logging hardware. Moreover, this work presents new formal security notions capturing truncation resistance. The technique mentioned above is applied to obtain cryptographic logging schemes which can be shown to satisfy these notions under mild assumptions, making them the first schemes with formally proven truncation security. Furthermore, this thesis develops a cryptographic scheme for the protection of log files which can support the creation of excerpts. For this thesis, an excerpt is a (not necessarily contiguous) subsequence of records from a log file. Excerpts created with the scheme presented in this thesis can be publicly checked for integrity and authenticity (as explained above) as well as for completeness, i.e. the property that no relevant log entry has been omitted from the excerpt. Excerpts provide a natural way to preserve the confidentiality of information that is contained in a log file, but not of interest for a specific public analysis of the log file, enabling the owner of the log file to meet confidentiality and transparency requirements at the same time. The scheme demonstrates and exemplifies the technique for obtaining truncation security mentioned above. Since cryptographic techniques to safeguard log files usually require authenticating log entries individually, some researchers [Ma and Tsudik, "A New Approach to Secure Logging", LNCS 5094, 2008; Ma and Tsudik, "A New Approach to Secure Logging", ACM TOS 2009; Yavuz and Peng, "BAF: An Efficient Publicly Verifiable Secure Audit Logging Scheme for Distributed Systems", ACSAC 2009] have proposed using aggregatable signatures [Boneh et al., "Aggregate and Verifiably Encrypted Signatures from Bilinear Maps", EUROCRYPT 2003] in order to reduce the overhead in storage space incurred by using such a cryptographic scheme. Aggregation of signatures refers to some “combination” of any number of signatures (for distinct or equal messages, by distinct or identical signers) into an “aggregate” signature. 
The size of the aggregate signature should be less than the total size of the original signatures, ideally the size of a single original signature. Using aggregation of signatures in applications that require storing or transmitting a large number of signatures (such as the storage of log records) can lead to significant reductions in the use of storage space and bandwidth. However, aggregating the signatures for all log records into a single signature introduces some fragility: the modification of a single log entry will render the aggregate signature invalid, preventing the cryptographic verification of any part of the log file. Yet being able to distinguish manipulated log entries from non-manipulated ones may be of importance for after-the-fact investigations. This thesis addresses this issue by presenting a new technique providing a trade-off between storage overhead and robustness, i.e. the ability to tolerate some modifications to the log file while preserving the cryptographic verifiability of unmodified log entries. This robustness is achieved by the use of a special kind of aggregate signatures (called fault-tolerant aggregate signatures), which contain some redundancy. The construction makes use of combinatorial methods guaranteeing that if the number of errors is below a certain threshold, then there will be enough redundancy to identify and verify the non-modified log entries. Finally, this thesis presents a total of four attacks on three different schemes intended for securely storing log files presented in the literature [Yavuz et al., "Efficient, Compromise Resilient and Append-Only Cryptographic Schemes for Secure Audit Logging", Financial Cryptography 2012; Ma, "Practical Forward Secure Sequential Aggregate Signatures", ASIACCS 2008]. The attacks allow for virtually arbitrary log file forgeries or even recovery of the secret key used for authenticating the log file, which could then be used for mostly arbitrary log file forgeries, too. All of these attacks exploit weaknesses of the specific schemes. Three of the attacks presented here contradict the security properties claimed and supposedly proven by the respective authors. This thesis briefly discusses these proofs and points out their flaws. The fourth attack presented here is outside of the security model considered by the scheme's authors, but nonetheless presents a realistic threat. In summary, this thesis advances the scientific state of the art with regard to providing security for computer log files in a number of ways: by introducing a new technique for obtaining security against log truncations, by providing the first scheme where excerpts from log files can be verified for completeness, by describing the first scheme that can achieve some notion of robustness while being able to aggregate log record signatures, and by analyzing the security of previously proposed schemes.
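    The robustness trade-off can be pictured with a toy example in which MACs over a 3x3 grid of entries stand in for fault-tolerant aggregate signatures: every entry is covered by one row authenticator and one column authenticator, so a single manipulated entry invalidates only those two, and every other entry remains verifiable through its remaining valid authenticator. This is an assumed illustration of the combinatorial redundancy idea, not the construction given in the thesis.

```python
# Toy illustration of redundant coverage for robust log verification.
import hashlib, hmac

KEY = b"demo-key"
entries = [f"log entry {i}".encode() for i in range(9)]   # 3x3 grid of entries

def agg(group):                       # stand-in for an aggregate signature over a group
    return hmac.new(KEY, b"|".join(group), hashlib.sha256).hexdigest()

row_tags = [agg(entries[r * 3:(r + 1) * 3]) for r in range(3)]
col_tags = [agg(entries[c::3]) for c in range(3)]

entries[4] = b"tampered"              # an attacker modifies one entry after the fact
stored_rows = [entries[r * 3:(r + 1) * 3] for r in range(3)]
stored_cols = [entries[c::3] for c in range(3)]
for i in range(9):
    row_ok = agg(stored_rows[i // 3]) == row_tags[i // 3]
    col_ok = agg(stored_cols[i % 3]) == col_tags[i % 3]
    print(i, "verifiable" if (row_ok or col_ok) else "cannot be verified")
```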

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    LIPIcs, Volume 251, ITCS 2023, Complete Volume