
    Securing Databases from Probabilistic Inference

    Full text link
    Databases can leak confidential information when users combine query results with probabilistic data dependencies and prior knowledge. Current research offers mechanisms that either handle a limited class of dependencies or lack tractable enforcement algorithms. We propose a foundation for Database Inference Control based on ProbLog, a probabilistic logic programming language. We leverage this foundation to develop Angerona, a provably secure enforcement mechanism that prevents information leakage in the presence of probabilistic dependencies. We then provide a tractable inference algorithm for a practically relevant fragment of ProbLog. We empirically evaluate Angerona's performance, showing that it scales to relevant security-critical problems. Comment: A short version of this paper has been accepted at the 30th IEEE Computer Security Foundations Symposium (CSF 2017).
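
    The mechanism itself is defined over ProbLog programs; purely as an illustration of the underlying idea (not of Angerona or its inference algorithm), the Python sketch below refuses a query whenever answering it would push an attacker's posterior belief in a secret above a policy threshold. The facts, probabilities, dependency, and threshold are all invented for this example.

```python
"""Toy illustration of probability-threshold inference control.

This is NOT the Angerona mechanism from the paper; it only sketches the
idea of refusing answers that raise an attacker's posterior belief in a
secret above a policy threshold. Facts, probabilities, and the test
dependency are invented for illustration.
"""
from itertools import product

# Independent probabilistic facts (the attacker's prior knowledge).
PRIOR = {"has_condition": 0.1, "took_test": 0.5}

# A probabilistic dependency: the test is positive with probability 0.9
# if the subject has the condition and took the test, 0.05 otherwise.
def positive_test(world):
    return 0.9 if (world["has_condition"] and world["took_test"]) else 0.05

def worlds():
    """Enumerate all truth assignments with their prior probabilities."""
    names = list(PRIOR)
    for values in product([True, False], repeat=len(names)):
        w = dict(zip(names, values))
        p = 1.0
        for n in names:
            p *= PRIOR[n] if w[n] else 1.0 - PRIOR[n]
        yield w, p

def posterior_of_secret(observed_positive):
    """P(has_condition | observed test outcome) under the toy model."""
    num = den = 0.0
    for w, p in worlds():
        p_obs = positive_test(w) if observed_positive else 1.0 - positive_test(w)
        den += p * p_obs
        if w["has_condition"]:
            num += p * p_obs
    return num / den

THRESHOLD = 0.5  # policy: attacker's belief in the secret must stay below this

def censor(query_reveals_positive_test):
    belief = posterior_of_secret(query_reveals_positive_test)
    return "refuse" if belief > THRESHOLD else "answer"

if __name__ == "__main__":
    print(censor(True))   # refuse: posterior rises above the threshold
    print(censor(False))  # answer: posterior stays low
```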

    Mitigating Insider Threat in Relational Database Systems

    Get PDF
    The dissertation concentrates on addressing the factors and capabilities that enable insiders to violate systems security. It focuses on modeling the accumulative knowledge that insiders gain throughout legal accesses, and it analyzes the dependencies and constraints among data items, representing them using graph-based methods. The dissertation proposes new types of Knowledge Graphs (KGs) to represent insiders' knowledgebases. Furthermore, it introduces the Neural Dependency and Inference Graph (NDIG) and the Constraints and Dependencies Graph (CDG) to demonstrate the dependencies and constraints among data items. The dissertation discusses in detail how insiders use knowledgebases, dependencies, and constraints to obtain unauthorized knowledge, and it suggests new approaches to predict and prevent this threat. The proposed models use KGs, NDIG, and CDG to analyze the threat status, and they leverage the effect of updates on the lifetimes of data items in insiders' knowledgebases to prevent the threat without affecting the availability of data items. Furthermore, the dissertation uses this idea to order the operations of concurrent tasks so that write operations that update risky data items in knowledgebases are executed before those data items can be used in unauthorized inferences. In addition to unauthorized knowledge, the dissertation discusses how insiders can make unauthorized modifications to sensitive data items. It introduces new approaches to build Modification Graphs that demonstrate the authorized and unauthorized data items which insiders are able to update. To prevent this threat, the dissertation provides two methods: hiding sensitive dependencies and denying risky write requests. In addition to traditional RDBMSs, the dissertation investigates insider threat in cloud relational database systems (cloud RDBMSs). It discusses the vulnerabilities in the cloud computing structure that may enable insiders to launch attacks, and it suggests three models to prevent such threats, addressing the advantages and limitations of each one. To prove the correctness and effectiveness of the proposed approaches, the dissertation uses well-stated algorithms, theorems, proofs, and simulations. The simulations have been executed with various parameters that represent the different conditions and environments of executing tasks.
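
    As a rough illustration of the accumulative-knowledge idea (far simpler than the dissertation's KGs, NDIG, and CDG), the Python sketch below closes an insider's legally accessed items under a toy dependency graph and flags accesses whose closure reaches a sensitive item. The item names and dependencies are invented.

```python
"""Minimal sketch of tracking an insider's accumulative knowledge.

Much simpler than the KGs, NDIG, and CDG proposed in the dissertation:
an edge a -> b below just means "knowing a allows inferring b". Item
names and dependencies are invented.
"""
from collections import deque

# item -> items inferable from it (hypothetical dependencies)
DEPENDENCIES = {
    "zip_code": ["region"],
    "region": [],
    "salary_band": ["approx_salary"],
    "approx_salary": ["salary"],
    "salary": [],
}
SENSITIVE = {"salary"}

def knowledge_closure(accessed):
    """All items an insider can infer from the items legally accessed."""
    known, frontier = set(accessed), deque(accessed)
    while frontier:
        for inferred in DEPENDENCIES.get(frontier.popleft(), []):
            if inferred not in known:
                known.add(inferred)
                frontier.append(inferred)
    return known

def violates_policy(accessed):
    """A threat exists when the closure of accessed items reaches a sensitive item."""
    return bool(knowledge_closure(accessed) & SENSITIVE)

if __name__ == "__main__":
    print(violates_policy(["zip_code"]))      # False
    print(violates_policy(["salary_band"]))   # True in this toy model
```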

    k-Anonymity: A Model for Protecting Privacy

    Full text link

    Causal Discovery for Relational Domains: Representation, Reasoning, and Learning

    Get PDF
    Many domains are currently experiencing the growing trend to record and analyze massive, observational data sets of increasing complexity. A commonly made claim is that these data sets hold the potential to transform their corresponding domains by providing previously unknown or unexpected explanations and by enabling informed decision-making. However, only knowledge of the underlying causal generative process, as opposed to knowledge of associational patterns, can support such tasks. Most methods for traditional causal discovery (the development of algorithms that learn causal structure from observational data) are restricted to representations that require limiting assumptions on the form of the data. Causal discovery has almost exclusively been applied to directed graphical models of propositional data that assume a single type of entity with independence among instances. However, most real-world domains are characterized by systems that involve complex interactions among multiple types of entities. Many state-of-the-art methods in statistics and machine learning that address such complex systems focus on learning associational models, and they are often mistakenly interpreted as causal. The intersection between causal discovery and machine learning in complex systems is small. The primary objective of this thesis is to extend causal discovery to such complex systems. Specifically, I formalize a relational representation and model that can express the causal and probabilistic dependencies among the attributes of interacting, heterogeneous entities. I show that the traditional method for reasoning about statistical independence from model structure fails to accurately derive conditional independence facts from relational models. I introduce a new theory, relational d-separation, and a novel lifted representation, the abstract ground graph, which together support a sound, complete, and computationally efficient method for algorithmically deriving conditional independencies from probabilistic models of relational data. The abstract ground graph representation also presents causal implications that enable the detection of causal direction for bivariate relational dependencies without parametric assumptions. I leverage these implications and the theoretical framework of relational d-separation to develop a sound and complete algorithm, the relational causal discovery (RCD) algorithm, which learns causal structure from relational data.
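
    For background only, the Python sketch below implements classical propositional d-separation via the standard moral ancestral graph construction; the thesis's contribution is extending this style of reasoning to relational models and abstract ground graphs, which the sketch does not attempt. The example DAG is invented.

```python
"""Classical (propositional) d-separation via the moral ancestral graph.

Background only: the thesis generalizes this kind of independence
reasoning to relational models ("relational d-separation" over abstract
ground graphs). The DAG below is an invented toy example.
"""
from collections import deque

def ancestors(dag, nodes):
    """Return the given nodes plus all of their ancestors in the DAG."""
    parents = {v: set() for v in dag}
    for u, children in dag.items():
        for c in children:
            parents[c].add(u)
    result, frontier = set(nodes), deque(nodes)
    while frontier:
        for p in parents[frontier.popleft()]:
            if p not in result:
                result.add(p)
                frontier.append(p)
    return result

def d_separated(dag, xs, ys, zs):
    """True iff xs is d-separated from ys given zs in the DAG."""
    keep = ancestors(dag, set(xs) | set(ys) | set(zs))
    # Moralize the ancestral subgraph: drop directions and connect
    # co-parents of every common child.
    undirected = {v: set() for v in keep}
    for u in keep:
        for c in dag[u]:
            if c in keep:
                undirected[u].add(c)
                undirected[c].add(u)
    for c in keep:
        co_parents = [u for u in keep if c in dag[u]]
        for i, a in enumerate(co_parents):
            for b in co_parents[i + 1:]:
                undirected[a].add(b)
                undirected[b].add(a)
    # Remove the conditioning set and test reachability from xs to ys.
    blocked = set(zs)
    seen = set(xs) - blocked
    frontier = deque(seen)
    while frontier:
        node = frontier.popleft()
        if node in ys:
            return False
        for nbr in undirected[node] - blocked:
            if nbr not in seen:
                seen.add(nbr)
                frontier.append(nbr)
    return True

if __name__ == "__main__":
    # Invented example: A -> C <- B, C -> D
    dag = {"A": ["C"], "B": ["C"], "C": ["D"], "D": []}
    print(d_separated(dag, {"A"}, {"B"}, set()))   # True: the collider at C blocks
    print(d_separated(dag, {"A"}, {"B"}, {"D"}))   # False: conditioning on a collider's descendant unblocks
```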

    Strong and Provably Secure Database Access Control

    Full text link
    Existing SQL access control mechanisms are extremely limited. Attackers can leak information and escalate their privileges using advanced database features such as views, triggers, and integrity constraints. This is not merely a problem of vendors lagging behind the state of the art. The theoretical foundations for database security lack adequate security definitions and a realistic attacker model, both of which are needed to evaluate the security of modern databases. We address these issues and present a provably secure access control mechanism that prevents attacks that defeat popular SQL database systems. Comment: A short version of this paper has been published in the proceedings of the 1st IEEE European Symposium on Security and Privacy (EuroS&P 2016).

    An Effective and Efficient Inference Control System for Relational Database Queries

    Get PDF
    Protecting confidential information in relational databases while ensuring the availability of public information at the same time is a demanding task. Unwanted information flows due to the reasoning capabilities of database users require sophisticated inference control mechanisms, since access control is in general not sufficient to guarantee the preservation of confidentiality. The policy-driven approach of Controlled Query Evaluation (CQE) has turned out to be an effective means for controlling inferences in databases that can be modeled in a logical framework. It uses a censor function to determine whether or not the honest answer to a user query enables the user to disclose confidential information, which is declared in the form of a confidentiality policy. In doing so, CQE also takes answers to previous queries and the user's background knowledge about the inner workings of the mechanism into account. Relational databases are usually modeled using first-order logic. In this context, the decision problem to be solved by the CQE censor becomes undecidable in general because the censor basically performs theorem proving over an ever-growing user log. In this thesis, we develop a stateless CQE mechanism that does not need to maintain such a user log but still achieves the declarative goals of inference control. This feature comes at the price of several restrictions for the database administrator who declares the schema of the database, the security administrator who declares the information to be kept confidential, and the database user who sends queries to the database. We first investigate a scenario with quite restricted possibilities for expressing queries and confidentiality policies and propose an efficient stateless CQE mechanism. Due to the assumed restrictions, the censor function of this mechanism reduces to a simple pattern matching. Based on this case, we systematically enhance the proposed query and policy languages and investigate the respective effects on confidentiality. We suitably adapt the stateless CQE mechanism to these enhancements and formally prove the preservation of confidentiality. Finally, we develop efficient algorithmic implementations of stateless CQE, thereby showing that inference control in relational databases is feasible for actual relational database management systems under suitable restrictions.
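
    As a rough sketch of the restricted setting in which the censor reduces to pattern matching (not the thesis's actual mechanism or query and policy languages), the Python fragment below refuses any ground query atom that matches a confidentiality-policy pattern and answers honestly otherwise. The relations, policy, and queries are invented.

```python
"""Sketch of a stateless, pattern-matching CQE-style censor.

Not the mechanism from the thesis; it only illustrates how, under strong
restrictions on queries and policies, censoring can reduce to matching a
query atom against confidentiality-policy patterns. Relation names, the
policy, and the queries are invented.
"""

# Confidentiality policy: patterns over ground atoms; None is a wildcard.
POLICY = [
    ("diagnosis", None, "hiv"),      # hide who has this diagnosis
    ("salary", "alice", None),       # hide alice's salary entirely
]

def matches(pattern, atom):
    return len(pattern) == len(atom) and all(
        p is None or p == a for p, a in zip(pattern, atom)
    )

def censor(query_atom, db_answer):
    """Return the honest answer unless the query matches a protected pattern.

    A refused query gets the same canned response whether the honest answer
    is yes or no, and the decision depends only on the query and the policy,
    so no log of previous answers is needed (the stateless idea).
    """
    if any(matches(p, query_atom) for p in POLICY):
        return "refused"
    return db_answer

if __name__ == "__main__":
    print(censor(("diagnosis", "bob", "hiv"), True))    # refused
    print(censor(("diagnosis", "bob", "flu"), False))   # False (honest answer)
    print(censor(("salary", "alice", 70000), True))     # refused
```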

    Access Control for Data Integration in Presence of Data Dependencies

    Full text link
    Defining access control policies in a data integration scenario is a challenging task. In such a scenario, each source typically specifies its local access control policy and cannot anticipate the data inferences that can arise when data is integrated at the mediator level. Inferences, e.g., using functional dependencies, can allow malicious users to obtain prohibited information at the mediator level by linking multiple queries, thus violating the local policies. In this paper, we propose a framework, i.e., a methodology and a set of algorithms, to prevent such violations. First, we use a graph-based approach to identify sets of queries, called violating transactions, and then we propose an approach to forbid the execution of those transactions by identifying additional access control rules that should be added to the mediator. We also state the complexity of the algorithms and discuss a set of experiments we conducted using both real and synthetic datasets. The tests also confirm the worst-case complexity upper bounds of the proposed algorithms.
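
    The following Python sketch illustrates the general idea in a simplified form (it does not reproduce the paper's graph-based algorithms): each query exposes a set of attributes, functional dependencies expand what is exposed, and a set of queries is flagged as a violating transaction when its closure covers an attribute combination forbidden by the policy. The schema, dependencies, and policy are invented.

```python
"""Sketch of detecting "violating transactions" at a mediator.

A simplified version of the idea only: each query exposes a set of
attributes, functional dependencies let exposed attributes imply further
ones, and a transaction (set of queries) is violating when its FD-closure
covers an attribute combination forbidden by the policy. The schema, FDs,
and policy are invented; the paper's graph-based algorithms are not
reproduced here.
"""
from itertools import combinations

FDS = [({"ssn"}, {"name"}), ({"name", "hospital"}, {"ward"})]   # lhs -> rhs
FORBIDDEN = [{"name", "disease"}]                               # never jointly exposed

def fd_closure(attrs, fds=FDS):
    """Close a set of attributes under the functional dependencies."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure

def violating_transactions(queries, max_size=3):
    """All query subsets (up to max_size) whose combined closure is forbidden.

    Note: supersets of a violating combination are reported as well.
    """
    found = []
    for size in range(1, max_size + 1):
        for combo in combinations(queries, size):
            exposed = fd_closure(set().union(*combo))
            if any(f <= exposed for f in FORBIDDEN):
                found.append(combo)
    return found

if __name__ == "__main__":
    queries = [frozenset({"ssn", "disease"}), frozenset({"hospital"}), frozenset({"name"})]
    for t in violating_transactions(queries):
        print([sorted(q) for q in t])
```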

    Preventing Inferences through Data Dependencies on Sensitive Data

    Get PDF
    Simply restricting the computation to the non-sensitive part of the data may still lead to inferences on sensitive data through data dependencies. Inference control under data dependencies has been studied in prior work. However, existing solutions either detect and deny queries that may lead to leakage, resulting in poor utility, or only protect against exact reconstruction of the sensitive data, resulting in poor security. In this paper, we present a novel security model called full deniability. Under this stronger security model, any information inferred about sensitive data from non-sensitive data is considered a leakage. We describe algorithms for efficiently implementing full deniability on a given database instance with a set of data dependencies and sensitive cells. Using experiments on two different datasets, we demonstrate that our approach protects against realistic adversaries while hiding only a minimal number of additional non-sensitive cells, and that it scales well with database size and the amount of sensitive data.
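
    As a toy sketch in the spirit of full deniability (not the paper's algorithm), the Python fragment below starts from the sensitive cells and keeps hiding additional cells until no dependency is left in which exactly one hidden cell could be pinned down from the visible ones. The cells and dependencies are invented.

```python
"""Toy sketch in the spirit of "full deniability".

Not the paper's algorithm: it only illustrates the idea that hiding the
sensitive cells alone is not enough when data dependencies exist, so a few
extra non-sensitive cells must be hidden too. A dependency here is just a
set of cells such that, if all but one are visible, the remaining one can
be inferred. Cells and dependencies are invented.
"""

def cells_to_hide(sensitive, dependencies):
    """Fixpoint: hide extra cells until no dependency exposes a hidden cell."""
    hidden = set(sensitive)
    changed = True
    while changed:
        changed = False
        for dep in dependencies:
            hidden_in_dep = dep & hidden
            # Exactly one hidden cell with everything else visible:
            # that cell could be inferred, so hide one more cell from dep.
            if len(hidden_in_dep) == 1 and len(dep) > 1:
                extra = min(dep - hidden)   # deterministic but arbitrary choice
                hidden.add(extra)
                changed = True
    return hidden

if __name__ == "__main__":
    sensitive = {"t1.salary"}
    dependencies = [
        frozenset({"t1.salary", "t1.grade"}),   # grade determines salary
        frozenset({"t1.grade", "t1.title"}),    # title determines grade
    ]
    # Hiding cascades along the dependency chain in this toy example.
    print(sorted(cells_to_hide(sensitive, dependencies)))
```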