Securing Databases from Probabilistic Inference
Databases can leak confidential information when users combine query results
with probabilistic data dependencies and prior knowledge. Current research
offers mechanisms that either handle a limited class of dependencies or lack
tractable enforcement algorithms. We propose a foundation for Database
Inference Control based on ProbLog, a probabilistic logic programming language.
We leverage this foundation to develop Angerona, a provably secure enforcement
mechanism that prevents information leakage in the presence of probabilistic
dependencies. We then provide a tractable inference algorithm for a practically
relevant fragment of ProbLog. We empirically evaluate Angerona's performance,
showing that it scales to relevant security-critical problems.
Comment: A short version of this paper has been accepted at the 30th IEEE Computer Security Foundations Symposium (CSF 2017).
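The enforcement idea described in this abstract can be illustrated with a minimal Bayesian threshold check: release a query answer only if the attacker's posterior belief in a secret, updated through a probabilistic dependency, stays below a confidentiality bound. This is a hedged sketch with assumed names and probabilities, not Angerona's actual ProbLog-based mechanism:

```python
# Minimal sketch of probabilistic inference control (illustrative names only):
# an answer is released only if the belief it induces in the secret stays
# below a confidentiality threshold.

def posterior(prior, p_obs_given_secret, p_obs_given_not_secret):
    """Bayes update: P(secret | observed answer)."""
    p_obs = p_obs_given_secret * prior + p_obs_given_not_secret * (1 - prior)
    return p_obs_given_secret * prior / p_obs

def release(prior, p_obs_given_secret, p_obs_given_not_secret, threshold):
    """Refuse the answer whenever the induced belief reaches the threshold."""
    return posterior(prior, p_obs_given_secret, p_obs_given_not_secret) < threshold

# A dependency that makes the observed answer twice as likely under the secret
# pushes a 0.3 prior to roughly 0.46 -- still releasable at threshold 0.5.
print(release(0.3, 0.8, 0.4, 0.5))  # True
```

The actual system reasons over ProbLog programs rather than a single Bayes update, but the threshold-based release decision is the same shape of check.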
Mitigating Insider Threat in Relational Database Systems
The dissertation concentrates on addressing the factors and capabilities that enable insiders to violate systems security. It focuses on modeling the accumulative knowledge that insiders gain throughout legal accesses, and it concentrates on analyzing the dependencies and constraints among data items, representing them using graph-based methods. The dissertation proposes new types of Knowledge Graphs (KGs) to represent insiders' knowledgebases. Furthermore, it introduces the Neural Dependency and Inference Graph (NDIG) and the Constraints and Dependencies Graph (CDG) to demonstrate the dependencies and constraints among data items.
The dissertation discusses in detail how insiders use knowledgebases, dependencies, and constraints to obtain unauthorized knowledge, and it suggests new approaches to predict and prevent this threat. The proposed models use KGs, the NDIG, and the CDG to analyze the threat status, and they leverage the effect of updates on the lifetimes of data items in insiders' knowledgebases to prevent the threat without affecting the availability of data items. Furthermore, the dissertation uses this idea to order the operations of concurrent tasks such that write operations that update risky data items in knowledgebases are executed before the risky data items can be used in unauthorized inferences.
In addition to unauthorized knowledge, the dissertation discusses how insiders can make unauthorized modifications to sensitive data items. It introduces new approaches to build Modification Graphs that demonstrate the authorized and unauthorized data items which insiders are able to update. To prevent this threat, the dissertation provides two methods: hiding sensitive dependencies and denying risky write requests.
In addition to traditional RDBMSs, the dissertation investigates insider threat in cloud relational database systems (cloud RDBMSs). It discusses the vulnerabilities in the cloud computing structure that may enable insiders to launch attacks. To prevent such threats, the dissertation suggests three models and addresses the advantages and limitations of each one.
To prove the correctness and effectiveness of the proposed approaches, the dissertation uses well-stated algorithms, theorems, proofs, and simulations. The simulations have been executed according to various parameters that represent the different conditions and environments of executing tasks.
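The accumulative-knowledge idea in this abstract can be sketched as reachability over a dependency graph: an insider who knows a data item can infer every item reachable from it through declared dependencies. This is an illustrative sketch with assumed item names, not the dissertation's actual NDIG or CDG construction:

```python
from collections import deque

# Illustrative data-item dependencies as a directed graph: an insider who
# knows the key can infer each of its listed targets (names are assumed).
DEPENDENCIES = {
    "tax_bracket": ["salary_band"],   # bracket narrows the band down
    "salary_band": ["salary"],        # band narrows the salary down
    "salary": [],
}

def inferable(known, dependencies):
    """All items reachable from the insider's knowledgebase via dependencies."""
    reached, frontier = set(known), deque(known)
    while frontier:
        for target in dependencies.get(frontier.popleft(), []):
            if target not in reached:
                reached.add(target)
                frontier.append(target)
    return reached

# Knowing the tax bracket transitively exposes the salary.
print("salary" in inferable({"tax_bracket"}, DEPENDENCIES))  # True
```

The threat-prevention approaches described above can then be read as keeping the sensitive items out of this reachable set, either by scheduling updates that invalidate risky items or by denying the accesses that would complete an inference path.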
Causal Discovery for Relational Domains: Representation, Reasoning, and Learning
Many domains are currently experiencing the growing trend to record and analyze massive, observational data sets with increasing complexity. A commonly made claim is that these data sets hold potential to transform their corresponding domains by providing previously unknown or unexpected explanations and enabling informed decision-making. However, only knowledge of the underlying causal generative process, as opposed to knowledge of associational patterns, can support such tasks.
Most methods for traditional causal discovery—the development of algorithms that learn causal structure from observational data—are restricted to representations that require limiting assumptions on the form of the data. Causal discovery has almost exclusively been applied to directed graphical models of propositional data that assume a single type of entity with independence among instances. However, most real-world domains are characterized by systems that involve complex interactions among multiple types of entities. Many state-of-the-art methods in statistics and machine learning that address such complex systems focus on learning associational models, and they are oftentimes mistakenly interpreted as causal. The intersection between causal discovery and machine learning in complex systems is small.
The primary objective of this thesis is to extend causal discovery to such complex systems. Specifically, I formalize a relational representation and model that can express the causal and probabilistic dependencies among the attributes of interacting, heterogeneous entities. I show that the traditional method for reasoning about statistical independence from model structure fails to accurately derive conditional independence facts from relational models. I introduce a new theory—relational d-separation—and a novel, lifted representation—the abstract ground graph—that supports a sound, complete, and computationally efficient method for algorithmically deriving conditional independencies from probabilistic models of relational data. The abstract ground graph representation also presents causal implications that enable the detection of causal direction for bivariate relational dependencies without parametric assumptions. I leverage these implications and the theoretical framework of relational d-separation to develop a sound and complete algorithm—the relational causal discovery (RCD) algorithm—that learns causal structure from relational data.
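The classical, propositional d-separation that relational d-separation generalizes can be sketched compactly via the standard moralization test: restrict the DAG to the ancestors of the variables involved, moralize it, delete the conditioning set, and check undirected separation. This sketch is a textbook illustration, not the thesis's abstract-ground-graph construction:

```python
# Classical d-separation on a DAG via the ancestral moral graph.
# dag maps each node to the set of its parents.

def d_separated(dag, x, y, z):
    """True iff x and y are d-separated given the set z."""
    # 1. Restrict to the ancestors of x, y, and z.
    nodes, frontier = set(), {x, y} | z
    while frontier:
        n = frontier.pop()
        if n not in nodes:
            nodes.add(n)
            frontier |= dag.get(n, set())
    # 2. Moralize: connect each node to its parents, and parents that
    #    share a child to each other; drop edge directions.
    adj = {n: set() for n in nodes}
    for child in nodes:
        parents = dag.get(child, set()) & nodes
        for p in parents:
            adj[child].add(p)
            adj[p].add(child)
        for p in parents:
            for q in parents:
                if p != q:
                    adj[p].add(q)
    # 3. Delete the conditioning set and test undirected reachability.
    reached, stack = {x}, [x]
    while stack:
        for m in adj[stack.pop()] - z:
            if m not in reached:
                reached.add(m)
                stack.append(m)
    return y not in reached

chain = {"A": set(), "B": {"A"}, "C": {"B"}}          # A -> B -> C
collider = {"A": set(), "B": set(), "C": {"A", "B"}}  # A -> C <- B
print(d_separated(chain, "A", "C", {"B"}))     # True: B blocks the chain
print(d_separated(collider, "A", "B", {"C"}))  # False: conditioning opens the collider
```

The thesis's contribution is precisely that this propositional test is unsound when applied naively to relational models, which is what the lifted abstract ground graph repairs.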
Strong and Provably Secure Database Access Control
Existing SQL access control mechanisms are extremely limited. Attackers can
leak information and escalate their privileges using advanced database features
such as views, triggers, and integrity constraints. This is not merely a
problem of vendors lagging behind the state-of-the-art. The theoretical
foundations for database security lack adequate security definitions and a
realistic attacker model, both of which are needed to evaluate the security of
modern databases. We address these issues and present a provably secure access
control mechanism that prevents attacks that defeat popular SQL database
systems.
Comment: A short version of this paper has been published in the proceedings of the 1st IEEE European Symposium on Security and Privacy (EuroS&P 2016).
An Effective and Efficient Inference Control System for Relational Database Queries
Protecting confidential information in relational databases while ensuring availability of
public information at the same time is a demanding task. Unwanted information flows
due to the reasoning capabilities of database users require sophisticated inference control
mechanisms, since access control is in general not sufficient to guarantee the preservation
of confidentiality. The policy-driven approach of Controlled Query Evaluation (CQE)
turned out to be an effective means for controlling inferences in databases that can be
modeled in a logical framework. It uses a censor function to determine whether or not
the honest answer to a user query enables the user to disclose confidential information
which is declared in the form of a confidentiality policy. In doing so, CQE also takes answers
to previous queries and the user’s background knowledge about the inner workings of the
mechanism into account.
Relational databases are usually modeled using first-order logic. In this context, the
decision problem to be solved by the CQE censor becomes undecidable in general because
the censor essentially performs theorem proving over an ever-growing user log. In this
thesis, we develop a stateless CQE mechanism that does not need to maintain such a user
log but still reaches the declarative goals of inference control. This feature comes at the
price of several restrictions for the database administrator who declares the schema of the
database, the security administrator who declares the information to be kept confidential,
and the database user who sends queries to the database.
We first investigate a scenario with quite restricted possibilities for expressing queries
and confidentiality policies and propose an efficient stateless CQE mechanism. Due to the
assumed restrictions, the censor function of this mechanism reduces to simple pattern
matching. Based on this case, we systematically enhance the proposed query and policy
languages and investigate the respective effects on confidentiality. We suitably adapt the
stateless CQE mechanism to these enhancements and formally prove the preservation
of confidentiality. Finally, we develop efficient algorithmic implementations of stateless
CQE, thereby showing that inference control in relational databases is feasible for actual
relational database management systems under suitable restrictions.
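The stateless censor-as-pattern-matching idea described above can be sketched as follows; the policy format, wildcard convention, and predicate names are illustrative assumptions, not the thesis's formal languages:

```python
# Illustrative stateless censor in the spirit of CQE reduced to pattern
# matching: a ground query is refused whenever it matches a pattern in the
# confidentiality policy, with "_" as a wildcard. No user log is consulted.

POLICY = [("diagnosis", "_", "hiv")]  # keep secret: diagnosis(Patient, hiv)

def matches(pattern, fact):
    return len(pattern) == len(fact) and all(
        p == "_" or p == f for p, f in zip(pattern, fact))

def censor(query_fact, policy=POLICY):
    """Stateless refusal decision: depends only on the query and the policy."""
    return "refuse" if any(matches(p, query_fact) for p in policy) else "answer"

print(censor(("diagnosis", "alice", "hiv")))  # refuse
print(censor(("diagnosis", "alice", "flu")))  # answer
```

Statelessness is the point: because the decision never depends on previous answers, no growing user log has to be maintained or reasoned over, which is what makes the enforcement tractable under the thesis's restrictions.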
Access Control for Data Integration in Presence of Data Dependencies
Defining access control policies in a data integration scenario is a challenging task. In such a scenario, each source typically specifies its local access control policy and cannot anticipate the data inferences that can arise when data is integrated at the mediator level. Inferences, e.g., using functional dependencies, can allow malicious users to obtain prohibited information at the mediator level by linking multiple queries, thus violating the local policies. In this paper, we propose a framework, i.e., a methodology and a set of algorithms, to prevent such violations. First, we use a graph-based approach to identify sets of queries, called violating transactions, and then we propose an approach to forbid the execution of those transactions by identifying additional access control rules that should be added to the mediator. We also state the complexity of the algorithms and discuss a set of experiments we conducted using both real and synthetic datasets. The tests also confirm the worst-case complexity and upper bounds of the proposed algorithms.
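The violating-transaction idea can be sketched as linking queries that share a key attribute (the handle a functional dependency provides for joining results) and testing whether any linked group exposes a prohibited attribute combination. All attribute names and the merging rule are illustrative assumptions, not the paper's actual graph-based algorithm:

```python
# Hypothetical sketch: queries at the mediator are joinable when they share a
# key attribute; a transaction is "violating" if some joinable group of
# queries together exposes a prohibited attribute combination.

KEYS = {"ssn"}                                # attributes usable as join keys
PROHIBITED = {frozenset({"name", "salary"})}  # never link name with salary

def violating(queries, keys=KEYS, prohibited=PROHIBITED):
    # Repeatedly merge query attribute sets that share a key attribute.
    linked = [set(q) for q in queries]
    merged = True
    while merged:
        merged = False
        for i, a in enumerate(linked):
            for j, b in enumerate(linked):
                if i < j and a & b & keys:
                    linked[i] = a | b
                    del linked[j]
                    merged = True
                    break
            if merged:
                break
    return any(any(p <= group for p in prohibited) for group in linked)

# Each query alone is harmless; linking them on ssn leaks name with salary.
print(violating([{"ssn", "name"}, {"ssn", "salary"}]))   # True
print(violating([{"ssn", "name"}, {"dept", "salary"}]))  # False
```

The mediator-side countermeasure described above then amounts to adding access control rules that make at least one query in every such group unanswerable.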
Preventing Inferences through Data Dependencies on Sensitive Data
Simply restricting computation to the non-sensitive part of the data may still lead to inferences about sensitive data through data dependencies. Inference control based on data dependencies has been studied in prior work. However, existing solutions either detect and deny queries that may lead to leakage, resulting in poor utility, or protect only against exact reconstruction of the sensitive data, resulting in poor security. In this paper, we present a novel security model called full deniability. Under this stronger security model, any information inferred about sensitive data from non-sensitive data is considered a leakage. We describe algorithms for efficiently implementing full deniability on a given database instance with a set of data dependencies and sensitive cells. Using experiments on two different datasets, we demonstrate that our approach protects against realistic adversaries while hiding only a minimal number of additional non-sensitive cells, and that it scales well with database size and the amount of sensitive data.
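The cell-hiding idea behind full deniability can be sketched as a fixpoint over dependencies: whenever a dependency's visible source cells still constrain a hidden cell, hide sources too. The dependency format and cell names are assumptions, and the sketch deliberately over-approximates by hiding all sources of a triggering dependency rather than computing a minimal cut, unlike the paper's algorithms:

```python
# Illustrative sketch of cell hiding for full deniability (names assumed).
# Each dependency says: the values of `sources` constrain the `target` cell.
DEPENDENCIES = [
    ({"zip", "birthdate"}, "identity"),
    ({"identity"}, "diagnosis"),
]

def cells_to_hide(sensitive, dependencies):
    """Start from the sensitive cells; while some dependency into a hidden
    cell has all of its sources visible, hide those sources as well (a
    coarse over-approximation of the minimal hiding set)."""
    hidden = set(sensitive)
    changed = True
    while changed:
        changed = False
        for sources, target in dependencies:
            if target in hidden and not sources & hidden:
                hidden |= sources
                changed = True
    return hidden

# Hiding the diagnosis alone is not enough: identity, and in turn the
# quasi-identifiers that determine it, must be hidden too.
print(sorted(cells_to_hide({"diagnosis"}, DEPENDENCIES)))
```

In this toy instance the fixpoint hides identity, zip, and birthdate alongside the sensitive diagnosis cell, which is exactly the kind of transitive inference channel the full-deniability model rules out.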