PrivLava: Synthesizing Relational Data with Foreign Keys under Differential Privacy
Answering database queries while preserving privacy is an important problem
that has attracted considerable research attention in recent years. A canonical
approach to this problem is to use synthetic data. That is, we replace the
input database R with a synthetic database R* that preserves the
characteristics of R, and use R* to answer queries. Existing solutions for
relational data synthesis, however, either fail to provide strong privacy
protection, or assume that R contains a single relation. In addition, it is
challenging to extend the existing single-relation solutions to the case of
multiple relations, because they are unable to model the complex correlations
induced by the foreign keys. Therefore, multi-relational data synthesis with
strong privacy guarantees is an open problem. In this paper, we address the
above open problem by proposing PrivLava, the first solution for synthesizing
relational data with foreign keys under differential privacy, a rigorous
privacy framework widely adopted in both academia and industry. The key idea of
PrivLava is to model the data distribution in R using graphical models, with
latent variables included to capture the inter-relational correlations caused
by foreign keys. We show that PrivLava supports arbitrary foreign key
references that form a directed acyclic graph, and is able to tackle the common
case when R contains a mixture of public and private relations. Extensive
experiments on census data sets and the TPC-H benchmark demonstrate that
PrivLava significantly outperforms its competitors in terms of the accuracy of
aggregate queries processed on the synthetic data. Comment: This is an extended version of a SIGMOD 2023 paper.
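The latent-variable idea behind PrivLava can be illustrated with a deliberately simplified sketch. Everything here is an illustrative assumption rather than PrivLava's actual algorithm: a toy two-relation schema (households referenced by persons via a foreign key), a latent "household size" variable coupling the relations, and a plain noisy-histogram mechanism in place of the paper's graphical models.

```python
import math
import random
from collections import Counter

def laplace(scale):
    """Sample Laplace(0, scale) noise via inverse-CDF (stdlib only)."""
    u = random.random() - 0.5
    return -scale * math.copysign(math.log(1 - 2 * abs(u)), u)

def dp_histogram(values, domain, epsilon):
    """Noisy histogram under epsilon-DP: each record falls in exactly one
    bin, so the L1 sensitivity is 1 and Laplace(1/epsilon) noise per bin
    suffices. Negative noisy counts are clipped to zero."""
    counts = Counter(values)
    return {v: max(0.0, counts.get(v, 0) + laplace(1.0 / epsilon))
            for v in domain}

def synthesize(households, persons, epsilon, n_synthetic, max_size=5):
    """Toy two-relation synthesizer. `households` maps household_id ->
    region; `persons` is a list of (person_id, household_id) rows that
    reference households through a foreign key. The latent household
    size couples the two relations: we release a noisy joint
    (region, size) histogram and resample synthetic rows from it."""
    size_of = Counter(hid for _, hid in persons)
    joint = [(region, min(size_of.get(hid, 0), max_size))
             for hid, region in households.items()]
    domain = [(r, s) for r in set(households.values())
              for s in range(max_size + 1)]
    hist = dp_histogram(joint, domain, epsilon)
    cells, weights = zip(*hist.items())
    if not any(weights):  # all counts noised to zero: fall back to uniform
        weights = [1.0] * len(cells)
    syn_households, syn_persons, pid = [], [], 0
    for hid in range(n_synthetic):
        region, size = random.choices(cells, weights=weights)[0]
        syn_households.append((hid, region))
        for _ in range(size):
            syn_persons.append((pid, hid))
            pid += 1
    return syn_households, syn_persons
```

By construction every synthetic person row references a synthetic household, so referential integrity holds, and the correlation between a household's region and its size survives in the noisy joint histogram — the role played by latent variables in the full method.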
Subsidiary Entrepreneurial Alertness: Antecedents and Outcomes
This thesis brings together concepts from both international business and entrepreneurship to develop a framework of the facilitators of subsidiary innovation and performance. This study proposes that Subsidiary Entrepreneurial Alertness (SEA) facilitates the recognition of opportunities (the origin of subsidiary initiatives). First introduced by Kirzner (1979) in the context of the individual, entrepreneurial alertness (EA) is the ability to notice an opportunity without actively searching. Similar to entrepreneurial alertness at the individual level, this study argues that SEA enables the subsidiary to best select opportunities based on the resources available. The research further develops our conceptualisation of SEA by drawing on work by Tang et al. (2012) identifying three distinct activities of EA: scanning and search (identifying opportunities unseen by others due to their awareness gaps), association and connection of information, and evaluation and judgement to interpret or anticipate the future viability of opportunities. This study then hypothesises that SEA leads to opportunity recognition at the subsidiary level, and further hypothesises innovation and performance as outcomes of opportunity recognition. This research brings these arguments together to develop and test a comprehensive theoretical model.
The theoretical model is tested through a mail survey of the CEOs/MDs of foreign subsidiaries within the Republic of Ireland (an innovative hub for foreign subsidiaries). A survey was selected as the best method to reach the targeted respondents, whose depth of knowledge allows the research questions to be answered more substantially. The results were examined using partial least squares structural equation modelling (PLS-SEM). The study’s findings confirm that two critical aspects of subsidiary context, subsidiary brokerage and subsidiary credibility, are positively related to SEA. The study establishes a positive link between SEA and both the generation of innovation and the subsidiary’s performance. This thesis makes three significant contributions to the subsidiary literature: it 1) introduces and develops the concept of SEA, 2) identifies the antecedents of SEA, and 3) demonstrates the impact of SEA on subsidiary opportunity recognition. Implications for subsidiaries, headquarters, and policy makers are discussed, along with the limitations of the study.
Bayesian Optimization with Conformal Prediction Sets
Bayesian optimization is a coherent, ubiquitous approach to decision-making
under uncertainty, with applications including multi-arm bandits, active
learning, and black-box optimization. Bayesian optimization selects decisions
(i.e. objective function queries) with maximal expected utility with respect to
the posterior distribution of a Bayesian model, which quantifies reducible,
epistemic uncertainty about query outcomes. In practice, subjectively
implausible outcomes can occur regularly for two reasons: 1) model
misspecification and 2) covariate shift. Conformal prediction is an uncertainty
quantification method with coverage guarantees even for misspecified models and
a simple mechanism to correct for covariate shift. We propose conformal
Bayesian optimization, which directs queries towards regions of search space
where the model predictions have guaranteed validity, and investigate its
behavior on a suite of black-box optimization tasks and tabular ranking tasks.
In many cases we find that query coverage can be significantly improved without
harming sample efficiency. Comment: For code, see
https://www.github.com/samuelstanton/conformal-bayesopt.gi
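The coverage mechanism the abstract relies on can be illustrated with a standard split conformal prediction interval. This is a generic sketch of conformal prediction, not the paper's conformal Bayesian optimization procedure; the function names and the constant-width interval construction are illustrative assumptions.

```python
import math

def conformal_interval(cal_pairs, predict, x_new, alpha=0.1):
    """Split conformal prediction interval around a point prediction.

    cal_pairs: held-out (x, y) calibration data NOT used to fit `predict`.
    Under exchangeability, the returned interval has >= 1 - alpha marginal
    coverage even when `predict` is badly misspecified -- the property
    that makes conformal sets attractive for guiding queries.
    """
    # Nonconformity scores: absolute residuals on the calibration set.
    scores = sorted(abs(y - predict(x)) for x, y in cal_pairs)
    n = len(scores)
    # Finite-sample corrected quantile: rank ceil((n + 1)(1 - alpha)).
    rank = min(n - 1, math.ceil((n + 1) * (1 - alpha)) - 1)
    q = scores[rank]
    center = predict(x_new)
    return center - q, center + q
```

A query point whose conformal interval is wide signals that the model's predictions there lack calibrated support; directing Bayesian-optimization queries toward regions with valid (well-covered) predictions is the high-level idea of the abstract.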
The Adirondack Chronology
The Adirondack Chronology is intended to be a useful resource for researchers and others interested in the Adirondacks and Adirondack history.
Mixture Models in Machine Learning
Modeling with mixtures is a powerful method in the statistical toolkit that can be used for representing the presence of sub-populations within an overall population. In many applications ranging from financial models to genetics, a mixture model is used to fit the data. The primary difficulty in learning mixture models is that the observed data set does not identify the sub-population to which an individual observation belongs. Despite more than a century of study, theoretical guarantees for mixture models remain unknown in several important settings.
In this thesis, we look at three groups of problems. The first part is aimed at estimating the parameters of a mixture of simple distributions. We ask the following question: How many samples are necessary and sufficient to learn the latent parameters? We propose several approaches for this problem that include complex analytic tools to connect statistical distances between pairs of mixtures with the characteristic function. We show sufficient sample complexity guarantees for mixtures of popular distributions (including Gaussian, Poisson and Geometric). For many distributions, our results provide the first sample complexity guarantees for parameter estimation in the corresponding mixture. Using these techniques, we also provide improved lower bounds on the Total Variation distance between Gaussian mixtures with two components and demonstrate new results in some sequence reconstruction problems.
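The core difficulty named above — the data do not reveal which component generated each observation — is what the classical EM algorithm works around by treating the labels as latent. The sketch below is a generic EM baseline for a two-component 1D Gaussian mixture, not the characteristic-function machinery developed in the thesis; fixing the weights at 1/2 and the variances at 1 is a simplifying assumption.

```python
import math
import random

def em_two_gaussians(data, iters=50):
    """EM for a two-component 1D Gaussian mixture with unknown means.

    The latent assignment of each point to a component is exactly the
    unobserved sub-population label: EM alternates soft assignments
    (E-step) with responsibility-weighted mean updates (M-step).
    Mixing weights and variances are fixed at 1/2 and 1 for simplicity.
    """
    mu1, mu2 = min(data), max(data)  # crude but effective initialization
    for _ in range(iters):
        # E-step: posterior responsibility of component 1 for each point.
        resp = []
        for x in data:
            p1 = math.exp(-0.5 * (x - mu1) ** 2)
            p2 = math.exp(-0.5 * (x - mu2) ** 2)
            resp.append(p1 / (p1 + p2))
        # M-step: responsibility-weighted means.
        w1 = sum(resp)
        w2 = len(data) - w1
        mu1 = sum(r * x for r, x in zip(resp, data)) / w1
        mu2 = sum((1 - r) * x for r, x in zip(resp, data)) / w2
    return mu1, mu2
```

On well-separated components EM recovers the means quickly; the sample-complexity question the thesis studies is precisely how much data such procedures need, and when, as the components overlap.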
In the second part, we study Mixtures of Sparse Linear Regressions, where the goal is to learn the best set of linear relationships between the scalar responses (i.e., labels) and the explanatory variables (i.e., features). We focus on a scenario where a learner is able to choose the features to get the labels. To tackle the high dimensionality of data, we further assume that the linear maps are also sparse, i.e., have only a few prominent features among many. For this setting, we devise algorithms with sub-linear (as a function of the dimension) sample complexity guarantees that are also robust to noise.
In the final part, we study Mixtures of Sparse Linear Classifiers in the same setting as above. Given a set of features and the binary labels, the objective of this task is to find a set of hyperplanes in the space of features such that for any (feature, label) pair, there exists a hyperplane in the set that justifies the mapping. We devise efficient algorithms with sub-linear sample complexity guarantees for learning the unknown hyperplanes under similar sparsity assumptions as above. To that end, we propose several novel techniques that include tensor decomposition methods and combinatorial designs.
After Creation: Intergovernmental Organizations and Member State Governments as Co-Participants in an Authority Relationship
This is a re-amalgamation of what started as one manuscript and became two when the length proved to be more than any publisher wanted to consider. The splitting consisted of removing what are now Parts 3, 4, and 5 so that the manuscript focused on the outcome-related shared beliefs holding an authority relationship together. Those parts were last worked on in 2018. The rest were last worked on in late 2021 but also remain incomplete.
The relational approach adopted in this study treats intergovernmental organizations and the governments of member states as co-participants in an authority relationship. Authority relationships link two types of actor, defined by their authority-holder or addressee role in the relationship, through a set of shared beliefs about why the relationship exists and how the participants should fulfill their respective roles. The IGO as authority holder has a role that includes a right to instruct other actors about what they should or should not do; the governments of member states as addressees are expected to comply with the instructions. Three sets of shared beliefs provide the conceptual “glue” holding the relationship together. The first defines the goal of the collective effort, providing both the rationale for having the authority relationship and a lodestar for assessments of the collective effort’s success or lack of success. The second set defines the shared understanding about the allocation of roles and the process of interaction by establishing shared expectations about a) the selection process by which particular actors acquire authority-holder roles, b) the definitions identifying one or more categories of addressees expected to follow instructions, and c) the procedures through which the authority holder issues instructions. The third set focuses on the outcomes of cooperation through the relationship by defining a) the substantive areas in which the authority holder may issue instructions, b) the bases for assessing the relevance of actions mandated in instructions for reaching the goal, and c) the relative efficacy of the chosen action paths for reaching the goal as compared to other possible action paths.
Using an authority relationship framework to analyze cooperation through IGOs highlights the inherently bi-directional nature of IGO-member government activity by viewing their interaction as a three-step process: the IGO as authority holder decides when to issue what instruction; the member state governments as followers react to the instruction with anything from prompt and full compliance, through various forms of pushback, to outright rejection; and the IGO as authority holder responds to the followers’ reactions with efforts to increase individual compliance with instructions and reinforce continuing acceptance of the authority relationship. Foregrounding the dynamics produced by the interaction of these two streams of perception and action reveals more clearly the extent to which intergovernmental organizations acquire the capacity to operate as independent actors, the dynamic ways they maintain that capacity, and how much they influence member governments’ beliefs and actions at different times. The approach fosters a better understanding of why, when, and for how long governments choose cooperation through an IGO even in periods of rising unilateralism.
Reliable Decision-Making with Imprecise Models
The rapid growth in the deployment of autonomous systems across various sectors has generated considerable interest in how these systems can operate reliably in large, stochastic, and unstructured environments. Despite recent advances in artificial intelligence and machine learning, it is challenging to assure that autonomous systems will operate reliably in the open world. One of the causes of unreliable behavior is the impreciseness of the model used for decision-making. Due to the practical challenges in data collection and precise model specification, autonomous systems often operate based on models that do not represent all the details in the environment. Even if the system has access to a comprehensive decision-making model that accounts for all the details in the environment and all possible scenarios the agent may encounter, it may be intractable to solve this complex model optimally. Consequently, this complex, high-fidelity model may be simplified to accelerate planning, introducing imprecision. Reasoning with such imprecise models affects the reliability of autonomous systems. A system's actions may sometimes produce unexpected, undesirable consequences, which are often identified after deployment. How can we design autonomous systems that can operate reliably in the presence of uncertainty and model imprecision?
This dissertation presents solutions to address three classes of model imprecision in a Markov decision process, along with an analysis of the conditions under which bounded performance can be guaranteed. First, an adaptive outcome selection approach is introduced to devise risk-aware reduced models of the environment that efficiently balance the trade-off between model simplicity and fidelity, to accelerate planning in resource-constrained settings. Second, a framework that extends the stochastic shortest path formulation to problems with imperfect information about the goal state during planning is introduced, along with two solution approaches to solve this problem. Finally, two complementary solution approaches are presented to minimize the negative side effects of agent actions. The techniques presented in this dissertation enable an autonomous system to detect and mitigate undesirable behavior, without redesigning the model entirely.
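The Markov decision process setting referred to above can be made concrete with standard value iteration, shown below. This is a textbook sketch, not the dissertation's reduced-model or stochastic-shortest-path algorithms; the function signatures and the tiny example MDP are illustrative assumptions. A "reduced model" in the sense described would swap in a simpler `transition` function, trading fidelity for planning speed.

```python
def value_iteration(states, actions, transition, reward, gamma=0.95, tol=1e-6):
    """Standard value iteration for a finite MDP.

    transition(s, a) -> list of (next_state, probability) pairs;
    reward(s, a) -> immediate reward for taking action a in state s.
    Iterates Bellman backups in place until the largest update
    falls below `tol`, then returns the value function.
    """
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(
                reward(s, a) + gamma * sum(p * V[s2] for s2, p in transition(s, a))
                for a in actions
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V
```

On a two-state example where state 1 is absorbing with no reward and "go" moves state 0 to state 1 for a reward of 1, the fixed point is V(0) = 1, V(1) = 0, which the iteration reaches in a few sweeps.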