
    Beyond Classical Statistics: Optimality In Transfer Learning And Distributed Learning

    In modern statistical learning practice, statisticians deal with increasingly large, complicated, and structured data sets. Better-structured data and powerful data-analytic resources open new opportunities in the learning process, but large data sets also pose new challenges arising from limited computation, limited communication resources, or privacy concerns. Within a decision-theoretic framework, statistical optimality must therefore be reconsidered for new types of data and new constraints. Under the framework of minimax theory, this thesis addresses the following four problems:
    1. The first part develops an optimality theory for transfer learning in nonparametric classification, and establishes a near-optimal adaptive classifier.
    2. The second part studies distributed Gaussian mean estimation with known variance under communication constraints; the exact distributed minimax rate of convergence is derived under three different communication protocols.
    3. The third part studies distributed Gaussian mean estimation with unknown variance under communication constraints; the results show that the additional communication cost depends on the type of underlying communication protocol.
    4. The fourth part investigates the minimax optimality and the communication cost of adaptation for distributed nonparametric function estimation under communication constraints.
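    To make the communication-constrained setting concrete, here is a toy simulation (illustrative only; it is not one of the thesis's protocols): each of m machines observes n Gaussian samples and may transmit only a b-bit quantization of its local mean, which the server averages. The names and the quantization range are assumptions of the sketch.

```python
# Toy illustration of distributed Gaussian mean estimation under a b-bit
# communication budget per machine (not the thesis's actual protocols).
import numpy as np

def local_message(samples, b, lo=-4.0, hi=4.0):
    """Quantize the local sample mean to one of 2**b levels on [lo, hi]."""
    levels = 2 ** b
    mean = np.clip(samples.mean(), lo, hi)
    return round((mean - lo) / (hi - lo) * (levels - 1))  # the b-bit message

def server_estimate(messages, b, lo=-4.0, hi=4.0):
    """Decode each b-bit message and average across machines."""
    levels = 2 ** b
    decoded = [lo + idx / (levels - 1) * (hi - lo) for idx in messages]
    return float(np.mean(decoded))

rng = np.random.default_rng(0)
theta, sigma, m, n, b = 0.7, 1.0, 50, 100, 6
msgs = [local_message(rng.normal(theta, sigma, n), b) for _ in range(m)]
print(server_estimate(msgs, b))  # close to theta; error grows as b shrinks
```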

    Privacy-Preserving Outsourcing of Large-Scale Nonlinear Programming to the Cloud

    The massive data generated by various sources has given rise to big data analytics. Solving large-scale nonlinear programming problems (NLPs) is one important big data analytics task with applications in many domains, such as transport and logistics. However, NLPs are usually too computationally expensive for resource-constrained users. Fortunately, cloud computing provides an alternative and economical way for such users to outsource their computation tasks to the cloud. A major concern with outsourcing NLPs, however, is the leakage of the user's private information contained in NLP formulations and results. Although much work has been done on privacy-preserving outsourcing of computation tasks, little attention has been paid to NLPs. In this paper, we investigate, for the first time, secure outsourcing of general large-scale NLPs with nonlinear constraints. A secure and efficient transformation scheme at the user side is proposed to protect the user's private information; at the cloud side, the generalized reduced gradient method is applied to efficiently solve the transformed large-scale NLPs. The proposed protocol is implemented on a cloud computing testbed. Experimental evaluations demonstrate that significant time can be saved for users and that the proposed mechanism has potential for practical use.
    Comment: Ang Li and Wei Du contributed equally to this work. This work was done when Wei Du was at the University of Arkansas. 2018 EAI International Conference on Security and Privacy in Communication Networks (SecureComm)
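    To illustrate the general outsourcing-by-transformation idea, here is a minimal sketch under assumed details (it does not reproduce the paper's transformation scheme or its generalized reduced gradient solver): the user masks an example NLP with a secret invertible affine change of variables, lets the "cloud" solve the transformed problem, and recovers the true solution locally. The objective, the map, and the use of SciPy's BFGS solver are all illustrative assumptions.

```python
# Sketch of privacy-preserving outsourcing via a secret affine transformation
# x = A @ y + c (illustrative only; not the paper's actual scheme).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)

def f(x):  # user's private objective (example NLP)
    return (x[0] - 3.0) ** 2 + (x[1] + 1.0) ** 2 + 0.1 * x[0] * x[1]

# Secret transformation known only to the user.
A = rng.normal(size=(2, 2)) + 2 * np.eye(2)  # well-conditioned, invertible
c = rng.normal(size=2)

def f_masked(y):  # what the cloud sees: f composed with the secret map
    return f(A @ y + c)

# "Cloud side": solve the transformed problem without seeing the original
# coordinates or the solution in them.
y_star = minimize(f_masked, x0=np.zeros(2), method="BFGS").x

# "User side": undo the transformation to recover the private solution.
x_star = A @ y_star + c
print(x_star)  # approximately the minimizer of the original problem
```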

    Privacy-Preserving Distributed Optimization via Subspace Perturbation: A General Framework

    As the modern world becomes increasingly digitized and interconnected, distributed signal processing has proven effective at processing large volumes of data. However, a main challenge limiting the broad use of distributed signal processing techniques is the issue of privacy when handling sensitive data. To address this issue, we propose a novel yet general subspace perturbation method for privacy-preserving distributed optimization, which allows each node to obtain the desired solution while protecting its private data. In particular, we show that the dual variables introduced by each distributed optimizer do not converge in a certain subspace determined by the graph topology, while the optimization variable is guaranteed to converge to the desired solution because it is orthogonal to this non-convergent subspace. We therefore propose to insert noise in the non-convergent subspace through the dual variables, so that the private data are protected while the accuracy of the desired solution is completely unaffected. Moreover, the proposed method is shown to be secure under two widely used adversary models: passive and eavesdropping. Furthermore, we instantiate the method with several distributed optimizers, such as ADMM and PDMM, to demonstrate its general applicability. Finally, we test its performance on a set of applications. Numerical tests indicate that the proposed method is superior to existing methods in terms of estimation accuracy, privacy level, communication cost, and convergence rate.
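    The key property, that noise injected through the dual variables can leave the primal solution untouched, can be seen in a toy consensus-ADMM example (a simplified stand-in, not the paper's graph-based subspace construction): the dual variables are initialized with large random noise, yet the consensus variable still converges to the exact average, because the iteration averages the perturbation out.

```python
# Toy illustration of dual-variable perturbation in consensus ADMM
# (illustrative; not the paper's PDMM construction).
import numpy as np

rng = np.random.default_rng(2)
a = rng.normal(size=10)  # each node's private value
rho = 1.0

# Dual variables initialized with large random noise (the "perturbation").
u = 100.0 * rng.normal(size=a.size)
z = 0.0
for _ in range(200):
    x = (2 * a + rho * (z - u)) / (2 + rho)  # local primal updates
    z = np.mean(x + u)                       # consensus (z) update
    u = u + x - z                            # dual updates

print(z, a.mean())  # z converges to the true average despite the noise
```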

    Privacy-preserving distributed optimization using homomorphic encryption

    This paper studies how a system operator and a set of agents can securely execute a distributed projected gradient-based algorithm. In particular, each participant holds a set of problem coefficients and/or states whose values are private to the data owner. This raises two questions: how to securely compute given functions, and which functions should be computed in the first place. For the first question, using techniques from homomorphic encryption, we propose novel algorithms that achieve secure multiparty computation with perfect correctness. For the second question, we identify a class of functions that can be securely computed. The correctness and computational efficiency of the proposed algorithms are verified by two case studies from power systems, one on a demand response problem and the other on an optimal power flow problem.
    Comment: 24 pages, 5 figures, journal
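    The additively homomorphic primitive that such protocols build on can be sketched with the third-party python-paillier package (phe); the choice of library is an assumption here, and the paper's full multiparty protocol is considerably more involved. Each agent encrypts its private local gradient, and the operator sums the ciphertexts without learning any individual term.

```python
# Minimal sketch of additively homomorphic aggregation with Paillier,
# using the python-paillier ("phe") package (an assumed library choice).
from phe import paillier

public_key, private_key = paillier.generate_paillier_keypair(n_length=1024)

local_gradients = [0.25, -1.5, 0.75]        # each agent's private value
ciphertexts = [public_key.encrypt(g) for g in local_gradients]

# Homomorphic aggregation: summing ciphertexts adds the plaintexts,
# so no individual gradient is ever revealed to the aggregator.
encrypted_sum = ciphertexts[0] + ciphertexts[1] + ciphertexts[2]

print(private_key.decrypt(encrypted_sum))   # -0.5, the true gradient sum
```

    In a full protocol the secret key would be held by a designated party (or shared among parties), so that the aggregator itself can never decrypt.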

    Scather: programming with multi-party computation and MapReduce

    We present a prototype of a distributed computational infrastructure, an associated high-level programming language, and an underlying formal framework that together allow multiple parties to leverage their own cloud-based computational resources (capable of supporting MapReduce [27] operations) in concert with multi-party computation (MPC) to execute statistical analysis algorithms with privacy-preserving properties. Our architecture allows a data analyst unfamiliar with MPC to (1) author an analysis algorithm that is agnostic with regard to data privacy policies, (2) use an automated process to derive algorithm implementation variants with different privacy and performance properties, and (3) compile those implementation variants so that they can be deployed on an infrastructure that allows computations to take place locally within each participant's MapReduce cluster as well as across all the participants' clusters using an MPC protocol. We describe implementation details of the architecture, discuss and demonstrate how the formal framework enables the exploration of trade-offs between the efficiency and privacy properties of an analysis algorithm, and present two example applications that illustrate how such an infrastructure can be used in practice.
    This work was supported in part by NSF Grants #1430145, #1414119, #1347522, and #1012798.
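    A minimal sketch of the MPC primitive underlying such systems is additive secret sharing over a prime field, used below to sum per-party totals that each party first computes locally in a MapReduce-like stage. The field size, names, and structure are illustrative assumptions, not Scather's actual language or runtime.

```python
# Sketch of additive secret sharing for a private multi-party sum
# (illustrative only; not Scather's actual protocol or language).
import random

P = 2**61 - 1  # prime modulus for the sharing field

def share(value, n_parties):
    """Split value into n additive shares that sum to value mod P."""
    shares = [random.randrange(P) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % P)
    return shares

# "Map" stage: each party aggregates its own data locally.
local_totals = [123, 456, 789]  # private per-party results

# MPC stage: each party splits its total into shares, one per party;
# each party sums the shares it receives, and only the share-sums are
# combined, so no individual total is ever revealed.
n = len(local_totals)
all_shares = [share(t, n) for t in local_totals]
share_sums = [sum(all_shares[i][j] for i in range(n)) % P for j in range(n)]
print(sum(share_sums) % P)  # 1368, the sum of all private totals
```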