32 research outputs found

    The Capacity of Private Information Retrieval from Decentralized Uncoded Caching Databases

    We consider the private information retrieval (PIR) problem from decentralized uncoded caching databases. Our problem setting has two phases: a caching phase and a retrieval phase. In the caching phase, the system contains a data center holding all $K$ files, each of size $L$ bits, and several databases, each with a storage size constraint of $\mu K L$ bits. Each database independently chooses $\mu K L$ bits out of the total $KL$ bits at the data center to cache, using the same probability distribution, in a decentralized manner. In the retrieval phase, a user (retriever) accesses $N$ databases in addition to the data center and wishes to retrieve a desired file privately. We characterize the optimal normalized download cost to be $\frac{D}{L} = \sum_{n=1}^{N+1} \binom{N}{n-1} \mu^{n-1} (1-\mu)^{N+1-n} \left( 1+ \frac{1}{n} + \dots+ \frac{1}{n^{K-1}} \right)$. We show that the uniform and random caching scheme originally proposed for decentralized coded caching by Maddah-Ali and Niesen, together with the Sun and Jafar retrieval scheme originally proposed for PIR from replicated databases, surprisingly results in the lowest normalized download cost. This is the decentralized counterpart of the recent result of Attia, Kumar and Tandon for the centralized case. The converse proof contains several ingredients, such as an interference lower bound, an induction lemma, replacing the query and answer-string random variables with the content of the distributed databases, the nature of decentralized uncoded caching databases, and bit marginalization of joint caching distributions.
    Comment: Submitted for publication, November 201
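    The download-cost expression in the abstract above can be evaluated numerically. The sketch below is only an illustration of that formula: the function and parameter names are hypothetical and do not come from the paper. One way to read the sum is that the binomial weight is the probability that a bit is cached by exactly $n-1$ of the $N$ databases (so it is available at $n$ nodes, counting the data center), and the inner geometric sum is the Sun and Jafar download cost for $n$ replicated databases and $K$ files.

```python
from math import comb


def normalized_download_cost(num_databases, num_files, mu):
    """Evaluate D/L from the abstract for N databases, K files, caching ratio mu.

    Names are illustrative assumptions; only the formula comes from the abstract.
    """
    if not 0.0 <= mu <= 1.0:
        raise ValueError("mu must lie in [0, 1]")
    total = 0.0
    for n in range(1, num_databases + 2):
        # Binomial weight: C(N, n-1) * mu^(n-1) * (1-mu)^(N+1-n)
        weight = (
            comb(num_databases, n - 1)
            * mu ** (n - 1)
            * (1 - mu) ** (num_databases + 1 - n)
        )
        # Geometric sum 1 + 1/n + ... + 1/n^(K-1)
        geometric = sum((1.0 / n) ** j for j in range(num_files))
        total += weight * geometric
    return total


# Two boundary cases: nothing cached (mu = 0) and everything cached (mu = 1).
cost_no_cache = normalized_download_cost(2, 3, 0.0)
cost_full_cache = normalized_download_cost(2, 3, 1.0)
```

    With `mu = 0` only the data center holds the files, so the expression collapses to $K$ (download everything); with `mu = 1` it collapses to the replicated-database cost for $N+1$ nodes.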

    Security and Privacy in Information Retrieval Combined with Distributed Computing and Caching (분산 컴퓨팅과 캐시를 접목한 정보 검색에서의 보안 및 프라이버시)

    Thesis (Ph.D.), Seoul National University Graduate School, College of Engineering, Department of Electrical and Computer Engineering, 2020. 2. Jungwoo Lee (이정우).
    Distributed systems are essential for storing and computing over large amounts of data. While such systems raise the efficiency of data storage and computation, they also increase the risks to data security and privacy. This dissertation considers data security and privacy in distributed systems for data storage and data computation, and in particular proposes coding schemes that guarantee security and privacy in these systems. First, it proposes cache-aided PIR, in which the user has stored a certain amount of data in a cache in advance. The proposed scheme is based on the optimal scheme for the conventional PIR problem. In the proposed scheme, the cached data is used as side information, which reduces the amount of download compared with the case without a cache. Second, it considers the master's privacy in coded distributed computing systems. In this system the workers and the master each hold their own data, and the workers' data takes the form of a library. The master must compute a function of its own data and a specific item in the data library. Here, the master's privacy means that the workers do not know which item in the library the master wants. Such a system is called private coded computation, and the proposed scheme is called private polynomial codes. The proposed scheme introduces an asynchronous technique that was not considered in conventional polynomial codes. As a result, the proposed scheme achieves a faster computation time than the modified optimal RPIR scheme. Finally, it considers the master's privacy and data security simultaneously in coded distributed computing systems. Data security means protecting the master's own data from the workers. Such a system is called private secure coded computation, and the proposed scheme is called private secure polynomial codes. Private secure polynomial codes are obtained by modifying private polynomial codes, and the two are compared in terms of computation time and communication load. As a result, for the same communication load, private secure polynomial codes achieve a faster computation time.
As the dominant data format shifts from text to video, the amount of memory needed to store data increases exponentially, as does the amount of computation needed to handle it. To alleviate these storage and computation burdens, distributed systems are actively studied. Meanwhile, because low latency is one of the main objectives of 5G communications, techniques such as edge computing and federated learning have become important. Since decentralization is a fundamental characteristic of these techniques, distributed systems, which include decentralized systems, have become important as well. In this dissertation, I consider distributed systems for storage and computation. For data storage, large-scale data centers collectively store a library of files, each of which is very large. When a user needs a specific file, it can be downloaded from the distributed data centers. In this system, minimizing the amount of download is a significant concern. User privacy in this system means that the user conceals the index of its desired file from the databases. This problem is referred to as the private information retrieval (PIR) problem. The goal of the PIR problem is to minimize the amount of download from the databases while ensuring the user's privacy. Meanwhile, for a large computation, the user can divide the whole computation into sub-computations and distribute them to external workers that constitute a distributed system. There are three cases for the computation. First, the user may own all of the data to be computed on and send both its data and the computation instructions to the workers.
Second, the workers may collectively own all of the data, and the user only sends instructions for data selection and computation to the workers. Third, the user and the workers may each have their own data, and the user sends its data together with instructions for data selection and computation to the workers. Since some of the workers can be slow for various reasons, the user may use a coding technique, e.g., an erasure code, to avoid the delay caused by slow workers. This technique is referred to as coded computation. In these systems, speeding up the computation is a significant concern. In this dissertation, I focus on the third system. In the considered system, privacy is similar to that of distributed systems for storage. Security, on the other hand, means that the user conceals the content of its own data from the workers, so that the workers obtain no information about the user's own data. In this dissertation, I consider the user's privacy in distributed systems for storage, and both privacy and security in distributed systems for computation. In distributed systems for storage, the user does not have its own data, so the security of the user's data cannot be considered. In particular, I propose achievable schemes that ensure privacy and security in these systems. To begin with, as a new variation of the PIR problem, I consider a user's cache that holds some pre-stored data from the databases' library. I refer to this problem as the cache-aided PIR problem. Introducing the user's cache into the PIR problem significantly reduces the amount of download from the databases. The achievable scheme is based on the optimal scheme for the conventional PIR problem. In the achievable scheme, the pre-stored cache is exploited as side information, which reduces the amount of download compared to the PIR problem without a cache.
Second, I consider the master's privacy in coded computation. In the system model, the workers have their own data, as does the master. The workers' data constitutes a library of several files. The master should compute a function of its own data and a specific file in the library. The master's privacy means that the workers should not know which file in the library is desired by the master. I refer to this problem as private coded computation and propose an achievable scheme for it, namely private polynomial codes. Private polynomial codes are based on the polynomial codes proposed for the conventional coded computation system. In the achievable scheme, the workers are grouped for privacy and an asynchronous scheme is used, which was not considered in conventional polynomial codes. Due to the asynchronous scheme, the proposed scheme achieves a faster computation time than the modified optimal RPIR scheme. Lastly, I consider data security in coded computation, as well as the master's privacy. The system model is similar to that of private coded computation. Data security means that the master should protect its own data from the workers. I refer to this problem as private secure coded computation and propose an achievable scheme, namely private secure polynomial codes. Private secure polynomial codes are based on the polynomial codes proposed for the conventional coded computation system and are obtained by modifying private polynomial codes. Private secure polynomial codes and private polynomial codes are compared in terms of computation time and communication load. As a result, private secure polynomial codes achieve a faster computation time for the same communication load.

Contents:
1. Introduction
  1.1 Related work
    1.1.1 Private information retrieval
    1.1.2 Coded computation
  1.2 Contributions and Organization
2. Cache-aided Private Information Retrieval
  2.1 Introduction
  2.2 System model
  2.3 Main results
  2.4 Achievable scheme
    2.4.1 Cacheless phase
    2.4.2 Cache-assisted phase
    2.4.3 Cache-aided PIR
  2.5 Tightness of achievable scheme
  2.6 Conclusions and follow-up works
3. Private Coded Computation
  3.1 Introduction
  3.2 System model
  3.3 Main results
  3.4 Private polynomial codes
    3.4.1 First example
    3.4.2 Second example
    3.4.3 General description
    3.4.4 Privacy proof
    3.4.5 Performance analysis
    3.4.6 Special cases
  3.5 Simulation results
    3.5.1 Computation time
    3.5.2 Communication load
  3.6 Conclusion
4. Private Secure Coded Computation
  4.1 Introduction
  4.2 Main results
  4.3 Private secure polynomial codes
    4.3.1 Illustrative example
    4.3.2 General description
    4.3.3 Performance analysis
    4.3.4 Privacy and security proof
  4.4 Simulation results
    4.4.1 Computation time
    4.4.2 Communication load
  4.5 Conclusion
5. Conclusion
  5.1 Summary
  5.2 Future directions
Abstract (in Korean)
Acknowledgement
Docto
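
    The abstract above mentions that an erasure code lets the user tolerate slow workers (coded computation). As a rough illustration of that idea only, the sketch below uses a toy (3, 2) parity code for a matrix-vector product: any two of three workers suffice. This is an assumed textbook-style example with hypothetical names; it is not the private polynomial codes proposed in the dissertation and provides no privacy or security.

```python
def matvec(rows, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in rows]


def encode(a1, a2):
    """Give workers w1 and w2 the two row blocks and w3 their element-wise sum."""
    parity = [[p + q for p, q in zip(r1, r2)] for r1, r2 in zip(a1, a2)]
    return {"w1": a1, "w2": a2, "w3": parity}


def decode(results):
    """Recover [A1 x, A2 x] from the results of any two of the three workers."""
    if len(results) < 2:
        raise ValueError("need results from at least two workers")
    if "w1" in results and "w2" in results:
        y1, y2 = results["w1"], results["w2"]
    elif "w1" in results:
        # w2 is missing: A2 x = (A1 + A2) x - A1 x
        y1 = results["w1"]
        y2 = [p - a for p, a in zip(results["w3"], y1)]
    else:
        # w1 is missing: A1 x = (A1 + A2) x - A2 x
        y2 = results["w2"]
        y1 = [p - b for p, b in zip(results["w3"], y2)]
    return y1 + y2


a1 = [[1, 2], [3, 4]]
a2 = [[5, 6], [7, 8]]
x = [2, 1]

# Pretend worker w2 is a straggler and never responds.
tasks = encode(a1, a2)
results = {name: matvec(block, x) for name, block in tasks.items() if name != "w2"}
y = decode(results)
```

    The master pays for one extra worker but no longer has to wait for the slowest one, which is the trade-off that coded computation exploits.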

    Information-Theoretically Private Federated Submodel Learning with Storage Constrained Databases

    In federated submodel learning (FSL), a machine learning model is divided into multiple submodels based on the different types of data used for training. Each user involved in the training process downloads and updates only the submodel relevant to the user's local data, which significantly reduces the communication cost compared to classical federated learning (FL). However, the index of the submodel updated by the user and the values of the updates reveal information about the user's private data. To guarantee information-theoretic privacy in FSL, the model is stored at multiple non-colluding databases, and the user sends queries and updates to each database in such a way that no information is revealed about the index of the updated submodel or the values of the updates. In this work, we consider the practical scenario where the non-colluding databases are allowed to have arbitrary storage constraints. The goal of this work is to develop read-write schemes and storage mechanisms for FSL that efficiently use the available storage in each database to store the submodel parameters, so that the total communication cost is minimized while guaranteeing information-theoretic privacy of the updated submodel index and the values of the updates. As the main result, we consider both heterogeneous and homogeneous storage-constrained databases and propose private read-write and storage schemes for the two cases.
    Comment: arXiv admin note: text overlap with arXiv:2302.0367

    Quantum Symmetric Private Information Retrieval with Secure Storage and Eavesdroppers

    We consider both the classical and quantum variations of $X$-secure, $E$-eavesdropped and $T$-colluding symmetric private information retrieval (SPIR). This is the first work to study SPIR with $X$-security in classical or quantum variations. We first develop a scheme for classical $X$-secure, $E$-eavesdropped and $T$-colluding SPIR (XSETSPIR) based on a modified version of cross subspace alignment (CSA), which achieves a rate of $R = 1 - \frac{X+\max(T,E)}{N}$. The modified scheme achieves the same rate as the scheme used for $X$-secure PIR, with the extra benefit of symmetric privacy. Next, we extend this scheme to its quantum counterpart based on the $N$-sum box abstraction. This is the first work to consider the presence of eavesdroppers in quantum private information retrieval (QPIR). In the quantum variation, the eavesdroppers have better access to information over the quantum channel than over the classical channel due to the over-the-air decodability. To that end, we develop another scheme specialized to combat eavesdroppers over quantum channels. The scheme proposed for $X$-secure, $E$-eavesdropped and $T$-colluding quantum SPIR (XSETQSPIR) in this work maintains the super-dense coding gain from the shared entanglement between the databases, i.e., achieves a rate of $R_Q = \min\left\{ 1, 2\left(1-\frac{X+\max(T,E)}{N}\right)\right\}$.
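
    The two rate expressions in the abstract above are easy to compare side by side. The sketch below simply evaluates them with exact fractions. The function names are hypothetical, and the check that $N > X + \max(T,E)$ is an assumption added here so the rate stays positive; the abstract does not state the valid parameter range.

```python
from fractions import Fraction


def classical_rate(n, x, t, e):
    """R = 1 - (X + max(T, E)) / N, as stated in the abstract."""
    overhead = x + max(t, e)
    if n <= overhead:
        # Assumed validity condition, not taken from the abstract.
        raise ValueError("this sketch assumes N > X + max(T, E)")
    return 1 - Fraction(overhead, n)


def quantum_rate(n, x, t, e):
    """R_Q = min{1, 2 * (1 - (X + max(T, E)) / N)}, as stated in the abstract."""
    return min(Fraction(1), 2 * classical_rate(n, x, t, e))


# Example parameters chosen only for illustration.
r = classical_rate(10, 2, 3, 1)
r_q = quantum_rate(10, 2, 3, 1)
```

    For these example parameters the quantum rate is twice the classical rate, capped at 1, which is the super-dense coding gain the abstract refers to.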