    Coding for Privacy in Distributed Computing

    In a distributed computing network, multiple devices combine their resources to solve a problem. The network can thereby achieve more than the sum of its parts: cooperation lets the devices compute more efficiently than any single device could, and even solve problems that none of them could solve alone. However, devices that take exceptionally long to finish their tasks can substantially increase the overall latency of the computation. This so-called straggler effect can arise from random events such as memory access and background tasks running on the devices. It typically stalls the whole network because the remaining devices must wait for the stragglers to finish. Furthermore, sharing data and intermediate results among devices can severely strain the communication network. Especially in a wireless network where devices have to share a common channel, e.g., in edge computing and federated learning, the communication links often become the bottleneck. Last but not least, offloading data to untrusted devices raises privacy concerns. A participant in the distributed computing network might be wary of sharing personal data with other devices without adequately protecting sensitive information. This thesis analyses how ideas from coding theory can mitigate the straggler effect, reduce the communication load, and guarantee data privacy in distributed computing.
    In particular, Part A gives background on edge computing and federated learning, two popular instances of distributed computing; on linear regression, a common problem solved by distributed computing; and on the specific ideas from coding theory that are proposed to tackle the problems arising in distributed computing. Part B contains the research papers written in the framework of this thesis. The papers propose schemes that combine these coding-theoretic ideas to minimize the overall latency while preserving data privacy in edge computing and federated learning. The proposed schemes significantly outperform state-of-the-art schemes. For example, a scheme from Paper I achieves an 8% speed-up for edge computing compared to a recently proposed non-private scheme while guaranteeing data privacy, whereas the schemes from Paper II achieve a speed-up factor of up to 18 for federated learning compared to current schemes in the literature for the considered scenarios. (Doctoral thesis)

    Coding for Straggler Mitigation in Federated Learning

    We present a novel coded federated learning (FL) scheme for linear regression that mitigates the effect of straggling devices while retaining the privacy level of conventional FL. The proposed scheme consists of two phases and combines one-time padding, to preserve privacy, with gradient codes, to yield resiliency against stragglers. In the first phase, the devices share a one-time padded version of their local data with a subset of other devices. In the second phase, the devices and the central server collaboratively and iteratively train a global linear model using gradient codes on the one-time padded local data. To apply one-time padding to real data, our scheme exploits a fixed-point arithmetic representation of the data. Unlike the coded FL scheme recently introduced by Prakash et al., the proposed scheme maintains the same level of privacy as conventional FL while achieving a similar training time. Compared to conventional FL, we show that the proposed scheme achieves a training speed-up factor of 6.6 and 9.2 on the MNIST and Fashion-MNIST datasets for an accuracy of 95% and 85%, respectively.
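
The idea of one-time padding real-valued data through a fixed-point representation can be illustrated with a minimal sketch. This is not the scheme from the paper: the word size, the number of fractional bits, the helper names, and the single inner product used as the "computation" are all assumptions chosen for illustration. The point it shows is that a linear operation on padded data can be corrected exactly by whoever knows the pad.

```python
import secrets

FRACTION_BITS = 8            # assumed precision: values are multiples of 2**-8
MODULUS = 1 << 32            # assumed word size: all arithmetic is modulo 2**32


def to_fixed_point(value: float) -> int:
    """Quantize a real number and map it into the integers modulo MODULUS."""
    return round(value * (1 << FRACTION_BITS)) % MODULUS


def from_fixed_point(element: int, fraction_bits: int) -> float:
    """Read a ring element as a signed (two's complement) fixed-point number."""
    if element >= MODULUS // 2:
        element -= MODULUS
    return element / (1 << fraction_bits)


def one_time_pad(data):
    """Return (padded data, pad); each pad entry is uniform over the ring."""
    pad = [secrets.randbelow(MODULUS) for _ in data]
    padded = [(d + p) % MODULUS for d, p in zip(data, pad)]
    return padded, pad


def inner_product(a, b):
    """Inner product modulo MODULUS."""
    return sum(x * y for x, y in zip(a, b)) % MODULUS


x = [0.5, -1.25, 3.0]        # private local data (illustrative values)
w = [2.0, 0.5, -1.0]         # model weights (illustrative values)

x_fixed = [to_fixed_point(v) for v in x]
w_fixed = [to_fixed_point(v) for v in w]
padded, pad = one_time_pad(x_fixed)

# Another device sees only `padded`, which is uniformly distributed,
# and computes the linear operation on it.
helper_result = inner_product(w_fixed, padded)

# The pad's contribution is removed by a party that knows the pad.
correction = inner_product(w_fixed, pad)
recovered = (helper_result - correction) % MODULUS

# A product of two fixed-point numbers carries 2 * FRACTION_BITS fractional bits.
result = from_fixed_point(recovered, 2 * FRACTION_BITS)
print(result)
```

Working modulo a power of two is what makes the pad perfectly hiding: adding a uniform ring element to any value gives a uniform ring element, which is not possible for unbounded real numbers.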

    Private Edge Computing for Linear Inference Based on Secret Sharing

    We consider an edge computing scenario where users want to perform a linear computation on local, private data and a network-wide, public matrix. The users offload computations to edge servers located at the edge of the network, but do not want the servers, or any other party with access to the wireless links, to gain any information about their data. We provide a scheme that guarantees information-theoretic user data privacy against an eavesdropper with access to a number of edge servers or their corresponding communication links. The proposed scheme utilizes secret sharing and partial replication to provide privacy, mitigate the effect of straggling servers, and allow for joint beamforming opportunities in the download phase, in order to minimize the overall latency, consisting of upload, computation, and download latencies.
    Comment: 6 pages, 4 figures, submitted to the 2020 IEEE Global Communications Conference (IEEE GLOBECOM)

    CodedPaddedFL and CodedSecAgg: Straggler Mitigation and Secure Aggregation in Federated Learning

    We present two novel federated learning (FL) schemes that mitigate the effect of straggling devices by introducing redundancy on the devices' data across the network. Compared to other schemes in the literature, which deal with stragglers or device dropouts by ignoring their contribution, the proposed schemes do not suffer from the client drift problem. The first scheme, CodedPaddedFL, mitigates the effect of stragglers while retaining the privacy level of conventional FL. It combines one-time padding for user data privacy with gradient codes to yield straggler resiliency. The second scheme, CodedSecAgg, provides straggler resiliency and robustness against model inversion attacks and is based on Shamir's secret sharing. We apply CodedPaddedFL and CodedSecAgg to a classification problem. For a scenario with 120 devices, CodedPaddedFL achieves a speed-up factor of 18 for an accuracy of 95% on the MNIST dataset compared to conventional FL. Furthermore, it yields similar performance in terms of latency compared to a recently proposed scheme by Prakash et al. without the shortcoming of additional leakage of private data. CodedSecAgg outperforms the state-of-the-art secure aggregation scheme LightSecAgg by a speed-up factor of 6.6-18.7 for the MNIST dataset for an accuracy of 95%.
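
How a gradient code yields straggler resiliency can be seen in a small textbook-style construction with three workers that tolerates one straggler. The sketch below is an illustration only and is not the code used by CodedPaddedFL: the encoding matrix `B`, the decoding table `DECODE`, and the gradient values are assumptions. Each worker holds two of the three data partitions and sends one linear combination of its partial gradients; any two responses suffice to recover the full gradient sum.

```python
from itertools import combinations

# Row i holds the coefficients worker i applies to the partial gradients
# g1, g2, g3. A zero means the worker does not hold that partition.
B = [
    [0.5, 1.0, 0.0],    # worker 0 sends g1/2 + g2
    [0.0, 1.0, -1.0],   # worker 1 sends g2 - g3
    [0.5, 0.0, 1.0],    # worker 2 sends g1/2 + g3
]

# Decoding coefficients for every pair of workers that may respond first.
DECODE = {
    (0, 1): (2.0, -1.0),
    (0, 2): (1.0, 1.0),
    (1, 2): (1.0, 2.0),
}


def encode(worker, partial_gradients):
    """Linear combination of partial gradients sent by one worker."""
    dim = len(partial_gradients[0])
    return [
        sum(B[worker][k] * partial_gradients[k][d] for k in range(3))
        for d in range(dim)
    ]


def decode(responses):
    """Recover g1 + g2 + g3 from the responses of any two workers."""
    first, second = sorted(responses)
    a, b = DECODE[(first, second)]
    return [a * u + b * v for u, v in zip(responses[first], responses[second])]


# Illustrative partial gradients, one per data partition.
gradients = [[1.0, 2.0], [0.5, -1.0], [4.0, 0.25]]
full_gradient = [sum(g[d] for g in gradients) for d in range(2)]

for pair in combinations(range(3), 2):
    responses = {worker: encode(worker, gradients) for worker in pair}
    print(pair, decode(responses), full_gradient)
```

The price of this resiliency is redundancy: every worker stores and processes two partitions instead of one, so the server no longer has to wait for the slowest worker.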

    Private Edge Computing for Linear Inference Based on Secret Sharing

    We consider an edge computing scenario where users want to perform a linear computation on local, private data and a network-wide, public matrix. Users offload computations to edge servers located at the edge of the network, but do not want the servers, or any other party with access to the wireless links, to gain any information about their data. We provide a scheme that guarantees information-theoretic user data privacy against an eavesdropper with access to a number of edge servers or their corresponding communication links. The novelty of the proposed scheme lies in the utilization of secret sharing and partial replication to provide privacy, mitigate the effect of straggling servers, and allow for joint beamforming opportunities in the download phase, in order to minimize the overall latency, consisting of upload, computation, and download latencies.

    Privacy-Preserving Coded Mobile Edge Computing for Low-Latency Distributed Inference

    We consider a mobile edge computing scenario where a number of devices want to perform a linear inference Wx on some local data x given a network-side matrix W. The computation is performed at the network edge over a number of edge servers. We propose a coding scheme that provides information-theoretic privacy against z colluding (honest-but-curious) edge servers, while minimizing the overall latency (comprising upload, computation, download, and decoding latency) in the presence of straggling servers. The proposed scheme exploits Shamir's secret sharing to yield data privacy and straggler mitigation, combined with replication to provide spatial diversity for the download. We also propose two variants of the scheme that further reduce latency. For a considered scenario with 9 edge servers, the proposed scheme reduces the latency by 8% compared to the nonprivate scheme recently introduced by Zhang and Simeone, while providing privacy against an honest-but-curious edge server.
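
The reason Shamir's secret sharing gives both privacy and straggler mitigation for a linear inference Wx is that the computation commutes with the sharing polynomial. The following is a minimal sketch under stated assumptions and not the scheme from the paper: the prime field size, the evaluation points 1..n, the function names, and the toy values of W and x are all illustrative, and replication, beamforming, and quantization are omitted. Each server multiplies W with its share; any z + 1 responses interpolate to Wx, while any z shares are uniformly distributed and reveal nothing about x.

```python
import secrets
from itertools import combinations

PRIME = 2**31 - 1   # assumed field size; any sufficiently large prime works


def share(x, num_servers, z):
    """Give server i the value f(i) of a random degree-z polynomial with f(0) = x."""
    coeffs = [[secrets.randbelow(PRIME) for _ in x] for _ in range(z)]
    shares = {}
    for point in range(1, num_servers + 1):
        s = list(x)
        for power, c in enumerate(coeffs, start=1):
            factor = pow(point, power, PRIME)
            s = [(si + factor * ci) % PRIME for si, ci in zip(s, c)]
        shares[point] = s
    return shares


def mat_vec(W, v):
    """Matrix-vector product over the prime field."""
    return [sum(w * vi for w, vi in zip(row, v)) % PRIME for row in W]


def reconstruct(responses):
    """Lagrange-interpolate the servers' responses at the point 0."""
    points = list(responses)
    length = len(responses[points[0]])
    result = [0] * length
    for j in points:
        coeff = 1
        for m in points:
            if m != j:
                inverse = pow((m - j) % PRIME, -1, PRIME)
                coeff = coeff * m % PRIME * inverse % PRIME
        result = [(r + coeff * y) % PRIME for r, y in zip(result, responses[j])]
    return result


W = [[1, 2, 3], [0, 1, 0]]   # public matrix (illustrative)
x = [3, 1, 4]                # private user data (illustrative)
n, z = 5, 2                  # 5 servers, privacy against 2 colluding servers

shares = share(x, n, z)
# Every server computes on its share only; W * f(a) is again a degree-z polynomial.
responses = {point: mat_vec(W, s) for point, s in shares.items()}

# Suppose servers 2, 4 and 5 answer first; servers 1 and 3 are stragglers.
fastest = {point: responses[point] for point in (2, 4, 5)}
print(reconstruct(fastest))
```

With n servers and threshold z, the user can ignore up to n - z - 1 stragglers, so choosing z trades privacy against colluding servers for tolerance to slow ones.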

    Straggler-Resilient Secure Aggregation for Federated Learning

    We present CodedSecAgg, a straggler-resilient secure aggregation scheme for federated learning. CodedSecAgg introduces redundancy on the devices' data across the network, which is leveraged during the iterative learning phase at the central server to update the global model based on the responses of a subset of the devices. Compared to other schemes in the literature, which deal with device dropouts by ignoring the contribution of dropped devices, the proposed scheme does not suffer from the client-drift problem. We apply CodedSecAgg to a classification problem on the MNIST dataset. For a scenario with 120 devices, we show that CodedSecAgg outperforms state-of-the-art LightSecAgg in terms of latency by a factor of 6.6 to 15.8, depending on the number of colluding agents, for an accuracy of 95%.
