Distributed online learning has proven highly effective for solving
large-scale machine learning problems over streaming data. However, information
sharing between learners in distributed learning also raises concerns about the
potential leakage of individual learners' sensitive data. To mitigate this
risk, differential privacy, which is widely regarded as the "gold standard" for
privacy protection, has been employed in many existing results on
distributed online learning. However, these results often face a fundamental
tradeoff between learning accuracy and privacy. In this paper, we propose a
locally differentially private gradient-tracking-based distributed online
learning algorithm that circumvents this tradeoff. We prove that
the proposed algorithm converges in mean square to the exact optimal solution
while ensuring rigorous local differential privacy, with the cumulative privacy
budget guaranteed to be finite even when the number of iterations tends to
infinity. The algorithm is applicable even when the communication graph among
learners is directed. To the best of our knowledge, this is the first result
that simultaneously ensures learning accuracy and rigorous local differential
privacy in distributed online learning over directed graphs. We evaluate our
algorithm's performance on multiple benchmark machine-learning
applications, including logistic regression on the "Mushrooms" dataset and
CNN-based image classification on the "MNIST" and "CIFAR-10" datasets.
The experimental results confirm that the proposed algorithm
outperforms existing counterparts in both training and testing accuracies.
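
To make the mechanism concrete, the following is a minimal sketch of one locally differentially private gradient-tracking round. It is not the paper's exact algorithm: it assumes a generic mixing matrix W, constant step size and Laplace noise scale, and illustrative variable names throughout. It only shows the textbook gradient-tracking template and where local-DP noise enters (each learner perturbs its outgoing messages before transmission).

```python
import numpy as np


def ldp_gradient_tracking_step(x, y, W, grads_new, grads_old, step, noise_scale):
    """One illustrative round of noisy gradient tracking (hypothetical sketch).

    x:           (n, d) learner states, one row per learner
    y:           (n, d) gradient-tracking variables
    W:           (n, n) mixing matrix of the communication graph
    grads_new:   (n, d) local gradients at the current states
    grads_old:   (n, d) local gradients at the previous states
    step:        step size
    noise_scale: Laplace scale b; messages are perturbed before being
                 shared, which is where local differential privacy enters
    """
    # Each learner adds independent Laplace noise to the messages it sends,
    # so neighbors never see its exact state or tracking variable.
    x_sent = x + np.random.laplace(scale=noise_scale, size=x.shape)
    y_sent = y + np.random.laplace(scale=noise_scale, size=y.shape)

    # Consensus mixing of the noisy messages, then a descent step along
    # the tracked gradient direction.
    x_next = W @ x_sent - step * y

    # Gradient tracking: y accumulates the change in local gradients so
    # that it tracks the network-wide average gradient over time.
    y_next = W @ y_sent + grads_new - grads_old
    return x_next, y_next


if __name__ == "__main__":
    # Toy demo with f_i(x) = 0.5 * ||x||^2, so grad_i(x) = x.
    rng = np.random.default_rng(0)
    n, d = 4, 3
    W = np.full((n, n), 1.0 / n)   # doubly stochastic here for simplicity
    x = rng.standard_normal((n, d))
    y = x.copy()                   # y initialized to the local gradients
    x, y = ldp_gradient_tracking_step(x, y, W, x, x,
                                      step=0.1, noise_scale=0.05)
```

In the paper's setting, the noise scale and step size would instead be decaying sequences chosen so that the cumulative privacy budget remains finite as iterations tend to infinity while the iterates still converge in mean square, and the mixing weights would be chosen to handle a directed communication graph; the constant-parameter sketch above only illustrates the structure of the update.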