Augmentation Invariant Manifold Learning
Data augmentation is a widely used technique and an essential ingredient in
the recent advance in self-supervised representation learning. By preserving
the similarity between augmented data, the resulting data representation can
improve various downstream analyses and achieve state-of-the-art performance in
many applications. To demystify the role of data augmentation, we develop a
statistical framework on a low-dimensional product manifold to theoretically
understand why the unlabeled augmented data can lead to useful data
representation. Under this framework, we propose a new representation learning
method called augmentation invariant manifold learning and develop the
corresponding loss function, which can work with a deep neural network to learn
data representations. Compared with existing methods, the new data
representation simultaneously exploits the manifold's geometric structure and
the invariance property of augmented data. Our theoretical investigation precisely
characterizes how the data representation learned from augmented data can
improve the k-nearest neighbor classifier in the downstream analysis, showing
that a more complex data augmentation leads to more improvement in downstream
analysis. Finally, numerical experiments on simulated and real datasets are
presented to support the theoretical results in this paper.
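The central idea described in this abstract, learning representations that are invariant to data augmentation, can be illustrated with a toy sketch. This is a minimal illustration, not the paper's actual method: the Gaussian-noise augmentation, the one-layer encoder, and the squared-distance loss below are all hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(x, noise_scale=0.1):
    # Hypothetical augmentation: a small Gaussian perturbation of the input.
    return x + noise_scale * rng.standard_normal(x.shape)

def encode(x, W):
    # Stand-in encoder: a single linear map followed by a tanh nonlinearity
    # (the paper uses a deep neural network).
    return np.tanh(x @ W)

def invariance_loss(x, W):
    # Penalize the distance between representations of two independent
    # augmentations of the same samples; minimizing this encourages the
    # encoder to be invariant to the augmentation.
    z1 = encode(augment(x), W)
    z2 = encode(augment(x), W)
    return np.mean(np.sum((z1 - z2) ** 2, axis=1))

x = rng.standard_normal((32, 8))   # batch of 32 samples in 8 dimensions
W = rng.standard_normal((8, 4))    # encoder weights to a 4-dim representation
print(invariance_loss(x, W))       # small nonnegative scalar
```

Minimizing such a loss over the encoder's weights (here, W) pulls the representations of augmented copies of the same sample together, which is the invariance the abstract refers to.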
On Linear Separation Capacity of Self-Supervised Representation Learning
Recent advances in self-supervised learning have highlighted the efficacy of
data augmentation in learning data representation from unlabeled data. Training
a linear model atop these enhanced representations can yield an adept
classifier. Despite the remarkable empirical performance, the underlying
mechanisms that enable data augmentation to unravel nonlinear data structures
into linearly separable representations remain elusive. This paper seeks to
bridge this gap by investigating under what conditions learned representations
can linearly separate manifolds when data is drawn from a multi-manifold model.
Our investigation reveals that data augmentation offers additional information
beyond observed data and can thus improve the information-theoretic optimal
rate of linear separation capacity. In particular, we show that self-supervised
learning can linearly separate manifolds with a smaller distance than
unsupervised learning, underscoring the additional benefits of data
augmentation. Our theoretical analysis further underscores that the performance
of downstream linear classifiers primarily hinges on the linear separability of
data representations rather than the size of the labeled data set, reaffirming
the viability of constructing efficient classifiers with limited labeled data
amid an expansive unlabeled data set.
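The abstract's claim that downstream performance hinges on linear separability rather than labeled-set size can be illustrated with a toy experiment. This is a hypothetical sketch, not the paper's analysis: the two Gaussian clusters below stand in for linearly separable learned representations, and the classifier is a least-squares linear fit on only five labels per class.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical linearly separable "representations" of two classes, as
# self-supervised learning is argued to produce.
n = 50
z_pos = rng.standard_normal((n, 2)) + np.array([3.0, 3.0])
z_neg = rng.standard_normal((n, 2)) - np.array([3.0, 3.0])
Z = np.vstack([z_pos, z_neg])
y = np.array([1.0] * n + [-1.0] * n)

# Fit a linear classifier using only 5 labeled points per class.
idx = np.r_[0:5, n:n + 5]
A = np.hstack([Z[idx], np.ones((10, 1))])   # add a bias column
w, *_ = np.linalg.lstsq(A, y[idx], rcond=None)

# Evaluate on all points: because the representations are linearly
# separable, the tiny labeled set suffices.
pred = np.sign(np.hstack([Z, np.ones((2 * n, 1))]) @ w)
accuracy = np.mean(pred == y)
print(accuracy)   # high accuracy for well-separated clusters
```

With well-separated clusters, accuracy stays near 1.0 even as the labeled subset shrinks, mirroring the abstract's point that linear separability of the representation, not labeled-data volume, drives the downstream classifier.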
Ball-milled FeP/graphite as a low-cost anode material for the sodium-ion battery
Phosphorus is a promising anode material for sodium batteries with a theoretical capacity of 2596 mA h g⁻¹. However, phosphorus has a low electrical conductivity of 1 × 10⁻¹⁴ S cm⁻¹, which results in poor cycling and rate performances. Even if it is alloyed with conductive Fe, it still delivers a poor electrochemical performance. In this article, a FeP/graphite composite has been synthesized using a simple, cheap, and productive method of low-energy ball-milling, which is an efficient way to improve the electrical conductivity of the FeP compound. The cycling performance was improved significantly, and when the current density increased to 500 mA g⁻¹, the FeP/graphite composite could still deliver 134 mA h g⁻¹, which was more than twice the capacity of the FeP compound alone. Our results suggest that by using a low-energy ball-milling method, a promising FeP/graphite anode material can be synthesized for the sodium battery.