Gradient type optimization methods for electronic structure calculations
The density functional theory (DFT) in electronic structure calculations can
be formulated as either a nonlinear eigenvalue or direct minimization problem.
The most widely used approach for solving the former is the so-called
self-consistent field (SCF) iteration. A common observation is that the
convergence of SCF is theoretically unclear, while approaches with guaranteed
convergence for the latter are often not numerically competitive with SCF.
In this paper, we study gradient-type methods for solving the direct
minimization problem by constructing new iterates along the gradient on the
Stiefel manifold. Global convergence (i.e., convergence to a stationary point
from any initial guess) and the local convergence rate follow directly from
the standard theory of optimization on manifolds. A major computational
advantage is that solving linear eigenvalue problems is no longer needed. The
main costs of our approaches arise from assembling the total energy functional
and its gradient and from the projection onto the manifold. These
tasks are cheaper than eigenvalue computation and they are often more suitable
for parallelization as long as the evaluation of the total energy functional
and its gradient is efficient. Numerical results show that they can outperform
SCF consistently on many large systems of practical interest.
Comment: 24 pages, 11 figures, 59 references, and 1 acknowledgement
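To make the construction concrete, here is a minimal sketch (not the authors' code) of one gradient iteration on the Stiefel manifold St(n, p) = {X : X^T X = I}, with a toy quadratic energy E(X) = tr(X^T A X) standing in for the total energy functional and a QR factorization serving as the projection back onto the manifold:

```python
# Illustrative sketch only: Riemannian gradient descent on the Stiefel
# manifold for a toy quadratic energy E(X) = tr(X^T A X), a stand-in for
# the total energy functional discussed in the abstract.
import numpy as np

def stiefel_gradient_step(A, X, tau):
    """One gradient step followed by a QR-based projection (retraction)."""
    G = 2.0 * A @ X                       # Euclidean gradient of tr(X^T A X)
    sym = (X.T @ G + G.T @ X) / 2.0
    grad = G - X @ sym                    # tangent-space projection (embedded metric)
    Q, R = np.linalg.qr(X - tau * grad)   # step, then retract onto the manifold
    return Q * np.sign(np.diag(R))        # fix column signs for uniqueness

rng = np.random.default_rng(0)
n, p = 50, 4
M = rng.standard_normal((n, n))
A = (M + M.T) / 2.0                       # symmetric toy "Hamiltonian"
X = np.linalg.qr(rng.standard_normal((n, p)))[0]
for _ in range(300):
    X = stiefel_gradient_step(A, X, tau=0.01)
print(np.trace(X.T @ A @ X))              # approaches the sum of the p smallest eigenvalues
```

Note that no linear eigenvalue problem is solved inside the loop; each iteration needs only the gradient and one QR factorization, matching the cost profile described in the abstract.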
Energy-adaptive Riemannian optimization on the Stiefel manifold
This paper addresses the numerical solution of nonlinear eigenvector problems
such as the Gross-Pitaevskii and Kohn-Sham equations arising in computational
physics and chemistry. These problems characterize critical points of energy
minimization problems on the infinite-dimensional Stiefel manifold. To
efficiently compute minimizers, we propose a novel Riemannian gradient descent
method induced by an energy-adaptive metric. Quantified convergence of the
methods is established under suitable assumptions on the underlying problem. A
non-monotone line search and the inexact evaluation of Riemannian gradients
substantially improve the overall efficiency of the method. Numerical
experiments illustrate the performance of the method and demonstrate its
competitiveness with well-established schemes.
Comment: accepted for publication in M2AN
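As a rough illustration of the non-monotone line search mentioned above, the following sketch applies Grippo-style backtracking to plain Riemannian gradient descent on the Stiefel manifold. The energy-adaptive metric of the paper is deliberately replaced by the standard embedded metric, and `energy` and `riem_grad` are assumed user-supplied callables:

```python
# Hedged sketch: non-monotone (max-of-recent-values) backtracking for
# Riemannian gradient descent; the paper's energy-adaptive metric is not
# reproduced here.
import numpy as np

def retract(X, V):
    """QR retraction: map an ambient-space step back onto the Stiefel manifold."""
    Q, _ = np.linalg.qr(X + V)
    return Q

def nonmonotone_descent(energy, riem_grad, X, steps=100,
                        tau0=1.0, beta=0.5, c=1e-4, memory=5):
    """Accept a step if it improves on the max of the last `memory` energies."""
    history = [energy(X)]
    for _ in range(steps):
        G = riem_grad(X)
        ref = max(history[-memory:])      # non-monotone reference value
        tau = tau0
        while energy(retract(X, -tau * G)) > ref - c * tau * np.sum(G * G):
            tau *= beta                   # backtrack
            if tau < 1e-12:
                return X                  # step size collapsed; stop
        X = retract(X, -tau * G)
        history.append(energy(X))
    return X
```

Comparing against the maximum of recent energies (rather than only the last one) lets occasional uphill steps through, which is one reason non-monotone searches can be faster in practice.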
Federated Learning for Sparse Principal Component Analysis
In the rapidly evolving realm of machine learning, algorithm effectiveness
often faces limitations due to data quality and availability. Traditional
approaches grapple with data sharing due to legal and privacy concerns. The
federated learning framework addresses this challenge. Federated learning is a
decentralized approach where model training occurs on client sides, preserving
privacy by keeping data localized. Instead of sending raw data to a central
server, only model updates are exchanged, enhancing data security. We apply
this framework to Sparse Principal Component Analysis (SPCA) in this work. SPCA
aims to attain sparse component loadings while maximizing data variance for
improved interpretability. Besides the L1-norm regularization term in
conventional SPCA, we add a smoothing function to facilitate gradient-based
optimization methods. Moreover, to improve computational efficiency, we
introduce a least squares approximation to the original SPCA. This enables
analytic solutions in the optimization steps, leading to substantial
computational improvements. Within the federated framework, we formulate SPCA
as a consensus optimization problem, which can be solved using the Alternating
Direction Method of Multipliers (ADMM). Our extensive experiments involve both
IID and non-IID random features across various data owners. Results on
synthetic and public datasets affirm the efficacy of our federated SPCA
approach.
Comment: 11 pages, 7 figures, 1 table. Accepted by IEEE BigData 2023, Sorrento, Italy
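The consensus formulation can be visualized with a standard ADMM skeleton. The sketch below is a generic stand-in rather than the paper's algorithm: each client solves a local least-squares subproblem (playing the role of the least squares approximation, with a closed-form solution), only iterates cross the network, and the server soft-thresholds the average to enforce the L1 penalty. The local objectives and the values of `lam` and `rho` are placeholder assumptions:

```python
# Illustrative consensus-ADMM skeleton for an L1-regularized problem of the
# form  min sum_i ||A_i x - b_i||^2 + lam*||z||_1  s.t.  x_i = z.
import numpy as np

def soft_threshold(v, kappa):
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)

def federated_l1_consensus(As, bs, lam=0.1, rho=1.0, iters=100):
    m, n = len(As), As[0].shape[1]
    xs = [np.zeros(n) for _ in range(m)]   # local iterates (stay on clients)
    us = [np.zeros(n) for _ in range(m)]   # scaled dual variables
    z = np.zeros(n)                        # global consensus variable
    for _ in range(iters):
        for i in range(m):                 # closed-form local updates
            lhs = 2.0 * As[i].T @ As[i] + rho * np.eye(n)
            rhs = 2.0 * As[i].T @ bs[i] + rho * (z - us[i])
            xs[i] = np.linalg.solve(lhs, rhs)
        # the server sees only iterates, never raw data
        avg = np.mean([x + u for x, u in zip(xs, us)], axis=0)
        z = soft_threshold(avg, lam / (rho * m))
        for i in range(m):
            us[i] += xs[i] - z             # dual ascent
    return z
```

The z-update is the usual consensus-lasso step: averaging enforces agreement across data owners, while the soft-thresholding produces the sparse loadings that SPCA is after.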
Riemannian Conjugate Gradient Methods: General Framework and Specific Algorithms with Convergence Analyses
Conjugate gradient methods are important first-order optimization algorithms both in Euclidean spaces and on Riemannian manifolds. However, while various types of conjugate gradient methods have been studied in Euclidean spaces, there are relatively few studies of their counterparts on Riemannian manifolds (i.e., Riemannian conjugate gradient methods). This paper proposes a novel general framework that unifies existing Riemannian conjugate gradient methods, such as those that utilize a vector transport or inverse retraction. The proposed framework also yields other methods not covered in previous studies. Furthermore, conditions for the convergence of a class of algorithms in the proposed framework are clarified. Moreover, the global convergence properties of several specific types of algorithms are extensively analyzed. The analysis provides theoretical results for some algorithms in a more general setting than existing studies, as well as new developments for other algorithms. Numerical experiments are performed to confirm the validity of the theoretical results. The experimental results are used to compare the performances of several specific algorithms in the proposed framework.
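For concreteness, here is a hedged sketch of one instance such a framework covers: a Fletcher-Reeves-type Riemannian conjugate gradient iteration on the Stiefel manifold, with a QR retraction and a projection-based vector transport, applied to a toy quadratic energy. A fixed step size stands in for the line search a real implementation would use:

```python
# Illustrative Riemannian CG iteration (Fletcher-Reeves beta) on the
# Stiefel manifold; not tied to any specific algorithm from the paper.
import numpy as np

def proj_tangent(X, V):
    """Project V onto the tangent space of the Stiefel manifold at X."""
    sym = (X.T @ V + V.T @ X) / 2.0
    return V - X @ sym

def riemannian_cg(A, X, steps=200, tau=0.01):
    """Minimize tr(X^T A X) over X^T X = I with a CG-type direction update."""
    grad = proj_tangent(X, 2.0 * A @ X)
    d = -grad
    for _ in range(steps):
        X_new, _ = np.linalg.qr(X + tau * d)           # QR retraction
        grad_new = proj_tangent(X_new, 2.0 * A @ X_new)
        beta = np.sum(grad_new**2) / max(np.sum(grad**2), 1e-30)  # Fletcher-Reeves
        d = -grad_new + beta * proj_tangent(X_new, d)  # projection as vector transport
        X, grad = X_new, grad_new
    return X
```

The line worth noticing is the direction update: on a manifold, the previous direction d lives in the old tangent space and must be transported (here, simply re-projected) before it can be combined with the new gradient, which is exactly the design choice the framework abstracts over.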
Multi-Rank Sparse and Functional PCA: Manifold Optimization and Iterative Deflation Techniques
We consider the problem of estimating multiple principal components using the
recently-proposed Sparse and Functional Principal Components Analysis (SFPCA)
estimator. We first propose an extension of SFPCA which estimates several
principal components simultaneously using manifold optimization techniques to
enforce orthogonality constraints. While effective, this approach is
computationally burdensome, so we also consider iterative deflation approaches
which take advantage of existing fast algorithms for rank-one SFPCA. We show
that alternative deflation schemes can more efficiently extract signal from the
data, in turn improving estimation of subsequent components. Finally, we
compare the performance of our manifold optimization and deflation techniques
in a scenario where orthogonality does not hold and find that they still lead
to significantly improved performance.
Comment: To appear in IEEE CAMSAP 2019
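To make "alternative deflation schemes" concrete, the sketch below contrasts two classical choices, Hotelling's deflation and projection deflation, on a plain covariance matrix. The rank-one solver is abstracted by a placeholder `leading_component` (here just the leading eigenvector; the paper's fast rank-one SFPCA solver would take its place):

```python
# Minimal sketch of iterative deflation for extracting several components;
# `leading_component` is a stand-in for a rank-one sparse/functional solver.
import numpy as np

def leading_component(S):
    w, V = np.linalg.eigh(S)
    return V[:, -1]                        # leading eigenvector

def hotelling_deflation(S, v):
    lam = v @ S @ v
    return S - lam * np.outer(v, v)        # subtract the explained rank-one part

def projection_deflation(S, v):
    P = np.eye(len(v)) - np.outer(v, v)    # project out the direction v
    return P @ S @ P                       # stays positive semidefinite

def extract_components(S, k, deflate=projection_deflation):
    comps = []
    for _ in range(k):
        v = leading_component(S)
        comps.append(v)
        S = deflate(S, v)                  # remove the found signal, then repeat
    return np.column_stack(comps)
```

The practical difference appears when the returned v is only approximately an eigenvector (as with sparse estimates): Hotelling's update can reintroduce parts of v into later components, while projection deflation removes that direction entirely, which is the kind of effect the comparison above is probing.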
Stochastic First-Order Learning for Large-Scale Flexibly Tied Gaussian Mixture Model
Gaussian Mixture Models (GMMs) are among the most potent kernel-based
parametric density estimators and find application in many scientific domains.
In recent years, with the dramatic growth of data sources, typical machine
learning algorithms, e.g., Expectation Maximization (EM), encounter difficulty
with high-dimensional and streaming data. Moreover,
complicated densities often demand a large number of Gaussian components. This
paper proposes a fast online parameter estimation algorithm for GMM by using
first-order stochastic optimization. This approach provides a framework to cope
with the challenges of GMM when faced with high-dimensional streaming data and
complex densities by leveraging the flexibly-tied factorization of the
covariance matrix. A new stochastic manifold optimization algorithm that
preserves orthogonality is introduced and used alongside well-known Euclidean
numerical optimization. Numerous empirical results on both synthetic and real
datasets justify the effectiveness of the proposed stochastic method over
EM-based methods in terms of a better-converged likelihood maximum, fewer
epochs needed for convergence, and less time consumption per epoch.
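As a simplified picture of what a first-order stochastic update for a GMM looks like (a generic sketch with identity covariances, not the paper's flexibly-tied parameterization), the step below descends the minibatch negative log-likelihood in the means and in softmax-parameterized mixture weights:

```python
# Hedged sketch: one SGD step on the GMM negative log-likelihood with
# identity covariances; responsibilities give the gradients in closed form.
import numpy as np

def gmm_sgd_step(X, mus, logits, lr=0.05):
    """X: (n, d) minibatch; mus: (k, d) means; logits: (k,) weight parameters."""
    n = X.shape[0]
    pis = np.exp(logits - logits.max())
    pis /= pis.sum()                                   # softmax mixture weights
    diff = X[:, None, :] - mus[None, :, :]             # (n, k, d)
    logp = np.log(pis)[None, :] - 0.5 * (diff ** 2).sum(-1)
    logp -= logp.max(axis=1, keepdims=True)            # stabilize
    r = np.exp(logp)
    r /= r.sum(axis=1, keepdims=True)                  # responsibilities (n, k)
    grad_mus = -np.einsum('nk,nkd->kd', r, diff) / n   # d(NLL)/d(mu_k)
    grad_logits = -(r - pis[None, :]).mean(axis=0)     # d(NLL)/d(logits)
    return mus - lr * grad_mus, logits - lr * grad_logits
```

Because each step touches only a minibatch, the update streams over data that EM would have to sweep in full, which is the regime the abstract targets; a flexibly-tied covariance factor would add an orthogonal matrix updated by the stochastic manifold step described above.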