568 research outputs found
SACOBRA with Online Whitening for Solving Optimization Problems with High Conditioning
Real-world optimization problems often have expensive objective functions in terms of cost and time. It is desirable to find near-optimal solutions with very few function evaluations. Surrogate-assisted optimizers tend to reduce the required number of function evaluations by replacing the real function with an efficient mathematical model built on few evaluated points. Problems with a high condition number are a challenge for many surrogate-assisted optimizers including SACOBRA. To address such problems we propose a new online whitening operating in the black-box optimization paradigm. We show on a set of high-conditioning functions that online whitening tackles SACOBRA's early stagnation issue and reduces the optimization error by a factor between 10 to 1e12 as compared to the plain SACOBRA, though it imposes many extra function evaluations. Covariance matrix adaptation evolution strategy (CMA-ES) has for very high numbers of function evaluations even lower errors, whereas SACOBRA performs better in the expensive setting (less than 1e03 function evaluations). If we count all parallelizable function evaluations (population evaluation in CMA-ES, online whitening in our approach) as one iteration, then both algorithms have comparable strength even on the long run. This holds for problems with dimension D Algorithms and the Foundations of Software technolog
Orthogonal SVD Covariance Conditioning and Latent Disentanglement
Inserting an SVD meta-layer into neural networks is prone to make the
covariance ill-conditioned, which could harm the model in the training
stability and generalization abilities. In this paper, we systematically study
how to improve the covariance conditioning by enforcing orthogonality to the
Pre-SVD layer. Existing orthogonal treatments on the weights are first
investigated. However, these techniques can improve the conditioning but would
hurt the performance. To avoid such a side effect, we propose the Nearest
Orthogonal Gradient (NOG) and Optimal Learning Rate (OLR). The effectiveness of
our methods is validated in two applications: decorrelated Batch Normalization
(BN) and Global Covariance Pooling (GCP). Extensive experiments on visual
recognition demonstrate that our methods can simultaneously improve covariance
conditioning and generalization. The combinations with orthogonal weight can
further boost the performance. Moreover, we show that our orthogonality
techniques can benefit generative models for better latent disentanglement
through a series of experiments on various benchmarks. Code is available at:
\href{https://github.com/KingJamesSong/OrthoImproveCond}{https://github.com/KingJamesSong/OrthoImproveCond}.Comment: Accepted by IEEE T-PAMI. arXiv admin note: substantial text overlap
with arXiv:2207.0211
Applying neural networks for improving the MEG inverse solution
Magnetoencephalography (MEG) and electroencephalography (EEG) are appealing non-invasive methods for recording brain activity with high temporal resolution. However, locating the brain source currents from recordings picked up by the sensors on the scalp introduces an ill-posed inverse problem. The MEG inverse problem one of the most difficult inverse problems in medical imaging. The current standard in approximating the MEG inverse problem is to use multiple distributed inverse solutions – namely dSPM, sLORETA and L2 MNE – to estimate the source current distribution in the brain. This thesis investigates if these inverse solutions can be "post-processed" by a neural network to provide improved accuracy on source locations.
Recently, deep neural networks have been used to approximate other ill-posed inverse medical imaging problems with accuracy comparable to current state-of- the-art inverse reconstruction algorithms. Neural networks are powerful tools for approximating problems with limited prior knowledge or problems that require high levels of abstraction. In this thesis a special case of a deep convolutional network, the U-Net, is applied to approximate the MEG inverse problem using the standard inverse solutions (dSPM, sLORETA and L2 MNE) as inputs.
The U-Net is capable of learning non-linear relationships between the inputs and producing predictions about the site of single-dipole activation with higher accuracy than the L2 minimum-norm based inverse solutions with the following resolution metrics: dipole localization error (DLE), spatial dispersion (SD) and overall amplitude (OA). The U-Net model is stable and performs better in aforesaid resolution metrics than the inverse solutions with multi-dipole data previously unseen by the U-Net
Guaranteed Non-Orthogonal Tensor Decomposition via Alternating Rank- Updates
In this paper, we provide local and global convergence guarantees for
recovering CP (Candecomp/Parafac) tensor decomposition. The main step of the
proposed algorithm is a simple alternating rank- update which is the
alternating version of the tensor power iteration adapted for asymmetric
tensors. Local convergence guarantees are established for third order tensors
of rank in dimensions, when and the tensor
components are incoherent. Thus, we can recover overcomplete tensor
decomposition. We also strengthen the results to global convergence guarantees
under stricter rank condition (for arbitrary constant ) through a simple initialization procedure where the algorithm is
initialized by top singular vectors of random tensor slices. Furthermore, the
approximate local convergence guarantees for -th order tensors are also
provided under rank condition . The guarantees also
include tight perturbation analysis given noisy tensor.Comment: We have added an additional sub-algorithm to remove the (approximate)
residual error left after the tensor power iteratio
- …