
    SACOBRA with Online Whitening for Solving Optimization Problems with High Conditioning

    Real-world optimization problems often have objective functions that are expensive in terms of cost and time, so it is desirable to find near-optimal solutions with very few function evaluations. Surrogate-assisted optimizers reduce the required number of function evaluations by replacing the real function with an efficient mathematical model built on a few evaluated points. Problems with a high condition number are a challenge for many surrogate-assisted optimizers, including SACOBRA. To address such problems we propose a new online whitening method operating in the black-box optimization paradigm. We show on a set of high-conditioning functions that online whitening tackles SACOBRA's early stagnation issue and reduces the optimization error by a factor between 10 and 10^12 compared to plain SACOBRA, though it imposes many extra function evaluations. Covariance matrix adaptation evolution strategy (CMA-ES) achieves even lower errors for very high numbers of function evaluations, whereas SACOBRA performs better in the expensive setting (fewer than 10^3 function evaluations). If we count all parallelizable function evaluations (population evaluation in CMA-ES, online whitening in our approach) as one iteration, then both algorithms have comparable strength even in the long run. This holds for problems with dimension D.
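    The whitening idea admits a compact illustration. Below is a minimal sketch, assuming the Hessian at the current best point supplies the extra (parallelizable) function evaluations the abstract mentions; the helper names and step sizes are illustrative, not SACOBRA's actual interface.

```python
# Minimal sketch of online whitening for a black-box objective f, assuming
# the Hessian at the current best point xc is estimated by central finite
# differences (the extra, parallelizable evaluations the abstract mentions).
# `estimate_hessian` and `whitened` are illustrative names, not SACOBRA's API.
import numpy as np

def estimate_hessian(f, xc, h=1e-4):
    """Central finite-difference Hessian of f at xc; costs O(d^2) evaluations."""
    d = len(xc)
    H = np.zeros((d, d))
    for i in range(d):
        for j in range(i, d):
            ei, ej = np.eye(d)[i] * h, np.eye(d)[j] * h
            H[i, j] = (f(xc + ei + ej) - f(xc + ei - ej)
                       - f(xc - ei + ej) + f(xc - ei - ej)) / (4 * h * h)
            H[j, i] = H[i, j]
    return H

def whitened(f, xc, H):
    """Return g(z) = f(xc + M z) with M = H^(-1/2), so g has condition ~1."""
    w, V = np.linalg.eigh(H)
    w = np.maximum(w, 1e-12)            # guard against non-positive curvature
    M = V @ np.diag(w ** -0.5) @ V.T    # inverse matrix square root
    return lambda z: f(xc + M @ z)

# Example: a quadratic with condition number 1e6 becomes spherical.
f = lambda x: 0.5 * x @ np.diag([1.0, 1e6]) @ x
g = whitened(f, np.zeros(2), estimate_hessian(f, np.zeros(2)))
```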

    Orthogonal SVD Covariance Conditioning and Latent Disentanglement

    Inserting an SVD meta-layer into neural networks is prone to making the covariance ill-conditioned, which can harm the model's training stability and generalization ability. In this paper, we systematically study how to improve the covariance conditioning by enforcing orthogonality on the Pre-SVD layer. We first investigate existing orthogonal treatments of the weights; these techniques improve the conditioning but hurt performance. To avoid this side effect, we propose the Nearest Orthogonal Gradient (NOG) and the Optimal Learning Rate (OLR). The effectiveness of our methods is validated in two applications: decorrelated Batch Normalization (BN) and Global Covariance Pooling (GCP). Extensive experiments on visual recognition demonstrate that our methods can simultaneously improve covariance conditioning and generalization. Combining them with orthogonal weights further boosts performance. Moreover, we show through a series of experiments on various benchmarks that our orthogonality techniques can benefit generative models via better latent disentanglement. Code is available at https://github.com/KingJamesSong/OrthoImproveCond.
    Comment: Accepted by IEEE T-PAMI. arXiv admin note: substantial text overlap with arXiv:2207.0211
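    For intuition, here is a hedged sketch of the orthogonality projection underlying NOG-style updates: the nearest orthogonal matrix to a matrix G in Frobenius norm is the polar factor U V^T of its SVD. Whether it is applied to a gradient (as in NOG) or a weight, and the conditioning monitor below, are assumptions of this toy illustration.

```python
# Hedged sketch of the orthogonal projection behind NOG-style updates: the
# nearest orthogonal matrix to G in Frobenius norm is the polar factor U V^T
# of its SVD. The covariance-conditioning helper is an assumption of this
# illustration, not the paper's code.
import torch

def nearest_orthogonal(G: torch.Tensor) -> torch.Tensor:
    """argmin_Q ||Q - G||_F subject to Q^T Q = I, via the SVD polar factor."""
    U, _, Vh = torch.linalg.svd(G, full_matrices=False)
    return U @ Vh

def covariance_condition(X: torch.Tensor) -> torch.Tensor:
    """Condition number of the feature covariance X X^T / n (monitoring only)."""
    s = torch.linalg.svdvals(X @ X.T / X.shape[1])
    return s[0] / s[-1].clamp_min(1e-12)

# Toy check: an orthogonal weight leaves the feature-covariance conditioning
# unchanged, while a random square weight typically inflates it.
W, X = torch.randn(64, 64), torch.randn(64, 1024)
print(covariance_condition(W @ X), covariance_condition(nearest_orthogonal(W) @ X))
```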

    Applying neural networks for improving the MEG inverse solution

    Magnetoencephalography (MEG) and electroencephalography (EEG) are appealing non-invasive methods for recording brain activity with high temporal resolution. However, locating the brain source currents from recordings picked up by sensors on the scalp is an ill-posed inverse problem; the MEG inverse problem is one of the most difficult inverse problems in medical imaging. The current standard for approximating the MEG inverse problem is to use multiple distributed inverse solutions (namely dSPM, sLORETA and L2 MNE) to estimate the source current distribution in the brain. This thesis investigates whether these inverse solutions can be "post-processed" by a neural network to provide improved accuracy on source locations. Recently, deep neural networks have been used to approximate other ill-posed inverse problems in medical imaging with accuracy comparable to current state-of-the-art inverse reconstruction algorithms. Neural networks are powerful tools for approximating problems with limited prior knowledge or problems that require high levels of abstraction. In this thesis a special case of a deep convolutional network, the U-Net, is applied to approximate the MEG inverse problem using the standard inverse solutions (dSPM, sLORETA and L2 MNE) as inputs, as sketched below. The U-Net learns non-linear relationships between the inputs and predicts the site of single-dipole activation with higher accuracy than the L2 minimum-norm based inverse solutions, as measured by the following resolution metrics: dipole localization error (DLE), spatial dispersion (SD) and overall amplitude (OA). The U-Net model is stable and outperforms the inverse solutions on these resolution metrics on multi-dipole data previously unseen by the U-Net.
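    A minimal sketch of this post-processing setup follows, assuming the three source estimates are resampled onto a regular 2D grid and stacked as input channels; the grid size, depth, and channel counts are illustrative assumptions, not the thesis's actual architecture.

```python
# Minimal sketch: stack the three standard inverse solutions (dSPM, sLORETA,
# L2 MNE) as input channels of a tiny U-Net that predicts a cleaned source
# map. All sizes below are assumptions for illustration.
import torch
import torch.nn as nn

def block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    def __init__(self, in_ch=3, base=16):
        super().__init__()
        self.enc1, self.enc2 = block(in_ch, base), block(base, base * 2)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(base * 2, base, 2, stride=2)
        self.dec = block(base * 2, base)    # skip connection doubles channels
        self.head = nn.Conv2d(base, 1, 1)   # single-channel source map

    def forward(self, x):
        e1 = self.enc1(x)                   # full resolution
        e2 = self.enc2(self.pool(e1))       # half resolution
        d = self.up(e2)                     # back to full resolution
        return self.head(self.dec(torch.cat([d, e1], dim=1)))

# Example: dSPM/sLORETA/MNE maps on a 64x64 grid -> refined source estimate.
net = TinyUNet()
maps = torch.randn(1, 3, 64, 64)
print(net(maps).shape)  # torch.Size([1, 1, 64, 64])
```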

    Guaranteed Non-Orthogonal Tensor Decomposition via Alternating Rank-1 Updates

    In this paper, we provide local and global convergence guarantees for recovering CP (Candecomp/Parafac) tensor decompositions. The main step of the proposed algorithm is a simple alternating rank-1 update, the alternating version of the tensor power iteration adapted for asymmetric tensors. Local convergence guarantees are established for third-order tensors of rank $k$ in $d$ dimensions when $k = o(d^{1.5})$ and the tensor components are incoherent; thus, we can recover overcomplete tensor decompositions. We also strengthen the results to global convergence guarantees under the stricter rank condition $k \le \beta d$ (for an arbitrary constant $\beta > 1$) through a simple initialization procedure in which the algorithm is initialized by the top singular vectors of random tensor slices. Furthermore, approximate local convergence guarantees for $p$-th order tensors are provided under the rank condition $k = o(d^{p/2})$. The guarantees also include a tight perturbation analysis for noisy tensors.
    Comment: We have added an additional sub-algorithm to remove the (approximate) residual error left after the tensor power iteration.
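    The alternating rank-1 update itself is short enough to sketch. The toy version below omits the paper's random-slice SVD initialization and the deflation across components, starting instead from random vectors, which is an assumption of this illustration.

```python
# Toy version of the alternating rank-1 update (asymmetric tensor power
# iteration) for a third-order tensor T. Random initialization replaces the
# paper's random-slice initialization; deflation is omitted. Helper names
# are assumed, not the authors' code.
import numpy as np

def rank1_power_iteration(T, iters=50, seed=0):
    """Recover one rank-1 component (lam, u, v, w) of T by alternating updates."""
    rng = np.random.default_rng(seed)
    u, v, w = (rng.standard_normal(d) for d in T.shape)
    for _ in range(iters):
        u = np.einsum('ijk,j,k->i', T, v, w); u /= np.linalg.norm(u)
        v = np.einsum('ijk,i,k->j', T, u, w); v /= np.linalg.norm(v)
        w = np.einsum('ijk,i,j->k', T, u, v); w /= np.linalg.norm(w)
    lam = np.einsum('ijk,i,j,k->', T, u, v, w)   # recovered component weight
    return lam, u, v, w

# Sanity check on an exactly rank-1 tensor: the weight 2.0 is recovered.
rng = np.random.default_rng(1)
a, b, c = (x / np.linalg.norm(x) for x in
           (rng.standard_normal(5), rng.standard_normal(6), rng.standard_normal(7)))
T = 2.0 * np.einsum('i,j,k->ijk', a, b, c)
lam, u, v, w = rank1_power_iteration(T)
print(round(float(lam), 3))  # ~ +/-2.0, up to the sign of the recovered vectors
```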