57 research outputs found

    An Algorithm with Optimal Dimension-Dependence for Zero-Order Nonsmooth Nonconvex Stochastic Optimization

    We study the complexity of producing $(\delta,\epsilon)$-stationary points of Lipschitz objectives which are possibly neither smooth nor convex, using only noisy function evaluations. Recent works proposed several stochastic zero-order algorithms that solve this task, all of which suffer from a dimension-dependence of $\Omega(d^{3/2})$, where $d$ is the dimension of the problem, which was conjectured to be optimal. We refute this conjecture by providing a faster algorithm that has complexity $O(d\delta^{-1}\epsilon^{-3})$, which is optimal (up to numerical constants) with respect to $d$ and also optimal with respect to the accuracy parameters $\delta,\epsilon$, thus solving an open question due to Lin et al. (NeurIPS'22). Moreover, the convergence rate achieved by our algorithm is also optimal for smooth objectives, proving that in the nonconvex stochastic zero-order setting, nonsmooth optimization is as easy as smooth optimization. We provide algorithms that achieve the aforementioned convergence rate in expectation as well as with high probability. Our analysis is based on a simple yet powerful geometric lemma regarding the Goldstein-subdifferential set, which allows utilizing recent advancements in first-order nonsmooth nonconvex optimization. Comment: 12 pages
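
    For reference, the notion of $(\delta,\epsilon)$-stationarity used in this abstract is usually defined through the Goldstein $\delta$-subdifferential. The block below is a sketch of that standard definition; the symbols $\partial_\delta f$, $\mathbb{B}_\delta(x)$ and $\partial f$ (the Clarke subdifferential) follow common convention and are assumptions, not notation taken from this listing.

```latex
% Goldstein delta-subdifferential of a Lipschitz function f at x:
% the convex hull of all Clarke subgradients taken within distance delta of x.
\[
  \partial_\delta f(x) \;:=\; \mathrm{conv}\Big( \bigcup_{y \in \mathbb{B}_\delta(x)} \partial f(y) \Big),
  \qquad
  \mathbb{B}_\delta(x) := \{\, y : \|y - x\| \le \delta \,\}.
\]
% x is (delta, epsilon)-stationary when this set contains an element of norm at most epsilon.
\[
  x \text{ is } (\delta,\epsilon)\text{-stationary}
  \quad\Longleftrightarrow\quad
  \min_{g \in \partial_\delta f(x)} \|g\| \;\le\; \epsilon .
\]
```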

    From Tempered to Benign Overfitting in ReLU Neural Networks

    Overparameterized neural networks (NNs) are observed to generalize well even when trained to perfectly fit noisy data. This phenomenon motivated a large body of work on "benign overfitting", where interpolating predictors achieve near-optimal performance. Recently, it was conjectured and empirically observed that the behavior of NNs is often better described as "tempered overfitting", where the performance is non-optimal yet also non-trivial, and degrades as a function of the noise level. However, a theoretical justification of this claim for non-linear NNs has been lacking so far. In this work, we provide several results that aim at bridging these complementary views. We study a simple classification setting with 2-layer ReLU NNs, and prove that under various assumptions, the type of overfitting transitions from tempered in the extreme case of one-dimensional data, to benign in high dimensions. Thus, we show that the input dimension plays a crucial role in the type of overfitting in this setting, which we also validate empirically for intermediate dimensions. Overall, our results shed light on the intricate connections between the dimension, sample size, architecture and training algorithm on the one hand, and the type of resulting overfitting on the other hand. Comment: 43 pages

    Relationship obsessive-compulsive disorder: interference, symptoms, and maladaptive beliefs

    BACKGROUND: Obsessive preoccupation, doubts, and compulsive behaviors focusing on one's romantic relationship and partner are receiving increasing clinical, theoretical, and empirical attention. Commonly referred to as relationship obsessive-compulsive disorder (ROCD), such symptoms have been linked with decreased relational and sexual functioning and lower mood, even after controlling for other obsessive-compulsive disorder (OCD) symptoms. To date, however, these symptoms have been studied in community samples alone. In the present study, we compared levels of interference, OCD, and mood symptoms between clinical participants with ROCD, OCD, and community controls. We also examined group differences in maladaptive beliefs previously linked with OCD and ROCD. METHOD: Participants included 22 ROCD clients, 22 OCD clients, and 28 community controls. The Mini International Neuropsychiatric Interview was used to attain clinical diagnoses of OCD and ROCD. The Yale-Brown Obsessive-Compulsive Scale was used to evaluate primary-symptom severity. All participants completed measures of symptoms and dysfunctional beliefs. RESULTS: ROCD clients reported more severe ROCD symptoms than the OCD and control groups. ROCD and OCD clients did not differ in the severity of their primary symptoms. ROCD clients scored higher than the other groups on maladaptive OCD-related and relationship-related beliefs. Finally, ROCD clients showed more severe depression symptoms than community controls. CONCLUSION: ROCD is a disabling presentation of OCD that warrants research attention. Maladaptive OCD-related and relationship-related beliefs may be implicated in the development and maintenance of ROCD.

    Deterministic Nonsmooth Nonconvex Optimization

    We study the complexity of optimizing nonsmooth nonconvex Lipschitz functions by producing $(\delta,\epsilon)$-stationary points. Several recent works have presented randomized algorithms that produce such points using $\tilde O(\delta^{-1}\epsilon^{-3})$ first-order oracle calls, independent of the dimension $d$. It has been an open problem as to whether a similar result can be obtained via a deterministic algorithm. We resolve this open problem, showing that randomization is necessary to obtain a dimension-free rate. In particular, we prove a lower bound of $\Omega(d)$ for any deterministic algorithm. Moreover, we show that unlike smooth or convex optimization, access to function values is required for any deterministic algorithm to halt within any finite time. On the other hand, we prove that if the function is even slightly smooth, then the dimension-free rate of $\tilde O(\delta^{-1}\epsilon^{-3})$ can be obtained by a deterministic algorithm with merely a logarithmic dependence on the smoothness parameter. Motivated by these findings, we turn to study the complexity of deterministically smoothing Lipschitz functions. Though there are efficient black-box randomized smoothings, we start by showing that no such deterministic procedure can smooth functions in a meaningful manner, resolving an open question. We then bypass this impossibility result for the structured case of ReLU neural networks. To that end, in a practical white-box setting in which the optimizer is granted access to the network's architecture, we propose a simple, dimension-free, deterministic smoothing that provably preserves $(\delta,\epsilon)$-stationary points. Our method applies to a variety of architectures of arbitrary depth, including ResNets and ConvNets. Combined with our algorithm, this yields the first deterministic dimension-free algorithm for optimizing ReLU networks, circumventing our lower bound. Comment: This work supersedes arxiv:2209.12463 and arxiv:2209.10346 [Section 3], with major additional results

    Non-empirical prediction of the length-dependent ionization potential in molecular chains

    The ionization potential of molecular chains is well known to be a tunable nano-scale property that exhibits clear quantum confinement effects. State-of-the-art methods can accurately predict the ionization potential in the small molecule limit and in the solid-state limit, but for intermediate, nano-sized systems prediction of the evolution of the electronic structure between the two limits is more difficult. Recently, optimal tuning of range-separated hybrid functionals has emerged as a highly accurate method for predicting ionization potentials. This was first achieved for molecules using the ionization potential theorem (IPT) and more recently extended to solid-state systems, based on an ansatz that generalizes the IPT to the removal of charge from a localized Wannier function. Here, we study one-dimensional molecular chains of increasing size, from the monomer limit to the infinite polymer limit, using this approach. By comparing our results with other localization-based methods and, where available, with experiment, we demonstrate that Wannier-localization-based optimal tuning is highly accurate in predicting ionization potentials for any chain length, including the nano-scale regime.

    Conduction delays across the specialized conduction system of the heart: Revisiting atrioventricular node (AVN) and Purkinje-ventricular junction (PVJ) delays

    Background and significance: The specialized conduction system (SCS) of the heart was extensively studied to understand the synchronization of atrial and ventricular contractions, the large atrial to His bundle (A-H) delay through the atrioventricular node (AVN), and delays between Purkinje (P) and ventricular (V) depolarization at distinct junctions (J), PVJs. Here, we use optical mapping of perfused rabbit hearts to revisit the mechanism that explains A-H delay and the role of a passive electrotonic step-delay at the boundary between atria and the AVN. We further visualize how the P anatomy controls papillary activation and valve closure before ventricular activation. Methods: Rabbit hearts were perfused with a bolus (100–200 µl) of a voltage-sensitive dye (di4ANEPPS) and blebbistatin (10–20 µM for 20 min), then the right atrial appendage and ventricular free-wall were cut to expose the AVN, P fibers (PFs), the septum, papillary muscles, and the endocardium. Fluorescence images were focused on a CMOS camera (SciMedia) and captured at 1K–5K frames/s from 100 × 100 pixels. Results: AP propagation across the AVN-His (A-H) exhibits distinct patterns of delay and conduction blocks during S1–S2 stimulation. Refractory periods were 81 ± 9, 90 ± 21, and 185 ± 15 ms for Atrial, AVN, and His, respectively. A large delay (>40 ms) occurs between atrial and AVN activation that increased during rapid atrial pacing, contributing to the development of Wenckebach periodicity, followed by delays within the AVN through slow or blocked conduction. The temporal resolution of the camera allowed us to identify PVJs by detecting doublets of AP upstrokes. PVJ delays were heterogeneous, fastest in PVJs that immediately trigger ventricular APs (3.4 ± 0.8 ms) and slow in regions where PFs appear insulated from the neighboring ventricular myocytes (7.8 ± 2.4 ms). Insulated PFs along papillary muscles conducted APs (>2 m/s), then triggered papillary muscle APs (<1 m/s), followed by AP firing of the septum and endocardium. The anatomy of PFs and PVJs produced activation patterns that control the sequence of contractions, ensuring that papillary contractions close the tricuspid valve 2–5 ms before right ventricular contractions. Conclusions: The specialized conduction system can be accessed optically to investigate the electrical properties of the AVN, PVJ and activation patterns in physiological and pathological conditions.

    Band gaps of crystalline solids from Wannier-localization based optimal tuning of a screened range-separated hybrid functional

    Accurate prediction of fundamental band gaps of crystalline solid state systems entirely within density functional theory is a long-standing challenge. Here, we present a simple and inexpensive method that achieves this by means of non-empirical optimal tuning of the parameters of a screened range-separated hybrid functional. The tuning involves the enforcement of an ansatz that generalizes the ionization potential theorem to the removal of an electron in an occupied state described by a localized Wannier function in a modestly sized supercell calculation. The method is benchmarked against experiment for a set of systems ranging from narrow band gap semiconductors to large band gap insulators, spanning a range of fundamental band gaps from 0.2 to 14.2 eV, and is found to yield quantitative accuracy across the board, with a mean absolute error of $\sim$0.1 eV and a maximal error of $\sim$0.2 eV. Comment: 10 pages, 2 figures

    Optical absorption spectra of metal oxides from time-dependent density functional theory and many-body perturbation theory based on optimally-tuned hybrid functionals

    Using both time-dependent density functional theory (TDDFT) and the "single-shot" $GW$ plus Bethe-Salpeter equation ($GW$-BSE) approach, we compute optical band gaps and optical absorption spectra from first principles for eight common binary and ternary closed-shell metal oxides (MgO, Al$_2$O$_3$, CaO, TiO$_2$, Cu$_2$O, ZnO, BaSnO$_3$, and BiVO$_4$), based on the non-empirical Wannier-localized optimally-tuned screened range-separated hybrid functional. Overall, we find excellent agreement between our TDDFT and $GW$-BSE results and experiment, with a mean absolute error less than 0.4 eV, including for Cu$_2$O and ZnO, traditionally considered to be challenging for both methods.

    Verifying atomicity via data independence

    We present a technique for automatically verifying atomicity of composed concurrent operations. The main observation behind our approach is that many composed concurrent operations which occur in practice are data-independent. That is, the control-flow of the composed operation does not depend on specific input values. While verifying data-independence is undecidable in the general case, we provide succinct sufficient conditions that can be used to establish a composed operation as data-independent. We show that for the common case of concurrent maps, data-independence reduces the hard problem of verifying linearizability to a verification problem that can be solved efficiently with a bounded number of keys and values. We implemented our approach in a tool called VINE and evaluated it on all composed operations from 57 real-world applications (112 composed operations). We show that many composed operations (49 out of 112) are data-independent, and automatically verify 30 of them as linearizable and the remaining 19 as having violations of linearizability that could be repaired and then subsequently automatically verified. Moreover, we show that the remaining 63 operations are not linearizable, thus indicating that data independence does not limit the expressiveness of writing realistic linearizable composed operations.