57 research outputs found
An Algorithm with Optimal Dimension-Dependence for Zero-Order Nonsmooth Nonconvex Stochastic Optimization
We study the complexity of producing $(\delta,\epsilon)$-stationary points of
Lipschitz objectives which are possibly neither smooth nor convex, using only
noisy function evaluations. Recent works proposed several stochastic zero-order
algorithms that solve this task, all of which suffer from a
dimension-dependence of $\Omega(d^{3/2})$ where $d$ is the dimension of the
problem, which was conjectured to be optimal. We refute this conjecture by
providing a faster algorithm that has complexity
$O(d\delta^{-1}\epsilon^{-3})$, which is optimal (up to numerical constants)
with respect to $d$ and also optimal with respect to the accuracy parameters
$\delta,\epsilon$, thus solving an open question due to Lin et al.
(NeurIPS'22). Moreover, the convergence rate achieved by our algorithm is also
optimal for smooth objectives, proving that in the nonconvex stochastic
zero-order setting, nonsmooth optimization is as easy as smooth optimization.
We provide algorithms that achieve the aforementioned convergence rate in
expectation as well as with high probability. Our analysis is based on a simple
yet powerful geometric lemma regarding the Goldstein-subdifferential set, which
allows utilizing recent advancements in first-order nonsmooth nonconvex
optimization. Comment: 12 pages
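For readers unfamiliar with the zero-order setting described in the abstract above, the following is a minimal sketch of a standard two-point gradient estimator built from function evaluations only. It is a textbook construction based on uniform randomized smoothing, not the algorithm of the paper; the function name `zero_order_gradient` and the test objective are illustrative assumptions.

```python
import numpy as np

def zero_order_gradient(f, x, delta, rng):
    """Two-point estimate of the gradient of the delta-smoothed surrogate of f at x.

    The surrogate is f_delta(x) = E[f(x + delta * v)], with v uniform in the unit ball.
    Only two evaluations of f are used; no gradients are required.
    (Hypothetical helper for illustration; not taken from the paper.)
    """
    d = x.shape[0]
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)  # uniformly random direction on the unit sphere
    return (d / (2.0 * delta)) * (f(x + delta * u) - f(x - delta * u)) * u

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    f = lambda z: np.abs(z).sum()      # Lipschitz, nonsmooth objective
    x = np.array([1.0, -2.0, 3.0])
    # Averaging many estimates approximates the gradient of the smoothed surrogate.
    est = np.mean([zero_order_gradient(f, x, 0.1, rng) for _ in range(20000)], axis=0)
    print(est)
```

The factor `d` in the estimator is where dimension enters the variance of such schemes, which is one reason the dimension-dependence of zero-order methods is a natural question.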
From Tempered to Benign Overfitting in ReLU Neural Networks
Overparameterized neural networks (NNs) are observed to generalize well even
when trained to perfectly fit noisy data. This phenomenon motivated a large
body of work on "benign overfitting", where interpolating predictors achieve
near-optimal performance. Recently, it was conjectured and empirically observed
that the behavior of NNs is often better described as "tempered overfitting",
where the performance is non-optimal yet also non-trivial, and degrades as a
function of the noise level. However, a theoretical justification of this claim
for non-linear NNs has been lacking so far. In this work, we provide several
results that aim at bridging these complementing views. We study a simple
classification setting with 2-layer ReLU NNs, and prove that under various
assumptions, the type of overfitting transitions from tempered in the extreme
case of one-dimensional data, to benign in high dimensions. Thus, we show that
the input dimension has a crucial role on the type of overfitting in this
setting, which we also validate empirically for intermediate dimensions.
Overall, our results shed light on the intricate connections between the
dimension, sample size, architecture and training algorithm on the one hand,
and the type of resulting overfitting on the other hand. Comment: 43 pages
Relationship obsessive-compulsive disorder: interference, symptoms, and maladaptive beliefs
BACKGROUND: Obsessive preoccupation, doubts, and compulsive behaviors focusing on one's romantic relationship and partner are receiving increasing clinical, theoretical, and empirical attention. Commonly referred to as relationship obsessive-compulsive disorder (ROCD), such symptoms have been linked with decreased relational and sexual functioning and lower mood, even after controlling for other obsessive-compulsive disorder (OCD) symptoms. To date, however, these symptoms have been studied in community samples alone. In the present study, we compared levels of interference, OCD, and mood symptoms between clinical participants with ROCD, OCD, and community controls. We also examined group differences in maladaptive beliefs previously linked with OCD and ROCD. METHOD: Participants included 22 ROCD clients, 22 OCD clients, and 28 community controls. The Mini International Neuropsychiatric Interview was used to attain clinical diagnoses of OCD and ROCD. The Yale-Brown Obsessive-Compulsive Scale was used to evaluate primary-symptom severity. All participants completed measures of symptoms and dysfunctional beliefs. RESULTS: ROCD clients reported more severe ROCD symptoms than the OCD and control groups. ROCD and OCD clients did not differ in severity of their primary symptoms. ROCD clients scored higher than the other groups on maladaptive OCD-related and relationship-related beliefs. Finally, ROCD clients showed more severe depression symptoms than community controls. CONCLUSION: ROCD is a disabling presentation of OCD that warrants research attention. Maladaptive OCD-related and relationship-related beliefs may be implicated in the development and maintenance of ROCD.
Nonempirical Prediction of the Length-Dependent Ionization Potential in Molecular Chains.
The ionization potential of molecular chains is well-known to be a tunable nanoscale property that exhibits clear quantum confinement effects. State-of-the-art methods can accurately predict the ionization potential in the small molecule limit and in the solid-state limit, but for intermediate, nanosized systems prediction of the evolution of the electronic structure between the two limits is more difficult. Recently, optimal tuning of range-separated hybrid functionals has emerged as a highly accurate method for predicting ionization potentials. This was first achieved for molecules using the ionization potential theorem (IPT) and more recently extended to solid-state systems, based on an ansatz that generalizes the IPT to the removal of charge from a localized Wannier function. Here, we study one-dimensional molecular chains of increasing size, from the monomer limit to the infinite polymer limit using this approach. By comparing our results with other localization-based methods and where available with experiment, we demonstrate that Wannier-localization-based optimal tuning is highly accurate in predicting ionization potentials for any chain length, including the nanoscale regime.
Deterministic Nonsmooth Nonconvex Optimization
We study the complexity of optimizing nonsmooth nonconvex Lipschitz functions
by producing $(\delta,\epsilon)$-stationary points. Several recent works have
presented randomized algorithms that produce such points using $\tilde{O}(\delta^{-1}\epsilon^{-3})$ first-order oracle calls, independent of the
dimension $d$. It has been an open problem as to whether a similar result can
be obtained via a deterministic algorithm. We resolve this open problem,
showing that randomization is necessary to obtain a dimension-free rate. In
particular, we prove a lower bound of $\Omega(d)$ for any deterministic
algorithm. Moreover, we show that unlike smooth or convex optimization, access
to function values is required for any deterministic algorithm to halt within
any finite time.
On the other hand, we prove that if the function is even slightly smooth,
then the dimension-free rate of $\tilde{O}(\delta^{-1}\epsilon^{-3})$ can be
obtained by a deterministic algorithm with merely a logarithmic dependence on
the smoothness parameter. Motivated by these findings, we turn to study the
complexity of deterministically smoothing Lipschitz functions. Though there are
efficient black-box randomized smoothings, we start by showing that no such
deterministic procedure can smooth functions in a meaningful manner, resolving
an open question. We then bypass this impossibility result for the structured
case of ReLU neural networks. To that end, in a practical white-box setting in
which the optimizer is granted access to the network's architecture, we propose
a simple, dimension-free, deterministic smoothing that provably preserves
$(\delta,\epsilon)$-stationary points. Our method applies to a variety of
architectures of arbitrary depth, including ResNets and ConvNets. Combined with
our algorithm, this yields the first deterministic dimension-free algorithm for
optimizing ReLU networks, circumventing our lower bound. Comment: This work supersedes arXiv:2209.12463 and arXiv:2209.10346 [Section
3], with major additional results
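The stationarity notion used in the two optimization abstracts above is not spelled out in the listing. As a reading aid, the standard definitions from this line of work can be written as follows; the notation ($\partial_\delta f$ for the Goldstein $\delta$-subdifferential, $\partial f$ for the Clarke subdifferential) is an assumption about convention rather than a quotation from either paper.

```latex
% Goldstein delta-subdifferential of a Lipschitz function f at x:
\[
  \partial_\delta f(x) \;=\; \operatorname{conv}\Bigl(\,\bigcup_{y \,:\, \|y - x\| \le \delta} \partial f(y)\Bigr),
\]
% where \partial f(y) denotes the Clarke subdifferential of f at y.
% A point x is called (\delta,\epsilon)-stationary if
\[
  \min_{g \in \partial_\delta f(x)} \|g\| \;\le\; \epsilon .
\]
```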
Non-empirical prediction of the length-dependent ionization potential in molecular chains
The ionization potential of molecular chains is well-known to be a tunable
nano-scale property that exhibits clear quantum confinement effects.
State-of-the-art methods can accurately predict the ionization potential in the
small molecule limit and in the solid-state limit, but for intermediate,
nano-sized systems prediction of the evolution of the electronic structure
between the two limits is more difficult. Recently, optimal tuning of
range-separated hybrid functionals has emerged as a highly accurate method for
predicting ionization potentials. This was first achieved for molecules using
the ionization potential theorem (IPT) and more recently extended to
solid-state systems, based on an ansatz that generalizes the IPT to
the removal of charge from a localized Wannier function. Here, we study
one-dimensional molecular chains of increasing size, from the monomer limit to
the infinite polymer limit using this approach. By comparing our results with
other localization-based methods and where available with experiment, we
demonstrate that Wannier-localization-based optimal tuning is highly accurate
in predicting ionization potentials for any chain length, including the
nano-scale regime.
Conduction delays across the specialized conduction system of the heart: Revisiting atrioventricular node (AVN) and Purkinje-ventricular junction (PVJ) delays
Background and significance: The specialized conduction system (SCS) of the heart was extensively studied to understand the synchronization of atrial and ventricular contractions, the large atrial to His bundle (A-H) delay through the atrioventricular node (AVN), and delays between Purkinje (P) and ventricular (V) depolarization at distinct junctions (J), PVJs. Here, we use optical mapping of perfused rabbit hearts to revisit the mechanism that explains A-H delay and the role of a passive electrotonic step-delay at the boundary between atria and the AVN. We further visualize how the P anatomy controls papillary activation and valve closure before ventricular activation. Methods: Rabbit hearts were perfused with a bolus (100–200 µl) of a voltage-sensitive dye (di4ANEPPS) and blebbistatin (10–20 µM for 20 min); then the right atrial appendage and ventricular free-wall were cut to expose the AVN, P fibers (PFs), the septum, papillary muscles, and the endocardium. Fluorescence images were focused on a CMOS camera (SciMedia) and captured at 1K–5K frames/s from 100 × 100 pixels. Results: AP propagation across the AVN-His (A-H) exhibits distinct patterns of delay and conduction blocks during S1–S2 stimulation. Refractory periods were 81 ± 9, 90 ± 21, 185 ± 15 ms for Atrial, AVN, and His, respectively. A large delay (>40 ms) occurs between atrial and AVN activation that increased during rapid atrial pacing contributing to the development of Wenckebach periodicity followed by delays within the AVN through slow or blocked conduction. The temporal resolution of the camera allowed us to identify PVJs by detecting doublets of AP upstrokes. PVJ delays were heterogeneous, fastest in PVJs that immediately trigger ventricular APs (3.4 ± 0.8 ms) and slow in regions where PFs appear insulated from the neighboring ventricular myocytes (7.8 ± 2.4 ms). Insulated PFs along papillary muscles conducted APs (>2 m/s), then triggered papillary muscle APs (<1 m/s), followed by AP firing in the septum and endocardium. The anatomy of PFs and PVJs produced activation patterns that control the sequence of contractions ensuring that papillary contractions close the tricuspid valve 2–5 ms before right ventricular contractions. Conclusions: The specialized conduction system can be accessed optically to investigate the electrical properties of the AVN, PVJ and activation patterns in physiological and pathological conditions.
Band gaps of crystalline solids from Wannier-localization based optimal tuning of a screened range-separated hybrid functional
Accurate prediction of fundamental band gaps of crystalline solid state
systems entirely within density functional theory is a long standing challenge.
Here, we present a simple and inexpensive method that achieves this by means of
non-empirical optimal tuning of the parameters of a screened range-separated
hybrid functional. The tuning involves the enforcement of an ansatz that
generalizes the ionization potential theorem to the removal of an electron in
an occupied state described by a localized Wannier function in a modestly sized
supercell calculation. The method is benchmarked against experiment for a set
of systems ranging from narrow band gap semiconductors to large band gap
insulators, spanning a range of fundamental band gaps from 0.2 to 14.2 eV and
is found to yield quantitative accuracy across the board, with a mean absolute
error of 0.1 eV and a maximal error of 0.2 eV. Comment: 10 pages, 2 figures
Optical absorption spectra of metal oxides from time-dependent density functional theory and many-body perturbation theory based on optimally-tuned hybrid functionals
Using both time-dependent density functional theory (TDDFT) and the
"single-shot" $G_0W_0$ plus Bethe-Salpeter equation ($G_0W_0$-BSE) approach, we
compute optical band gaps and optical absorption spectra from first principles
for eight common binary and ternary closed-shell metal oxides (MgO,
Al$_2$O$_3$, CaO, TiO$_2$, Cu$_2$O, ZnO, BaSnO$_3$, and BiVO$_4$), based on the
non-empirical Wannier-localized optimally-tuned screened range-separated hybrid
functional. Overall, we find excellent agreement between our TDDFT and $G_0W_0$-BSE
results and experiment, with a mean absolute error less than 0.4 eV, including
for Cu$_2$O and ZnO, traditionally considered to be challenging for both
methods.
Verifying atomicity via data independence
We present a technique for automatically verifying atomicity of composed concurrent operations. The main observation behind our approach is that many composed concurrent operations which occur in practice are data-independent. That is, the control-flow of the composed operation does not depend on specific input values. While verifying data-independence is undecidable in the general case, we provide succinct sufficient conditions that can be used to establish a composed operation as data-independent. We show that for the common case of concurrent maps, data-independence reduces the hard problem of verifying linearizability to a verification problem that can be solved efficiently with a bounded number of keys and values. We implemented our approach in a tool called VINE and evaluated it on all composed operations from 57 real-world applications (112 composed operations). We show that many composed operations (49 out of 112) are data-independent, and automatically verify 30 of them as linearizable and the remaining 19 as having violations of linearizability that could be repaired and then subsequently automatically verified. Moreover, we show that the remaining 63 operations are not linearizable, thus indicating that data independence does not limit the expressiveness of writing realistic linearizable composed operations.
- …