
    Convergence Theory of Learning Over-parameterized ResNet: A Full Characterization

    The ResNet structure has achieved great empirical success since its debut. Recent work established the convergence of learning over-parameterized ResNet with a scaling factor $\tau=1/L$ on the residual branch, where $L$ is the network depth. However, it is not clear how learning ResNet behaves for other values of $\tau$. In this paper, we fully characterize the convergence theory of gradient descent for learning over-parameterized ResNet with different values of $\tau$. Specifically, hiding logarithmic factors and constant coefficients, we show that for $\tau\le 1/\sqrt{L}$ gradient descent is guaranteed to converge to the global minima, and in particular when $\tau\le 1/L$ the convergence is independent of the network depth. Conversely, we show that for $\tau>L^{-\frac{1}{2}+c}$, the forward output grows at least at rate $L^c$ in expectation, and learning then fails because of gradient explosion for large $L$. This means the bound $\tau\le 1/\sqrt{L}$ is sharp for learning ResNet with arbitrary depth. To the best of our knowledge, this is the first work that studies learning ResNet over the full range of $\tau$. Comment: 31 pages
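The role of the residual scaling $\tau$ in the abstract above can be illustrated with a toy calculation. The sketch below is not the paper's network: it assumes a purely linear residual block $h_{l+1} = h_l + \tau W_l h_l$ with i.i.d. Gaussian weights of variance $1/\text{width}$, for which the expected squared norm grows by exactly a factor $(1+\tau^2)$ per layer. The function names `expected_sq_norm_ratio` and `simulate_sq_norm_ratio` are hypothetical and introduced only for this illustration.

```python
import math
import random


def expected_sq_norm_ratio(tau: float, depth: int) -> float:
    """E[||h_L||^2] / ||h_0||^2 for the assumed linear toy model.

    Each block multiplies the expected squared norm by (1 + tau^2),
    because the random branch is zero-mean and independent of h.
    """
    return (1.0 + tau ** 2) ** depth


def simulate_sq_norm_ratio(tau: float, depth: int, width: int = 32, seed: int = 0) -> float:
    """One Monte Carlo draw of ||h_L||^2 / ||h_0||^2.

    Assumed toy model (a simplification, not the paper's architecture):
    h_{l+1} = h_l + tau * W_l h_l, with W_l entries i.i.d. N(0, 1/width).
    """
    rng = random.Random(seed)
    h = [rng.gauss(0.0, 1.0) for _ in range(width)]
    start = sum(x * x for x in h)
    std = 1.0 / math.sqrt(width)
    for _ in range(depth):
        # Fresh Gaussian weights for every residual block.
        branch = [sum(rng.gauss(0.0, std) * x for x in h) for _ in range(width)]
        h = [x + tau * b for x, b in zip(h, branch)]
    return sum(x * x for x in h) / start


if __name__ == "__main__":
    depth = 100
    scalings = [
        ("1/L", 1.0 / depth),
        ("1/sqrt(L)", depth ** -0.5),
        ("L^(-1/4)", depth ** -0.25),
    ]
    for label, tau in scalings:
        print(label, expected_sq_norm_ratio(tau, depth), simulate_sq_norm_ratio(tau, depth))
```

In this simplified model the ratio stays close to 1 for $\tau = 1/L$, stays below $e$ for $\tau = 1/\sqrt{L}$, and grows roughly like $\exp(L^{2c})$ for $\tau = L^{-1/2+c}$. That is consistent in spirit with the threshold described in the abstract, but it does not reproduce the paper's rates or its analysis of gradient descent.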

    Assessment of density functional methods with correct asymptotic behavior

    Long-range corrected (LC) hybrid functionals and asymptotically corrected (AC) model potentials are two distinct density functional methods with correct asymptotic behavior. They are known to be accurate for properties that are sensitive to the asymptote of the exchange-correlation potential, such as the highest occupied molecular orbital energies and Rydberg excitation energies of molecules. To provide a comprehensive comparison, we investigate the performance of the two schemes and others on a very wide range of applications, including the asymptote problems, self-interaction-error problems, energy-gap problems, charge-transfer problems, and many others. The LC hybrid scheme is shown to consistently outperform the AC model potential scheme. In addition, to be consistent with the molecules collected in the IP131 database [Y.-S. Lin, C.-W. Tsai, G.-D. Li, and J.-D. Chai, J. Chem. Phys., 2012, 136, 154109], we expand the EA115 and FG115 databases to include, respectively, the vertical electron affinities and fundamental gaps of the additional 16 molecules, and develop a new database AE113 (113 atomization energies), consisting of accurate reference values for the atomization energies of the 113 molecules in IP131. These databases will be useful for assessing the accuracy of density functional methods. Comment: accepted for publication in Phys. Chem. Chem. Phys., 46 pages, 4 figures, supplementary material included

    Controllable coupling between a nanomechanical resonator and a coplanar-waveguide resonator via a superconducting flux qubit

    We study a tripartite quantum system consisting of a coplanar-waveguide (CPW) resonator and a nanomechanical resonator (NAMR) connected by a flux qubit, where the flux qubit has a large detuning from both resonators. By a unitary transformation and a second-order approximation, we obtain a strong and controllable (i.e., magnetic-field-dependent) effective coupling between the NAMR and the CPW resonator. Due to the strong coupling, vacuum Rabi splitting can be observed from the voltage-fluctuation spectrum of the CPW resonator. We further study the properties of single-photon transport as inferred from the reflectance or, equivalently, the transmittance. We show that the reflectance and the corresponding phase-shift spectra both exhibit a doublet of narrow spectral features due to vacuum Rabi splitting. By tuning the external magnetic field, the reflectance and the phase shift can be varied from 0 to 1 and from $-\pi$ to $\pi$, respectively. The results indicate that this hybrid quantum system can act as a quantum router. Comment: 8 pages, 6 figures

    A supra-massive magnetar central engine for short GRB 130603B

    We show that the peculiar early optical and, in particular, X-ray afterglow emission of the short-duration burst GRB 130603B can be explained by continuous energy injection into the blastwave from a supra-massive magnetar central engine. The observed energetics and temporal/spectral properties of the late infrared bump (i.e., the "kilonova") are also found to be consistent with emission from the ejecta launched during an NS-NS merger and powered by a magnetar central engine. The isotropic-equivalent kinetic energies of both the GRB blastwave and the kilonova are about $E_{\rm k}\sim 10^{51}$ erg, consistent with being powered by a near-isotropic magnetar wind. However, this relatively small value demands that most of the initial rotational energy of the magnetar ($\sim$ a few $\times 10^{52}$ erg) is carried away by gravitational wave radiation. Our results suggest that (i) the progenitor of GRB 130603B would be an NS-NS binary system, whose merger product would be a supra-massive neutron star that lasted for $\sim 1000$ seconds; (ii) the equation of state of nuclear matter would be stiff enough to allow survival of a long-lived supra-massive neutron star, so that it is promising to detect bright electromagnetic counterparts of gravitational wave triggers without short GRB associations in the upcoming Advanced LIGO/Virgo era. Comment: Five pages including 1 Figure, to appear in ApJ