
    On Penalty-based Bilevel Gradient Descent Method

    Bilevel optimization enjoys a wide range of applications in hyper-parameter optimization, meta-learning and reinforcement learning. However, bilevel optimization problems are difficult to solve. Recent progress on scalable bilevel algorithms mainly focuses on bilevel optimization problems where the lower-level objective is either strongly convex or unconstrained. In this work, we tackle the bilevel problem through the lens of the penalty method. We show that under certain conditions, the penalty reformulation recovers the solutions of the original bilevel problem. Further, we propose the penalty-based bilevel gradient descent (PBGD) algorithm and establish its finite-time convergence for the constrained bilevel problem without lower-level strong convexity. Experiments showcase the efficiency of the proposed PBGD algorithm.
    Comment: Improved Section 4 by removing a critical assumption; Added Section 5 and citation
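    The penalty idea in this abstract can be illustrated on a toy problem. The sketch below is not the paper's PBGD algorithm; it is a minimal value-function penalty scheme under assumed choices: a hypothetical upper-level objective f(x, y) = (x - 1)^2 + (y + 1)^2, a hypothetical lower-level objective g(x, y) = (y - x)^2, and illustrative values for the penalty weight and step sizes. It runs plain gradient descent on f + gamma * (g - min_y g), approximating the lower-level minimizer with a few inner gradient steps.

```python
# Toy value-function penalty scheme for a bilevel problem (illustrative only).
# Assumed upper level: f(x, y) = (x - 1)^2 + (y + 1)^2
# Assumed lower level: g(x, y) = (y - x)^2, whose minimizer is y*(x) = x
# The exact bilevel solution of this toy problem is x = 0, y = 0.


def upper_f_grad(x, y):
    """Gradient of the assumed upper-level objective with respect to (x, y)."""
    return 2.0 * (x - 1.0), 2.0 * (y + 1.0)


def lower_g_grad(x, y):
    """Gradient of the assumed lower-level objective with respect to (x, y)."""
    return -2.0 * (y - x), 2.0 * (y - x)


def approx_lower_solution(x, y_init, steps=20, lr=0.25):
    """Approximate argmin_y g(x, y) with a few inner gradient steps."""
    y = y_init
    for _ in range(steps):
        _, gy = lower_g_grad(x, y)
        y -= lr * gy
    return y


def penalty_bilevel_gd(gamma=50.0, lr=0.004, iters=2000, x0=2.0, y0=-2.0):
    """Gradient descent on f(x, y) + gamma * (g(x, y) - min_y' g(x, y'))."""
    x, y = x0, y0
    for _ in range(iters):
        y_hat = approx_lower_solution(x, y)
        fx, fy = upper_f_grad(x, y)
        gx, gy = lower_g_grad(x, y)
        # Gradient of the value function min_y' g(x, y') with respect to x,
        # evaluated at the approximate lower-level minimizer y_hat.
        gx_hat, _ = lower_g_grad(x, y_hat)
        grad_x = fx + gamma * (gx - gx_hat)
        grad_y = fy + gamma * gy
        x -= lr * grad_x
        y -= lr * grad_y
    return x, y


if __name__ == "__main__":
    x, y = penalty_bilevel_gd()
    print(f"x = {x:.4f}, y = {y:.4f}")
```

    For this toy problem the penalized minimizer works out to x = 1/(1 + 2*gamma) and y = -x, so the iterates approach the bilevel solution (0, 0) only as gamma grows; a finite penalty weight leaves a small gap.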

    Latent Partition Implicit with Surface Codes for 3D Representation

    Deep implicit functions have shown remarkable shape modeling ability in various 3D computer vision tasks. One drawback is that it is hard for them to represent a 3D shape as multiple parts. Current solutions learn various primitives and blend the primitives directly in the spatial space, but they still struggle to approximate the 3D shape accurately. To resolve this problem, we introduce a novel implicit representation that represents a single 3D shape as a set of parts in the latent space, aiming at both highly accurate and plausibly interpretable shape modeling. Our insight here is that both the part learning and the part blending can be conducted much more easily in the latent space than in the spatial space. We name our method Latent Partition Implicit (LPI) because of its ability to cast the global shape modeling into multiple local part modeling tasks, which partitions the global shape unity. LPI represents a shape as Signed Distance Functions (SDFs) using surface codes. Each surface code is a latent code representing a part whose center is on the surface, which enables us to flexibly employ intrinsic attributes of shapes or additional surface properties. Eventually, LPI can reconstruct both the shape and the parts on the shape, both of which are plausible meshes. LPI is a multi-level representation, which can partition a shape into different numbers of parts after training. LPI can be learned without ground truth signed distances, point normals or any supervision for part partition. LPI outperforms the latest methods on widely used benchmarks in terms of reconstruction accuracy and modeling interpretability. Our code, data and models are available at https://github.com/chenchao15/LPI.
    Comment: 20 pages, 14 figures. Accepted by ECCV 202

    Thermal tunability in terahertz metamaterials fabricated on strontium titanate single crystal substrates

    We report an experimental demonstration of thermal tuning of resonance frequency in a planar terahertz metamaterial consisting of a gold split-ring resonator array fabricated on a bulk single crystal strontium titanate (SrTiO3) substrate. Cooling the metamaterial from 409 K down to 150 K causes a shift of about 50% in resonance frequency compared with its room temperature resonance, with very little variation in resonance strength. The resonance shift is due to the temperature-dependent refractive index (or the dielectric constant) of the strontium titanate. The experiment opens up avenues for designing tunable terahertz devices by exploiting the temperature-sensitive characteristics of high dielectric constant substrates and complex metal oxide materials.
    Comment: 6 pages, 3 figures, accepted at Optics Letter

    On Fast-Converged Deep Reinforcement Learning for Optimal Dispatch of Large-Scale Power Systems under Transient Security Constraints

    Power system optimal dispatch with transient security constraints is commonly represented as Transient Security-Constrained Optimal Power Flow (TSC-OPF). Deep Reinforcement Learning (DRL)-based TSC-OPF trains efficient decision-making agents that adapt to various scenarios and return solutions quickly. However, due to the high dimensionality of the state and action spaces, as well as the non-smoothness of dynamic constraints, existing DRL-based TSC-OPF solution methods face a significant sparse-reward problem. To address this issue, a fast-converged DRL method for TSC-OPF is proposed in this paper. The Markov Decision Process (MDP) modeling of TSC-OPF is improved by reducing the observation space and smoothing the reward design, thus facilitating agent training. An improved Deep Deterministic Policy Gradient algorithm with Curriculum learning, Parallel exploration, and Ensemble decision-making (DDPG-CPEn) is introduced to drastically enhance the efficiency of agent training and the accuracy of decision-making. The effectiveness, efficiency, and accuracy of the proposed method are demonstrated through experiments in the IEEE 39-bus system and a practical 710-bus regional power grid. The source code of the proposed method is made public on GitHub.
    Comment: 10 pages, 11 figure