On Penalty-based Bilevel Gradient Descent Method
Bilevel optimization enjoys a wide range of applications in hyper-parameter
optimization, meta-learning and reinforcement learning. However, bilevel
optimization problems are difficult to solve. Recent progress on scalable
bilevel algorithms mainly focuses on bilevel optimization problems where the
lower-level objective is either strongly convex or unconstrained. In this work,
we tackle the bilevel problem through the lens of the penalty method. We show
that under certain conditions, the penalty reformulation recovers the solutions
of the original bilevel problem. Further, we propose the penalty-based bilevel
gradient descent (PBGD) algorithm and establish its finite-time convergence for
the constrained bilevel problem without lower-level strong convexity.
Experiments showcase the efficiency of the proposed PBGD algorithm.
Comment: Improved Section 4 by removing a critical assumption; Added Section 5
and citation
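As a rough illustration of the penalty idea described in this abstract, the sketch below runs gradient descent on a penalized objective f(x, y) + gamma * (g(x, y) - min_y' g(x, y')) for a toy problem. It is not the authors' PBGD implementation: the toy objectives, the function names (grad_f, grad_g, inner_argmin, pbgd), the penalty weight, and the step sizes are all assumptions chosen for illustration.

```python
# Toy bilevel problem (an assumption, not taken from the paper):
#   upper level: f(x, y) = 0.5*(x - 1)^2 + 0.5*(y - 2)^2
#   lower level: g(x, y) = 0.5*(y - x)^2, so argmin_y g(x, y) = x
# The bilevel solution is x = y = 1.5.

def grad_f(x, y):
    """Gradient of the upper-level objective, as (df/dx, df/dy)."""
    return x - 1.0, y - 2.0

def grad_g(x, y):
    """Gradient of the lower-level objective, as (dg/dx, dg/dy)."""
    return x - y, y - x

def inner_argmin(x, y0, steps=50, lr=0.5):
    """Approximate argmin_y g(x, y) with a few gradient steps from y0."""
    y = y0
    for _ in range(steps):
        y -= lr * grad_g(x, y)[1]
    return y

def pbgd(gamma=50.0, lr=0.01, iters=2000):
    """Gradient descent on f + gamma * (g - v), where v(x) = min_y g(x, y)."""
    x, y = 0.0, 0.0
    for _ in range(iters):
        y_star = inner_argmin(x, y)
        fx, fy = grad_f(x, y)
        gx, gy = grad_g(x, y)
        # Gradient of v at x, evaluated at the approximate lower-level minimizer.
        vx = grad_g(x, y_star)[0]
        x, y = (x - lr * (fx + gamma * (gx - vx)),
                y - lr * (fy + gamma * gy))
    return x, y

if __name__ == "__main__":
    print(pbgd())
```

With a finite penalty weight the iterates settle near, not exactly at, the bilevel solution; in this toy case the gap between y and x shrinks as gamma grows, which is the trade-off a penalty reformulation has to manage.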
Latent Partition Implicit with Surface Codes for 3D Representation
Deep implicit functions have shown remarkable shape modeling ability in
various 3D computer vision tasks. One drawback is that it is hard for them to
represent a 3D shape as multiple parts. Current solutions learn various
primitives and blend them directly in the spatial space, but they still
struggle to approximate the 3D shape accurately. To resolve this problem, we
introduce a novel implicit representation to represent a single 3D shape as a
set of parts in the latent space, towards both highly accurate and plausibly
interpretable shape modeling. Our insight here is that both the part learning
and the part blending can be conducted much more easily in the latent space
than in the spatial space. We name our method Latent Partition Implicit (LPI)
because it casts global shape modeling into the modeling of multiple local
parts, thereby partitioning the global shape. LPI represents a shape as
Signed Distance Functions (SDFs) using surface codes. Each surface code is a
latent code representing a part whose center is on the surface, which enables
us to flexibly employ intrinsic attributes of shapes or additional surface
properties. Eventually, LPI can reconstruct both the shape and the parts on the
shape, both of which are plausible meshes. LPI is a multi-level representation,
which can partition a shape into different numbers of parts after training. LPI
can be learned without ground truth signed distances, point normals or any
supervision for part partition. LPI outperforms the latest methods under the
widely used benchmarks in terms of reconstruction accuracy and modeling
interpretability. Our code, data and models are available at
https://github.com/chenchao15/LPI.
Comment: 20 pages, 14 figures. Accepted by ECCV 202
Thermal tunability in terahertz metamaterials fabricated on strontium titanate single crystal substrates
We report an experimental demonstration of thermal tuning of resonance
frequency in a planar terahertz metamaterial consisting of a gold split-ring
resonator array fabricated on a bulk single crystal strontium titanate (SrTiO3)
substrate. Cooling the metamaterial from 409 K down to 150 K causes
about a 50% shift in resonance frequency compared to its room-temperature
resonance, with very little variation in resonance strength. The
resonance shift is due to the temperature-dependent refractive index (or the
dielectric constant) of the strontium titanate. The experiment opens up avenues
for designing tunable terahertz devices by exploiting the temperature sensitive
characteristic of high dielectric constant substrates and complex metal oxide
materials.
Comment: 6 pages, 3 figures, accepted at Optics Letter
On Fast-Converged Deep Reinforcement Learning for Optimal Dispatch of Large-Scale Power Systems under Transient Security Constraints
Power system optimal dispatch with transient security constraints is commonly
represented as Transient Security-Constrained Optimal Power Flow (TSC-OPF).
Deep Reinforcement Learning (DRL)-based TSC-OPF trains efficient
decision-making agents that are adaptable to various scenarios and provide
solution results quickly. However, due to the high dimensionality of the state
and action spaces, as well as the non-smoothness of dynamic constraints,
existing DRL-based TSC-OPF solution methods suffer from the sparse reward
problem. To address this issue, a fast-converged DRL method for
TSC-OPF is proposed in this paper. The Markov Decision Process (MDP) modeling
of TSC-OPF is improved by reducing the observation space and smoothing the
reward design, thus facilitating agent training. An improved Deep Deterministic
Policy Gradient algorithm with Curriculum learning, Parallel exploration, and
Ensemble decision-making (DDPG-CPEn) is introduced to drastically enhance the
efficiency of agent training and the accuracy of decision-making. The
effectiveness, efficiency, and accuracy of the proposed method are demonstrated
through experiments in the IEEE 39-bus system and a practical 710-bus regional
power grid. The source code of the proposed method is made public on GitHub.
Comment: 10 pages, 11 figure