453 research outputs found
Warped Tori with Almost Non-Negative Scalar Curvature
For sequences of warped product metrics on a 3-torus satisfying the scalar
curvature bound $R_j \geq -\frac{1}{j}$, uniform upper volume and diameter
bounds, and a uniform lower area bound on the smallest minimal surface, we find
a subsequence which converges in both the Gromov-Hausdorff and the
Sormani-Wenger Intrinsic Flat sense to a flat 3-torus.
Comment: 21 pages. The second version has no changes to the estimates, just a
change in title and some exposition in response to a request by a senior
mathematician. Minor revisions suggested by the referee were made in version
three. To appear in Geometriae Dedicata.
On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control
Reinforcement learning is a framework for interactive decision-making with
incentives sequentially revealed across time without a system dynamics model.
Due to its scaling to continuous spaces, we focus on policy search where one
iteratively improves a parameterized policy with stochastic policy gradient
(PG) updates. In tabular Markov Decision Problems (MDPs), under persistent
exploration and suitable parameterization, global optimality may be obtained.
By contrast, in continuous space, the non-convexity poses a pathological
challenge as evidenced by existing convergence results being mostly limited to
stationarity or arbitrary local extrema. To close this gap, we step towards
persistent exploration in continuous space through policy parameterizations
given by heavier-tailed distributions, characterized by a tail-index parameter
alpha, which increases the likelihood of jumping in state space. Doing so
invalidates smoothness conditions of the score function common to PG. Thus, we
establish how the convergence rate to stationarity depends on the policy's tail
index alpha, a Hölder continuity parameter, integrability conditions, and an
exploration tolerance parameter introduced here for the first time. Further, we
characterize the dependence of the set of local maxima on the tail index
through an exit and transition time analysis of a suitably defined Markov
chain, identifying that policies associated with Lévy processes of a heavier
tail converge to wider peaks. This phenomenon yields improved stability to
perturbations in supervised learning, and we corroborate that it also manifests
as improved performance of policy search, especially when myopic and farsighted
incentives are misaligned.
PARL: A Unified Framework for Policy Alignment in Reinforcement Learning
We present a novel unified bilevel optimization-based framework,
\textsf{PARL}, formulated to address the recently highlighted critical issue of
policy alignment in reinforcement learning using utility or preference-based
feedback. We identify a major gap within current algorithmic designs for
solving policy alignment due to a lack of precise characterization of the
dependence of the alignment objective on the data generated by policy
trajectories. This shortfall contributes to the sub-optimal performance
observed in contemporary algorithms. Our framework addresses these concerns by
explicitly parameterizing the distribution of the upper alignment objective
(reward design) by the lower optimal variable (optimal policy for the designed
reward). Interestingly, from an optimization perspective, our formulation leads
to a new class of stochastic bilevel problems where the stochasticity at the
upper objective depends upon the lower-level variable. To demonstrate the
efficacy of our formulation in resolving alignment issues in RL, we devise an
algorithm named \textsf{A-PARL} to solve the PARL problem, establishing sample
complexity bounds of order $\mathcal{O}(1/T)$. Our empirical results
substantiate that the proposed \textsf{PARL} can address the alignment concerns
in RL by showing significant improvements (up to 63\% in terms of required
samples) for policy alignment in large-scale environments of the DeepMind
Control Suite and Meta-World tasks.
NASA Polynomial representation of molecular specific heats
So-called NASA polynomials are widely used in plasma and combustion models to represent the specific heat of molecules as a function of temperature. In this work, we compute seven-term NASA polynomials for 464 molecules, of which 44 are cations and 9 are anions; polynomials are not currently available for almost 200 of these species. Calculation of the NASA polynomials utilises data provided by the ExoMol database, the HITRAN database, the diatomic partition functions computed by Barklem and Collet, and the JANAF thermodynamic tables. Our results are compared against existing polynomial compilations where available, and for cases where there are multiple datasets, the recommended polynomials are identified. As proposed in the original compilation, the seven-term polynomials are fitted separately for the temperature ranges 200 – 1000 K and 1000 – 6000 K. In general, different data sources give good agreement in the lower temperature range, but there are significant discrepancies at higher temperatures, which can be attributed to the underlying assumptions made about highly excited rotation-vibration energy levels.
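As a minimal sketch of how a seven-term NASA polynomial is evaluated, the Python below uses the standard seven-coefficient form: Cp/R = a1 + a2 T + a3 T^2 + a4 T^3 + a5 T^4, with a6 and a7 entering only the enthalpy and entropy expressions. The function names and the coefficient sets LOW and HIGH are hypothetical placeholders chosen for illustration, not values from the paper, and the 200 K, 1000 K and 6000 K limits are taken from the abstract above.

```python
from math import log

R = 8.314462618  # molar gas constant, J/(mol*K)


def pick_coeffs(T, low, high, t_min=200.0, t_mid=1000.0, t_max=6000.0):
    """Return the coefficient set whose fitting range contains T (in kelvin)."""
    if not t_min <= T <= t_max:
        raise ValueError(f"T = {T} K is outside the fitted range")
    return low if T <= t_mid else high


def cp_over_R(a, T):
    """Dimensionless specific heat Cp/R from the first five coefficients."""
    return a[0] + a[1] * T + a[2] * T**2 + a[3] * T**3 + a[4] * T**4


def h_over_RT(a, T):
    """Dimensionless enthalpy H/(RT); a[5] is the integration constant."""
    return (a[0] + a[1] * T / 2 + a[2] * T**2 / 3
            + a[3] * T**3 / 4 + a[4] * T**4 / 5 + a[5] / T)


def s_over_R(a, T):
    """Dimensionless entropy S/R; a[6] is the integration constant."""
    return (a[0] * log(T) + a[1] * T + a[2] * T**2 / 2
            + a[3] * T**3 / 3 + a[4] * T**4 / 4 + a[6])


def specific_heat(T, low, high):
    """Molar specific heat Cp in J/(mol*K) at temperature T."""
    return R * cp_over_R(pick_coeffs(T, low, high), T)


# Placeholder coefficients (NOT real species data): a constant Cp/R = 7/2
# below 1000 K and a small linear term above it.
LOW = (3.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
HIGH = (3.5, 1.0e-4, 0.0, 0.0, 0.0, 0.0, 0.0)
```

Calling `specific_heat(300.0, LOW, HIGH)` selects the low-temperature set, while `specific_heat(2000.0, LOW, HIGH)` selects the high-temperature set. In practice the two sets are fitted so that the properties are continuous at 1000 K, which the placeholder values here do not attempt to reproduce.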
Developing anti-GDF6 therapeutics for treatment of advanced melanoma
Melanoma, the leading cause of skin cancer death in the U.S., is increasing in incidence. Targeted therapies have been approved for treatment of advanced melanoma, but few patients experience extended survival benefit. To combat poor outcomes, new therapeutic targets are needed. Using cross-species oncogenomic analyses, our lab has identified a novel melanoma driver, Growth differentiation factor 6 (GDF6), a secreted bone morphogenetic protein (BMP) ligand that is amplified and overexpressed in human melanomas. Functional analyses show GDF6 acts via the BMP-SMAD1 pathway as a pro-survival factor in melanomas. Inhibiting GDF6 or the BMP pathway using shRNAs or the small molecule inhibitor DMH1 induces melanoma cell death, thereby abrogating melanoma growth in mouse xenografts. These results suggest GDF6 is an optimal target for melanoma therapy. To better understand the dynamics of GDF6 signaling in melanoma cells, we are currently investigating the effect of exogenous GDF6 on cells with inhibited GDF6 expression to determine the concentration required to activate SMAD1 signaling and rescue viability. As GDF6 is a secreted ligand, we proposed developing antibodies to block the interaction of GDF6 with its receptor, thereby inhibiting signaling. In collaboration with MassBiologics, we have generated a panel of monoclonal antibodies targeting GDF6. To identify antibodies capable of blocking GDF6 activity, we have devised a series of assays to eliminate antibodies from the panel. First, candidates are screened for affinity to GDF6. Second, candidates are screened for ability to block the interaction between GDF6 and its receptor. Third, candidates are evaluated for ability to inhibit downstream signaling via the SMAD1 pathway. After selection of final candidates, we will use a xenograft model to determine their ability to inhibit melanoma growth in vivo.
Currently, we have identified antibodies that are able to recognize GDF6 via western blot, and are proceeding to screen these antibodies for anti-GDF6 activity.