441 research outputs found

    Warped Tori with Almost Non-Negative Scalar Curvature

    For sequences of warped product metrics on a 3-torus satisfying the scalar curvature bound $R_j \geq -\frac{1}{j}$, uniform upper volume and diameter bounds, and a uniform lower area bound on the smallest minimal surface, we find a subsequence which converges in both the Gromov-Hausdorff and the Sormani-Wenger Intrinsic Flat sense to a flat 3-torus. Comment: 21 pages. The second version has no changes to the estimates, just a change in title and some exposition in response to a request by a senior mathematician. Minor revisions suggested by the referee were made in version three. To appear in Geometriae Dedicata.
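The hypotheses described in the abstract above can be written compactly as follows (the symbols $V$, $D$, $A$ for the uniform bounds are illustrative names, not notation taken from the paper):

```latex
\[
  R_j \;\ge\; -\frac{1}{j}, \qquad
  \operatorname{Vol}(M_j) \le V, \qquad
  \operatorname{Diam}(M_j) \le D, \qquad
  \operatorname{MinA}(M_j) \ge A > 0,
\]
```

where $\operatorname{MinA}(M_j)$ denotes the area of the smallest closed minimal surface in $M_j$; under these uniform bounds a subsequence of the warped product metrics converges to a flat $3$-torus.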

    On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control

    Reinforcement learning is a framework for interactive decision-making in which incentives are sequentially revealed across time without a system dynamics model. Because it scales to continuous spaces, we focus on policy search, where one iteratively improves a parameterized policy with stochastic policy gradient (PG) updates. In tabular Markov Decision Problems (MDPs), under persistent exploration and suitable parameterization, global optimality may be obtained. By contrast, in continuous space, non-convexity poses a pathological challenge, as evidenced by existing convergence results being mostly limited to stationarity or arbitrary local extrema. To close this gap, we step towards persistent exploration in continuous space through policy parameterizations defined by heavy-tailed distributions with tail-index parameter alpha, which increases the likelihood of jumping in state space. Doing so invalidates smoothness conditions on the score function common in PG analysis. Thus, we establish how the convergence rate to stationarity depends on the policy's tail index alpha, a Hölder continuity parameter, integrability conditions, and an exploration tolerance parameter introduced here for the first time. Further, we characterize the dependence of the set of local maxima on the tail index through an exit- and transition-time analysis of a suitably defined Markov chain, identifying that policies associated with Lévy processes of heavier tail converge to wider peaks. In supervised learning, this phenomenon yields improved stability to perturbations; we corroborate that it also manifests in improved performance of policy search, especially when myopic and farsighted incentives are misaligned.
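The core intuition, that a heavier-tailed exploration distribution makes large jumps in action space far more likely, can be illustrated numerically. The sketch below is not the paper's algorithm; it simply compares tail probabilities of Gaussian noise (tail index alpha = 2) against Cauchy noise (alpha = 1), the two endpoints of the alpha-stable family.

```python
import numpy as np

# Illustrative sketch (not the paper's method): compare how often a
# light-tailed vs. a heavy-tailed exploration distribution produces a
# large jump. alpha = 2 recovers the Gaussian; alpha = 1 is the Cauchy.
rng = np.random.default_rng(0)
n = 100_000

gaussian_noise = rng.standard_normal(n)   # alpha = 2 (light tails)
cauchy_noise = rng.standard_cauchy(n)     # alpha = 1 (heavy tails)

# Empirical probability of an exploration step larger than 4 units.
p_gauss = np.mean(np.abs(gaussian_noise) > 4.0)
p_cauchy = np.mean(np.abs(cauchy_noise) > 4.0)

print(f"P(|jump| > 4): Gaussian {p_gauss:.5f}, Cauchy {p_cauchy:.5f}")
```

The Cauchy policy takes a jump beyond 4 units roughly 15% of the time, while the Gaussian essentially never does, which is the mechanism behind the persistent-exploration argument in the abstract.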

    PARL: A Unified Framework for Policy Alignment in Reinforcement Learning

    We present a novel unified bilevel optimization-based framework, \textsf{PARL}, formulated to address the recently highlighted critical issue of policy alignment in reinforcement learning using utility or preference-based feedback. We identify a major gap within current algorithmic designs for solving policy alignment due to a lack of precise characterization of the dependence of the alignment objective on the data generated by policy trajectories. This shortfall contributes to the sub-optimal performance observed in contemporary algorithms. Our framework addresses these concerns by explicitly parameterizing the distribution of the upper alignment objective (reward design) by the lower optimal variable (optimal policy for the designed reward). Interestingly, from an optimization perspective, our formulation leads to a new class of stochastic bilevel problems where the stochasticity at the upper objective depends upon the lower-level variable. To demonstrate the efficacy of our formulation in resolving alignment issues in RL, we devised an algorithm named \textsf{A-PARL} to solve the PARL problem, establishing sample complexity bounds of order $\mathcal{O}(1/T)$. Our empirical results substantiate that the proposed \textsf{PARL} can address the alignment concerns in RL by showing significant improvements (up to 63\% in terms of required samples) for policy alignment in large-scale environments of the DeepMind Control Suite and Meta-World tasks.
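The distinguishing structural feature, an upper-level expectation taken over data whose distribution depends on the lower-level optimum, can be sketched on a toy problem. The example below is a hypothetical one-dimensional stand-in, not the A-PARL algorithm: the lower level tracks its optimum for the current reward parameter r, and the upper level only sees samples drawn from a distribution centered at the lower-level variable.

```python
import numpy as np

# Toy bilevel sketch (hypothetical problem, not the paper's A-PARL):
# upper level chooses reward parameter r; lower level solves
# argmin_theta (theta - r)^2, so theta*(r) = r; upper level minimizes
# E_{x ~ N(theta*(r), 1)}[(x - target)^2], whose sampling distribution
# depends on the lower-level variable -- the structure PARL highlights.
rng = np.random.default_rng(1)
target = 3.0   # what the upper level (reward designer) wants
r = 0.0        # upper-level variable (reward parameters)
theta = 0.0    # lower-level variable (policy parameters)
lr = 0.05

for _ in range(2000):
    # Lower-level gradient step: theta tracks its optimum theta = r.
    theta -= lr * 2.0 * (theta - r)
    # Upper-level stochastic gradient: sample data from the policy's
    # distribution N(theta, 1), then update r (d theta / d r = 1 here).
    x = rng.normal(theta, 1.0)
    r -= lr * 2.0 * (x - target)

print(f"r = {r:.2f}, theta = {theta:.2f}")  # both should be near target = 3
```

Because the upper-level gradient is estimated from samples generated under the lower-level variable, ignoring that dependence (as the abstract argues contemporary methods do) would bias the update.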

    NASA Polynomial representation of molecular specific heats

    So-called NASA polynomials are widely used in plasma and combustion models to represent the specific heat of molecules as a function of temperature. In this work, we compute seven-term NASA polynomials for 464 molecules, of which 44 are cations and 9 are anions; polynomials are not currently available for almost 200 of these species. Calculation of the NASA polynomials utilises data provided by the ExoMol database, the HITRAN database, the diatomic partition functions computed by Barklem and Collet, and the JANAF thermodynamic tables. Our results are compared against existing polynomial compilations where available, and for cases where there are multiple datasets the recommended polynomials are identified. As proposed in the original compilation, the seven-term polynomials are fitted separately for the temperature ranges 200–1000 K and 1000–6000 K. In general, different data sources give good agreement in the lower temperature range, but there are significant discrepancies at higher temperatures, which can be attributed to the underlying assumptions made about highly excited rotation-vibration energy levels.
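In the standard seven-term form, coefficients a1-a5 give the dimensionless specific heat cp/R as a quartic in T, while a6 and a7 are integration constants for enthalpy and entropy; a sketch of the evaluation, with placeholder coefficients (not fitted values for any real molecule), is:

```python
import math

def nasa7_thermo(T, coeffs_low, coeffs_high, T_mid=1000.0):
    """Evaluate a seven-term NASA polynomial at temperature T (K).

    Returns dimensionless (cp/R, H/(R*T), S/R). The two coefficient
    sets cover the low (200-1000 K) and high (1000-6000 K) fitting
    ranges described above.
    """
    a = coeffs_low if T <= T_mid else coeffs_high
    cp_R = a[0] + a[1]*T + a[2]*T**2 + a[3]*T**3 + a[4]*T**4
    H_RT = (a[0] + a[1]*T/2 + a[2]*T**2/3 + a[3]*T**3/4
            + a[4]*T**4/5 + a[5]/T)
    S_R = (a[0]*math.log(T) + a[1]*T + a[2]*T**2/2
           + a[3]*T**3/3 + a[4]*T**4/4 + a[6])
    return cp_R, H_RT, S_R

# Placeholder coefficients (illustrative only): a rigid diatomic with
# translation + rotation + fully excited vibration absent has
# cp/R = 7/2, so set a1 = 3.5 and the other polynomial terms to zero.
low = [3.5, 0.0, 0.0, 0.0, 0.0, -1000.0, 4.0]
high = [3.5, 0.0, 0.0, 0.0, 0.0, -1000.0, 4.0]

cp_R, H_RT, S_R = nasa7_thermo(500.0, low, high)
print(cp_R)  # 3.5
```

Real coefficient sets additionally enforce continuity of cp, H, and S at the 1000 K breakpoint, which the placeholder values above trivially satisfy.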

    Developing anti-GDF6 therapeutics for treatment of advanced melanoma

    Melanoma, the leading cause of skin cancer death in the U.S., is increasing in incidence. Targeted therapies have been approved for treatment of advanced melanoma, but few patients experience extended survival benefit. In order to combat poor outcomes, new therapeutic targets are needed. Using cross-species oncogenomic analyses, our lab has identified a novel melanoma driver, Growth differentiation factor 6 (GDF6), a secreted bone morphogenetic protein (BMP) ligand that is amplified and overexpressed in human melanomas. Functional analyses show GDF6 acts via the BMP-SMAD1 pathway as a pro-survival factor in melanomas. Inhibiting GDF6 or the BMP pathway using shRNAs or the small molecule inhibitor DMH1 induces melanoma cell death, thereby abrogating melanoma growth in mouse xenografts. These results suggest GDF6 is an optimal target for melanoma therapy. In order to better understand the dynamics of GDF6 signaling in melanoma cells, we are currently investigating the effect of exogenous GDF6 on cells with inhibited GDF6 expression to determine the concentration required to activate SMAD1 signaling and rescue viability. As GDF6 is a secreted ligand, we proposed developing antibodies to block the interaction of GDF6 with its receptor, thereby inhibiting signaling. In collaboration with MassBiologics, we have generated a panel of monoclonal antibodies targeting GDF6. To identify antibodies capable of blocking GDF6 activity, we have devised a series of assays to winnow the panel. First, candidates are screened for affinity to GDF6. Second, candidates are screened for their ability to block the interaction between GDF6 and its receptor. Third, candidates are evaluated for their ability to inhibit downstream signaling via the SMAD1 pathway. After selection of final candidates, we will use a xenograft model to determine their ability to inhibit melanoma growth in vivo. Currently, we have identified antibodies that are able to recognize GDF6 via western blot, and are proceeding to screen these antibodies for anti-GDF6 activity.