123 research outputs found

    Reasoning with Latent Diffusion in Offline Reinforcement Learning

    Offline reinforcement learning (RL) holds promise as a means to learn high-reward policies from a static dataset, without the need for further environment interactions. However, a key challenge in offline RL lies in effectively stitching portions of suboptimal trajectories from the static dataset while avoiding extrapolation errors arising due to a lack of support in the dataset. Existing approaches use conservative methods that are tricky to tune and struggle with multi-modal data (as we show) or rely on noisy Monte Carlo return-to-go samples for reward conditioning. In this work, we propose a novel approach that leverages the expressiveness of latent diffusion to model in-support trajectory sequences as compressed latent skills. This facilitates learning a Q-function while avoiding extrapolation error via batch-constraining. The latent space is also expressive and gracefully copes with multi-modal data. We show that the learned temporally-abstract latent space encodes richer task-specific information for offline RL tasks as compared to raw state-actions. This improves credit assignment and facilitates faster reward propagation during Q-learning. Our method demonstrates state-of-the-art performance on the D4RL benchmarks, particularly excelling in long-horizon, sparse-reward tasks.

    Effects of ring-strain on the ultrafast photochemistry of cyclic ketones

    Ring-strain in cyclic organic molecules is well-known to influence their chemical reactivity. Here, we examine the consequence of ring-strain for competing photochemical pathways that occur on picosecond timescales. The significance of Norrish Type-I photochemistry is explored for three cyclic ketones in cyclohexane solutions at ultraviolet (UV) excitation wavelengths from 255–312 nm, corresponding to a π* ← n excitation to the lowest excited singlet state (S(1)). Ultrafast transient absorption spectroscopy with broadband UV/visible probe laser pulses reveals processes common to cyclobutanone, cyclopentanone and cyclohexanone, occurring on timescales of ≤1 ps, 7–9 ps and >500 ps. These kinetic components are respectively assigned to prompt cleavage of an α C–C bond in the internally excited S(1)-state molecules prepared by UV absorption, vibrational cooling of these hot-S(1) molecules to energies below the barrier to C–C bond cleavage on the S(1) state potential energy surface (with commensurate reductions in the energy-dependent α-cleavage rate), and slower loss of thermalized S(1)-state population. The thermalized S(1)-state molecules may competitively decay by activated reaction over the barrier to α C–C bond fission on the S(1)-state potential energy surface, internal conversion to the ground (S(0)) electronic state, or intersystem crossing to the lowest lying triplet state (T(1)) and subsequent C–C bond breaking. The α C–C bond fission barrier height in the S(1) state is significantly reduced by the ring-strain in cyclobutanone, affecting the relative contributions of the three decay time components which depend systematically on the excitation energy above the S(1)-state energy barrier. Transient infra-red absorption spectra obtained after UV excitation identify ring-opened ketene photoproducts of cyclobutanone and their timescales for formation.

    Machine Learning Based Path Planning for Improved Rover Navigation (Pre-Print Version)

    Enhanced AutoNav (ENav), the baseline surface navigation software for NASA's Perseverance rover, sorts a list of candidate paths for the rover to traverse, then uses the Approximate Clearance Evaluation (ACE) algorithm to evaluate whether the most highly ranked paths are safe. ACE is crucial for maintaining the safety of the rover, but is computationally expensive. If the most promising candidates in the list of paths are all found to be infeasible, ENav must continue to search the list and run time-consuming ACE evaluations until a feasible path is found. In this paper, we present two heuristics that, given a terrain heightmap around the rover, produce cost estimates that more effectively rank the candidate paths before ACE evaluation. The first heuristic uses Sobel operators and convolution to incorporate the cost of traversing high-gradient terrain. The second heuristic uses a machine learning (ML) model to predict areas that will be deemed untraversable by ACE. We used physics simulations to collect training data for the ML model and to run Monte Carlo trials to quantify navigation performance across a variety of terrains with various slopes and rock distributions. Compared to ENav's baseline performance, integrating the heuristics can lead to a significant reduction in ACE evaluations and average computation time per planning cycle, increase path efficiency, and maintain or improve the rate of successful traverses. This strategy of targeting specific bottlenecks with ML while maintaining the original ACE safety checks provides an example of how ML can be infused into planetary science missions and other safety-critical software.
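
    The gradient-based ranking idea in this abstract can be illustrated with a minimal sketch. It is not the ENav implementation: the function names (`apply_kernel`, `gradient_cost`, `path_cost`, `rank_paths`), the grid representation, and the choice to sum per-cell slope along a path are all assumptions made for illustration. It only shows how Sobel kernels applied to a heightmap yield a slope estimate that can order candidate paths before a more expensive safety check.

```python
import math

# Illustrative sketch only; names and cost definition are assumptions,
# not the ENav flight software.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def apply_kernel(heightmap, kernel):
    """Slide a 3x3 kernel over interior cells; border cells are left at 0.

    This is cross-correlation; true convolution flips the kernel, which only
    changes the sign for Sobel kernels and not the gradient magnitude.
    """
    rows, cols = len(heightmap), len(heightmap[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            out[r][c] = sum(
                kernel[i][j] * heightmap[r + i - 1][c + j - 1]
                for i in range(3)
                for j in range(3)
            )
    return out


def gradient_cost(heightmap, cell_size=1.0):
    """Per-cell slope magnitude (rise over run) estimated with Sobel kernels."""
    gx = apply_kernel(heightmap, SOBEL_X)
    gy = apply_kernel(heightmap, SOBEL_Y)
    # On a plane of slope a, a Sobel kernel responds with 8 * a * cell_size.
    scale = 8.0 * cell_size
    return [
        [math.hypot(gx[r][c], gy[r][c]) / scale for c in range(len(gx[0]))]
        for r in range(len(gx))
    ]


def path_cost(cost_map, path):
    """Sum the slope cost over the (row, col) cells a candidate path visits."""
    return sum(cost_map[r][c] for r, c in path)


def rank_paths(cost_map, paths):
    """Order candidate paths from lowest to highest accumulated slope cost."""
    return sorted(paths, key=lambda p: path_cost(cost_map, p))


if __name__ == "__main__":
    # Flat terrain on the left, a ramp rising 2 units per cell on the right.
    heights = [[0, 0, 0, 2, 4, 6] for _ in range(5)]
    costs = gradient_cost(heights)
    flat = [(1, 1), (2, 1), (3, 1)]
    steep = [(1, 4), (2, 4), (3, 4)]
    print(rank_paths(costs, [steep, flat])[0] == flat)
```

    In a pipeline like the one the abstract describes, such a ranking would only reorder candidates; each path would still have to pass the ACE check before being accepted.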

    The first interferometric detections of Fast Radio Bursts

    We present the first interferometric detections of Fast Radio Bursts (FRBs), an enigmatic new class of astrophysical transient. In a 180-day survey of the Southern sky we discovered 3 FRBs at 843 MHz with the UTMOST array, as part of commissioning science during a major ongoing upgrade. The wide field of view of UTMOST ($\approx 9$ deg$^{2}$) is well suited to FRB searches. The primary beam is covered by 352 partially overlapping fan-beams, each of which is searched for FRBs in real time with pulse widths in the range 0.655 to 42 ms, and dispersion measures $\leq 2000$ pc cm$^{-3}$. Detections of FRBs with the UTMOST array place a lower limit on their distances of $\approx 10^4$ km (limit of the telescope near-field), supporting the case for an astronomical origin. Repeating FRBs at UTMOST, or an FRB detected simultaneously with the Parkes radio telescope and UTMOST, would allow a few arcsec localisation, thereby providing an excellent means of identifying FRB host galaxies, if present. Up to 100 hours of follow-up for each FRB has been carried out with the UTMOST, with no repeating bursts seen. From the detected position, we present 3$\sigma$ error ellipses of 15 arcsec x 8.4 deg on the sky for the point of origin for the FRBs. We estimate an all-sky FRB rate at 843 MHz above a fluence ${\cal F}_\mathrm{lim}$ of 11 Jy ms of $\sim 78$ events sky$^{-1}$ d$^{-1}$ at the 95 percent confidence level. The measured rate of FRBs at 843 MHz is of order two times higher than we had expected, scaling from the FRB rate at the Parkes radio telescope, assuming that FRBs have a flat spectral index and a uniform distribution in Euclidean space. We examine how this can be explained by FRBs having a steeper spectral index and/or a flatter log$N$-log$\mathcal{F}$ distribution than expected for a Euclidean Universe. Comment: 13 pages, 8 figures, 2 tables
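
    The rate scaling mentioned in this abstract can be sketched under two simplifying assumptions that are not taken from the paper: every burst has a power-law spectrum $\mathcal{F}(\nu) \propto \nu^{\alpha}$, and the cumulative source counts at a reference frequency $\nu_1$ follow $N_{\nu_1}(>\mathcal{F}) = K\,\mathcal{F}^{\gamma}$. The symbols $\alpha$, $\gamma$, $K$, $\nu_1$ and $\nu_2$ are introduced here for illustration only.

```latex
% A burst of fluence F_1 at \nu_1 appears at \nu_2 with fluence
%   F_2 = F_1 (\nu_2 / \nu_1)^{\alpha},
% so exceeding a threshold F at \nu_2 means exceeding
% F (\nu_2 / \nu_1)^{-\alpha} at \nu_1:
\begin{align}
  N_{\nu_2}(>\mathcal{F})
    &= N_{\nu_1}\!\left(>\mathcal{F}\,(\nu_2/\nu_1)^{-\alpha}\right)
     = K\,\mathcal{F}^{\gamma}\,(\nu_2/\nu_1)^{-\alpha\gamma}, \\
  \frac{N_{\nu_2}(>\mathcal{F})}{N_{\nu_1}(>\mathcal{F})}
    &= \left(\frac{\nu_2}{\nu_1}\right)^{-\alpha\gamma}
     \;\xrightarrow{\;\gamma = -3/2\;}\;
     \left(\frac{\nu_2}{\nu_1}\right)^{3\alpha/2}.
\end{align}
```

    With a flat spectrum ($\alpha = 0$) the ratio is 1 at a fixed fluence threshold, and for $\nu_2 < \nu_1$ a negative $\alpha$ raises the predicted rate at the lower frequency. The sketch ignores differences in fluence thresholds and selection effects between instruments.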

    The UTMOST: A hybrid digital signal processor transforms the MOST

    The Molonglo Observatory Synthesis Telescope (MOST) is an 18,000 square meter radio telescope situated some 40 km from the city of Canberra, Australia. Its operating band (820-850 MHz) is now partly allocated to mobile phone communications, making radio astronomy challenging. We describe how the deployment of new digital receivers (RX boxes), Field Programmable Gate Array (FPGA) based filterbanks and server-class computers equipped with 43 GPUs (Graphics Processing Units) has transformed MOST into a versatile new instrument (the UTMOST) for studying the dynamic radio sky on millisecond timescales, ideal for work on pulsars and Fast Radio Bursts (FRBs). The filterbanks, servers and their high-speed, low-latency network form part of a hybrid solution to the observatory's signal processing requirements. The emphasis on software and commodity off-the-shelf hardware has enabled rapid deployment through the re-use of proven 'software backends' for its signal processing. The new receivers have ten times the bandwidth of the original MOST and double the sampling of the line feed, which doubles the field of view. The UTMOST can simultaneously excise interference, make maps, coherently dedisperse pulsars, and perform real-time searches of coherent fan beams for dispersed single pulses. Although system performance is still sub-optimal, a pulsar timing and FRB search programme has commenced and the first UTMOST maps have been made. The telescope operates as a robotic facility, deciding how to efficiently target pulsars and how long to stay on source, via feedback from real-time pulsar folding. The regular timing of over 300 pulsars has resulted in the discovery of 7 pulsar glitches and 3 FRBs. The UTMOST demonstrates that if sufficient signal processing can be applied to the voltage streams it is possible to perform innovative radio science in hostile radio frequency environments. Comment: 12 pages, 6 figures
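
    To give a sense of what searching for "dispersed single pulses" involves, the sketch below evaluates the standard cold-plasma dispersion delay across the 820-850 MHz band quoted in the abstract. The function name `dispersion_delay_ms` and the example dispersion measure of 2000 pc cm^-3 are assumptions chosen for illustration; this is not the UTMOST pipeline.

```python
# Illustrative sketch only; not taken from the UTMOST software.
# Standard dispersion constant, in ms GHz^2 per (pc cm^-3).
K_DM_MS = 4.148808


def dispersion_delay_ms(dm, f_lo_mhz, f_hi_mhz):
    """Arrival-time delay (ms) of the low band edge relative to the high edge.

    dm        -- dispersion measure in pc cm^-3
    f_lo_mhz  -- lower band edge in MHz
    f_hi_mhz  -- upper band edge in MHz
    """
    f_lo_ghz = f_lo_mhz / 1000.0
    f_hi_ghz = f_hi_mhz / 1000.0
    return K_DM_MS * dm * (f_lo_ghz ** -2 - f_hi_ghz ** -2)


if __name__ == "__main__":
    # Hypothetical DM value; the band edges are those quoted in the abstract.
    delay = dispersion_delay_ms(2000.0, 820.0, 850.0)
    print(f"Delay across 820-850 MHz: {delay:.1f} ms")
```

    Because the delay grows linearly with dispersion measure, a search has to try many trial values and shift the frequency channels accordingly before summing them, which is one reason such searches are computationally demanding.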

    Detection of a glitch in the pulsar J1709-4429

    We report the detection of a glitch event in the pulsar J1709-4429 (also known as B1706-44) during regular monitoring observations with the Molonglo Observatory Synthesis Telescope (UTMOST). The glitch was found during timing operations, in which we regularly observe over 400 pulsars with up to daily cadence, while commensally searching for Rotating Radio Transients, pulsars, and FRBs. With a fractional size of $\Delta\nu/\nu \approx 52.4 \times 10^{-9}$, the glitch reported here is by far the smallest known for this pulsar, attesting to the efficacy of glitch searches with high cadence using UTMOST. Comment: 3 pages, 1 figure