Constructing a price deflator for R&D: calculating the price of knowledge investments as a residual
Working Paper
Understanding when Dynamics-Invariant Data Augmentations Benefit Model-Free Reinforcement Learning Updates
Recently, data augmentation (DA) has emerged as a method for leveraging
domain knowledge to inexpensively generate additional data in reinforcement
learning (RL) tasks, often yielding substantial improvements in data
efficiency. While prior work has demonstrated the utility of incorporating
augmented data directly into model-free RL updates, it is not well-understood
when a particular DA strategy will improve data efficiency. In this paper, we
seek to identify general aspects of DA responsible for observed learning
improvements. Our study focuses on sparse-reward tasks with dynamics-invariant
data augmentation functions, serving as an initial step towards a more general
understanding of DA and its integration into RL training. Experimentally, we
isolate three relevant aspects of DA: state-action coverage, reward density,
and the number of augmented transitions generated per update (the augmented
replay ratio). From our experiments, we draw two conclusions: (1) increasing
state-action coverage often has a much greater impact on data efficiency than
increasing reward density, and (2) decreasing the augmented replay ratio
substantially improves data efficiency. In fact, certain tasks in our empirical
study are solvable only when the replay ratio is sufficiently low.
On-Policy Policy Gradient Reinforcement Learning Without On-Policy Sampling
On-policy reinforcement learning (RL) algorithms perform policy updates using
i.i.d. trajectories collected by the current policy. However, after observing
only a finite number of trajectories, on-policy sampling may produce data that
fails to match the expected on-policy data distribution. This sampling error
leads to noisy updates and data-inefficient on-policy learning. Recent work in
the policy evaluation setting has shown that non-i.i.d., off-policy sampling
can produce data with lower sampling error than on-policy sampling can produce.
Motivated by this observation, we introduce an adaptive, off-policy sampling
method to improve the data efficiency of on-policy policy gradient algorithms.
Our method, Proximal Robust On-Policy Sampling (PROPS), reduces sampling error
by collecting data with a behavior policy that increases the probability of
sampling actions that are under-sampled with respect to the current policy.
Rather than discarding data from old policies -- as is commonly done in
on-policy algorithms -- PROPS uses data collection to adjust the distribution
of previously collected data to be approximately on-policy. We empirically
evaluate PROPS on both continuous-action MuJoCo benchmark tasks and
discrete-action tasks and demonstrate that PROPS (1) decreases sampling error
throughout training and (2) improves the data efficiency of on-policy policy
gradient algorithms. Our work improves the RL community's understanding of a
nuance in the on-policy vs. off-policy dichotomy: on-policy learning requires
on-policy data, not on-policy sampling.
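The sampling-error idea in this abstract can be illustrated with a toy sketch. It is not PROPS itself, which adapts a learned behavior policy; the sketch assumes a single state with three discrete actions and uses a hypothetical deterministic rule that always picks the action that is currently most under-sampled relative to the target policy. All function names and the policy values are illustrative assumptions.

```python
import random

def total_variation(counts, pi):
    """Sampling error: total variation distance between empirical action frequencies and pi."""
    n = sum(counts)
    return 0.5 * sum(abs(c / n - p) for c, p in zip(counts, pi))

def sample_iid(pi, n, rng):
    """Ordinary on-policy sampling: n i.i.d. draws from pi."""
    counts = [0] * len(pi)
    for a in rng.choices(range(len(pi)), weights=pi, k=n):
        counts[a] += 1
    return counts

def sample_undersampled_first(pi, n):
    """Toy non-i.i.d. sampler (an assumption, not PROPS): always take the action
    whose count lags furthest behind its expected count under pi."""
    counts = [0] * len(pi)
    for step in range(1, n + 1):
        deficits = [p * step - c for p, c in zip(pi, counts)]
        a = max(range(len(pi)), key=lambda i: deficits[i])
        counts[a] += 1
    return counts

pi = [0.5, 0.3, 0.2]          # hypothetical current policy at one state
rng = random.Random(0)
iid_counts = sample_iid(pi, 100, rng)
adaptive_counts = sample_undersampled_first(pi, 100)
print("i.i.d. sampling error:  ", total_variation(iid_counts, pi))
print("adaptive sampling error:", total_variation(adaptive_counts, pi))
```

In this toy setting the adaptive sampler keeps every action count within a small constant of its expected value, so the collected data match the policy's distribution closely even though the data were not drawn i.i.d. from it.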
A Comparison of Some Methods of Deriving the Instantaneous Unit Hydrograph
The geomorphological instantaneous unit hydrograph (IUH) proposed by Gupta et al. (1980) was compared with the IUH derived by the commonly used time-area and Nash methods. The comparison was performed by analyzing the effective rainfall-direct runoff relationship for four large basins in Central Italy ranging in area from 934 to 4,147 km².
The Nash method was found to be the most accurate of the three methods. The geomorphological method, with only one parameter estimated in advance from the observed data, was found to be only slightly less accurate than the Nash method, which has two parameters determined from observations. Furthermore, when the geomorphological and Nash methods employed the same information, represented by basin lag, they produced similar accuracy, provided the other Nash parameter, expressed by the product of peak flow and time to peak, was empirically assessed within a wide range of values. It was concluded that the geomorphological method is more appropriate for ungaged basins and the Nash method for gaged basins.
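As a rough illustration of the two Nash quantities mentioned above, the sketch below assumes the standard textbook form of the Nash IUH, a cascade of n equal linear reservoirs with storage constant k, whose ordinate is a gamma density. The abstract does not state this formula, and the values n = 3 and k = 4 (in hours) are illustrative assumptions, not results from the study.

```python
import math

def nash_iuh(t, n, k):
    """Nash IUH ordinate u(t) in 1/time: gamma density with shape n and scale k."""
    if t <= 0:
        return 0.0
    return (t / k) ** (n - 1) * math.exp(-t / k) / (k * math.gamma(n))

def nash_summary(n, k):
    """Return basin lag, time to peak, peak ordinate, and their product (n > 1)."""
    lag = n * k                    # first moment of the IUH
    t_peak = (n - 1) * k           # mode of the gamma density
    u_peak = nash_iuh(t_peak, n, k)
    return lag, t_peak, u_peak, u_peak * t_peak

lag, t_peak, u_peak, product = nash_summary(n=3.0, k=4.0)
print(f"lag = {lag} h, time to peak = {t_peak} h, peak x time-to-peak = {product:.4f}")
```

Under this form the product of peak ordinate and time to peak is dimensionless and depends on n alone, so basin lag (n times k) together with that product is sufficient to fix both parameters.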
Enhancing sustainability by improving plant salt tolerance through macro- and micro-algal biostimulants
Algal biomass, extracts, or derivatives have long been considered a valuable material to bring benefits to humans and cultivated plants. In recent decades, it has become evident that algal formulations can induce multiple effects on crops (including an increase in biomass, yield, and quality), and that algal extracts contain a series of bioactive compounds and signaling molecules, in addition to mineral and organic nutrients. The need to reduce non-renewable chemical inputs in agriculture has recently prompted an increase in the use of algal extracts as plant biostimulants, also because of their ability to promote plant growth in suboptimal conditions such as saline environments. In this article, we discuss some research areas that are critical for the implementation in agriculture of macro- and microalgae extracts as plant biostimulants. Specifically, we provide an overview of current knowledge and achievements regarding extraction methods, compositions, and action mechanisms of algal extracts, focusing on salt-stress tolerance. We also outline current limitations and possible research avenues. We conclude that the comparison and integration of knowledge on the molecular and physiological responses of plants to salt and to algal extracts should also guide extraction procedures and application methods. The effects of algal biostimulants have been investigated mainly from an applied perspective, and contributions from different scientific disciplines are still much needed for the development of new sustainable strategies to increase crop tolerance to salt stress.
Finite Fracture Mechanics extension to dynamic loading scenarios
The coupled criterion of Finite Fracture Mechanics (FFM) has already been successfully applied to assess brittle failure initiation in cracked and notched structures subjected to quasi-static loading conditions. The originality of FFM lies in addressing failure onset through the simultaneous fulfilment of a stress requirement and the energy balance, both computed over a finite distance ahead of the stress raiser. Accordingly, this length turns out to be a structural parameter and is thus able to interact with the geometry under investigation. This work aims at extending the FFM failure criterion to dynamic loadings. To this end, the general requisites of a proper dynamic failure criterion are first shortlisted. The novel Dynamic extension of FFM (DFFM) is then put forward, assuming the existence of a material time interval that is related to the coalescence period of microcracks upon macroscopic failure. On this basis, the DFFM model is investigated for the case in which a one-to-one relation holds between the external loading and both the dynamic stress field and the energy release rate. Under such a condition, the DFFM is also validated against suitable experimental data on rock materials from the literature and shown to properly capture the increase of the failure load as the loading rate rises, thus proving to be a novel technique suitable for modelling the rate dependence of failure initiation in brittle and quasi-brittle materials.