1,134 research outputs found

    High energy gravitational scattering: a numerical study

    The S-matrix in gravitational high energy scattering is computed from the region of large impact parameters b down to the regime where classical gravitational collapse is expected to occur. By solving the equation of an effective action introduced by Amati, Ciafaloni and Veneziano we find that the perturbative expansion around the leading eikonal result diverges at a critical value signalling the onset of a new regime. We then discuss the main features of our explicitly unitary S-matrix down to the Schwarzschild radius R = 2G s^(1/2), where it diverges at a critical value b ~ 2.22 R of the impact parameter. The nature of the singularity is studied with particular attention to the scaling behaviour of various observables at the transition. The numerical approach is validated by reproducing the known exact solution in the axially symmetric case to high accuracy. Comment: 11 pages, 6 figures

    Enhancing Exploration and Safety in Deep Reinforcement Learning

    A Deep Reinforcement Learning (DRL) agent tries to learn a policy maximizing a long-term objective by trial and error in large state spaces. However, this learning paradigm requires a non-trivial amount of interaction with the environment to achieve good performance. Moreover, critical applications, such as robotics, typically involve safety criteria to consider while designing novel DRL solutions. Hence, devising safe learning approaches with efficient exploration is crucial to avoid getting stuck in local optima, failing to learn properly, or causing damage to the surrounding environment. This thesis focuses on developing Deep Reinforcement Learning algorithms to foster efficient exploration and safer behaviors in simulation and real domains of interest, ranging from robotics to multi-agent systems. To this end, we rely both on standard benchmarks, such as SafetyGym, and robotic tasks widely adopted in the literature (e.g., manipulation, navigation). This variety of problems is crucial to assess the statistical significance of our empirical studies and the generalization skills of our approaches. We initially benchmark the sample efficiency versus performance trade-off between value-based and policy-gradient algorithms. This part highlights the benefits of using non-standard simulation environments (i.e., Unity), which also facilitates the development of further optimization for DRL. We also discuss the limitations of standard evaluation metrics (e.g., return) in characterizing the actual behaviors of a policy, proposing the use of Formal Verification (FV) as a practical methodology to evaluate behaviors over desired specifications. The second part introduces Evolutionary Algorithms (EAs) as a gradient-free complementary optimization strategy. In detail, we combine population-based and gradient-based DRL to diversify exploration and improve performance in both single- and multi-agent applications. For the latter, we discuss how prior Multi-Agent (Deep) Reinforcement Learning (MARL) approaches hinder exploration, proposing an architecture that favors cooperation without affecting exploration.

    Online Safety Property Collection and Refinement for Safe Deep Reinforcement Learning in Mapless Navigation

    Safety is essential for deploying Deep Reinforcement Learning (DRL) algorithms in real-world scenarios. Recently, verification approaches have been proposed to allow quantifying the number of violations of a DRL policy over input-output relationships, called properties. However, such properties are hard-coded and require task-level knowledge, making their application intractable in challenging safety-critical tasks. To this end, we introduce the Collection and Refinement of Online Properties (CROP) framework to design properties at training time. CROP employs a cost signal to identify unsafe interactions and uses them to shape safety properties. Hence, we propose a refinement strategy to combine properties that model similar unsafe interactions. Our evaluation compares the benefits of computing the number of violations using standard hard-coded properties and the ones generated with CROP. We evaluate our approach in several robotic mapless navigation tasks and demonstrate that the violation metric computed with CROP allows higher returns and lower violations over previous Safe DRL approaches. Comment: Accepted at the 2023 IEEE International Conference on Robotics and Automation (ICRA). Marzari and Marchesini contributed equally.

    Partially Observable Monte Carlo Planning with state variable constraints for mobile robot navigation

    Autonomous mobile robots employed in industrial applications often operate in complex and uncertain environments. In this paper we propose an approach based on an extension of Partially Observable Monte Carlo Planning (POMCP) for robot velocity regulation in industrial-like environments characterized by uncertain motion difficulties. The velocity selected by POMCP is used by a standard engine controller which deals with path planning. This two-layer approach allows POMCP to exploit prior knowledge on the relationships between task similarities to improve performance in terms of time spent to traverse a path with obstacles. We also propose three measures to support human understanding of the strategy used by POMCP to improve the performance. The overall architecture is tested on a Turtlebot3 in two environments, a rectangular path and a realistic production line in a research lab. Tests performed on a C++ simulator confirm the capability of the proposed approach to profitably use prior knowledge, achieving a performance improvement from 0.7% to 3.1% depending on the complexity of the path. Experiments on a Unity simulator show that the proposed two-layer approach also outperforms single-layer approaches based only on the engine controller (i.e., without the POMCP layer). In this case the performance improvement is up to 37% compared to a state-of-the-art deep reinforcement learning engine controller, and up to 51% compared to the standard ROS engine controller. Finally, experiments in a real-world testing arena confirm the possibility to run the approach on real robots.

    Enumerating Safe Regions in Deep Neural Networks with Provable Probabilistic Guarantees

    Identifying safe areas is a key point to guarantee trust for systems that are based on Deep Neural Networks (DNNs). To this end, we introduce the AllDNN-Verification problem: given a safety property and a DNN, enumerate the set of all the regions of the property input domain which are safe, i.e., where the property does hold. Due to the #P-hardness of the problem, we propose an efficient approximation method called epsilon-ProVe. Our approach exploits a controllable underestimation of the output reachable sets obtained via statistical prediction of tolerance limits, and can provide a tight (with provable probabilistic guarantees) lower estimate of the safe areas. Our empirical evaluation on different standard benchmarks shows the scalability and effectiveness of our method, offering valuable insights for this new type of verification of DNNs. Comment: Accepted at the 38th Annual AAAI Conference on Artificial Intelligence 2024

    Toward a unified TreeTalker data curation process

    The development of the Internet of Things (IoT) is revolutionizing environmental monitoring and research in macroecology. This technology allows for the deployment of sizeable diffuse sensing networks capable of continuous monitoring. Because of this property, the data collected from IoT networks can provide a testbed for scientific hypotheses across large spatial and temporal scales. Nevertheless, data curation is a necessary step to make large and heterogeneous datasets exploitable for synthesis analyses. This process includes data retrieval, quality assurance, standardized formatting, storage, and documentation. TreeTalkers are an excellent example of IoT applied to ecology. These are smart devices for synchronously measuring trees’ physiological and environmental parameters. A set of devices can be organized in a mesh and permit data collection from a single tree to plot or transect scale. The deployment of such devices over large-scale networks needs a standardized approach for data curation. For this reason, we developed a unified processing workflow according to the user manual. In this paper, we first introduce the concept of a unified TreeTalker data curation process. The idea was formalized into an R package, and it is freely available as open software. Secondly, we present the different functions available in “ttalkR”, and, lastly, we illustrate the application with a demonstration dataset. With such a unified processing approach, we propose a necessary data curation step to establish a new environmental cyberinfrastructure and allow for synthesis activities across environmental monitoring networks. Our data curation concept is the first step for supporting the TreeTalker data life cycle by improving accessibility and thus creating unprecedented opportunities for TreeTalker-based macroecological analyses.

    Learning State-Variable Relationships in POMCP: A Framework for Mobile Robots

    We address the problem of learning relationships on state variables in Partially Observable Markov Decision Processes (POMDPs) to improve planning performance. Specifically, we focus on Partially Observable Monte Carlo Planning (POMCP) and represent the acquired knowledge with a Markov Random Field (MRF). We propose, in particular, a method for learning these relationships on a robot as POMCP is used to plan future actions. Then, we present an algorithm that deals with cases in which the MRF is used on episodes having unlikely states with respect to the equality relationships represented by the MRF. Our approach acquires information from the agent’s action outcomes to adapt the MRF online if a mismatch is detected between the MRF and the true state. We test this technique on two domains, rocksample, a standard rover exploration task, and a problem of velocity regulation in industrial mobile robotic platforms, showing that the MRF adaptation algorithm improves the planning performance with respect to the standard approach, which does not adapt the MRF online. Finally, a ROS-based architecture is proposed, which allows running the MRF learning, the MRF adaptation, and MRF usage in POMCP on real robotic platforms. In this case, we successfully tested the architecture on a Gazebo simulator of rocksample. A video of the experiments is available in the Supplementary Material, and the code of the ROS-based architecture is available online.

    Weight Loss Expectations in Obese Patients and Treatment Attrition: An Observational Multicenter Study

    Objective: To investigate the influence of weight loss expectations (expected 1-year BMI loss, dream and maximum acceptable BMI) on attrition in obese patients seeking treatment. Research Methods and Procedures: Obese subjects (1785; 1393 women; median age, 46 years; median BMI, 36.7 kg/m²) seeking treatment in 23 Italian medical centers were evaluated. Baseline diet and weight history, weight loss expectations, and primary motivation for seeking treatment (health or improving appearance) were systematically recorded. Psychiatric distress, binge eating, and body image dissatisfaction were tested at baseline by self-administered questionnaires (Symptom Check List-90, Binge Eating Scale, and Body Uneasiness Test). Attrition and BMI change at 12 months were prospectively recorded. Results: At 12 months, 923 of 1785 patients (51.7%) had discontinued treatment. Compared with continuers, dropouts had a significantly lower age, a lower age at first dieting, lower dream BMI, a higher expected 1-year BMI loss, and a higher weight phobia. At logistic regression analysis, the strongest predictors of attrition at 12 months were lower age and higher expected 1-year BMI loss. The risk of drop-out increased systematically for each unit increase in expected BMI loss at 12 months (hazard ratio, 1.12; 95% confidence interval, 1.04 to 1.20; p = 0.0018). The risk was particularly elevated in the first 6 months. Discussion: Baseline weight loss expectations are independent cognitive predictors of attrition in obese patients entering a weight-losing program; the higher the expectations, the higher the attrition at 12 months. Unrealistic weight goals should be tackled at the very beginning of treatment.

    Assessment of chicken breast shelf life based on bench-top and portable near-infrared spectroscopy tools coupled with chemometrics

    Objectives: Near-infrared (NIR) spectroscopy is a rapid technique able to assess meat quality, although its capability to determine the shelf life of fresh chicken cuts is still debated, especially for portable devices. The aim of the study was to compare bench-top and portable NIR instruments, coupled with multivariate classifier models, in discriminating between four chicken breast refrigeration times (RT). Materials and Methods: Ninety-six samples were analysed by both NIR tools at 2, 6, 10 and 14 days post mortem. NIR data were subsequently submitted to partial least squares discriminant analysis (PLS-DA) and canonical discriminant analysis (CDA). The latter was preceded by double feature selection based on Boruta and Stepwise procedures. Results: PLS-DA yielded moderate separation of the RT groups, while shelf life assessment was more accurate on application of Stepwise-CDA. The bench-top tool performed better than the portable one, probably because it captured more informative spectral data, as shown by the variable importance in projection (VIP) and restricted pool of Stepwise-CDA predictive scores (SPS). Conclusions: NIR tools coupled with a multivariate model provide deep insight into the physicochemical processes occurring during storage. Spectroscopy showed reliable effectiveness to recognise a 7-day shelf life threshold of breasts, suitable for routine at-line application for screening of meat quality.