    Learning to soar: exploration strategies in reinforcement learning for resource-constrained missions

    An unpowered aerial glider learning to soar in a wind field presents a new manifestation of the exploration-exploitation trade-off. This thesis proposes a directed, adaptive and nonmyopic exploration strategy in a temporal difference reinforcement learning framework for tackling the resource-constrained exploration-exploitation task of this autonomous soaring problem. The complete learning algorithm is developed in a SARSA(λ) framework, which uses a Gaussian process with a squared exponential covariance function to approximate the value function. The three key contributions of this thesis form the proposed exploration-exploitation strategy. Firstly, a new information measure is derived from the change in the variance volume surrounding the Gaussian process estimate. This measure of information gain is used to define the exploration reward of an observation. Secondly, a nonmyopic information value is presented that captures both the immediate exploration reward due to taking an action and the future exploration opportunities that result. Finally, this information value is combined with the state-action value of SARSA(λ) through a dynamic weighting factor to produce an exploration-exploitation management scheme for resource-constrained learning systems. The proposed learning strategy encourages either exploratory or exploitative behaviour depending on the requirements of the learning task and the available resources. The performance of the learning algorithms presented in this thesis is compared against other SARSA(λ) methods. Results show that actively directing exploration to regions of the state-action space with high uncertainty improves the rate of learning, while dynamic management of the exploration-exploitation behaviour according to the available resources produces prudent learning behaviour in resource-constrained systems.
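
    As a rough illustration of why a Gaussian process estimate lends itself to uncertainty-directed exploration, the sketch below computes a squared exponential covariance and the posterior variance at a query point after a single noisy observation. It is a generic textbook construction, not the thesis's information measure or its SARSA(λ) implementation; the function names, the unit hyperparameters and the noise level are all assumptions chosen for illustration.

```python
import math

def squared_exponential(x1, x2, signal_var=1.0, length_scale=1.0):
    """Squared exponential covariance between two points (hyperparameters assumed)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return signal_var * math.exp(-0.5 * sq_dist / length_scale ** 2)

def posterior_variance_one_obs(x_query, x_obs, noise_var=0.01):
    """GP posterior variance at x_query given one noisy observation at x_obs."""
    prior = squared_exponential(x_query, x_query)
    cross = squared_exponential(x_query, x_obs)
    return prior - cross ** 2 / (squared_exponential(x_obs, x_obs) + noise_var)

# Hypothetical 2-D state-action points: one close to the observation, one far away.
near = posterior_variance_one_obs((0.1, 0.0), (0.0, 0.0))
far = posterior_variance_one_obs((3.0, 0.0), (0.0, 0.0))
print(f"variance near observation: {near:.4f}")
print(f"variance far from observation: {far:.4f}")
```

    The variance stays close to its prior value far from the observation and drops sharply near it, which is the kind of signal an exploration strategy can use to prefer poorly sampled regions.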

    The value of D-dimer in the detection of early deep-vein thrombosis after total knee arthroplasty in Asian patients: a cohort study

    Background and purpose: The relationship between D-dimer and deep-vein thrombosis (DVT) after total knee arthroplasty (TKA) remains controversial. The purpose of this study was to assess the value of D-dimer in the detection of early DVT after TKA. Methods: Plasma D-dimer levels were measured preoperatively and at day 7 postoperatively in 78 patients undergoing TKA. Ascending venography was performed 7 to 10 days after surgery. The plasma D-dimer levels were correlated statistically with the venographic DVT. Results: Venographic DVT was identified in 40% of patients. A high plasma D-dimer level (>2.0 μg/ml) was found in 68% of patients with DVT and 45% of patients without DVT (P < 0.05). A D-dimer level greater than 2.0 μg/ml therefore showed 68% sensitivity, 55% specificity, 60% accuracy, a 50% positive predictive rate and a 72% negative predictive rate in the detection of early DVT after TKA. Conclusion: A high plasma D-dimer level is a moderately sensitive but less specific marker for the detection of early DVT after TKA. Measurement of serum D-dimer alone is not accurate enough to detect DVT after TKA. Venography is recommended in patients with elevated D-dimer and clinically suspected but asymptomatic DVT after TKA.
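
    The reported percentages can be checked against a 2x2 table. The sketch below uses patient counts that are not given in the abstract: they are an assumed reconstruction from 78 patients with 40% venographic DVT (about 31 with DVT and 47 without), with 68% and 45% of those groups above the 2.0 μg/ml threshold. The function name and the counts are illustrative only.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic-test metrics from a 2x2 confusion table."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),   # positives among patients with DVT
        "specificity": tn / (tn + fp),   # negatives among patients without DVT
        "accuracy": (tp + tn) / total,
        "ppv": tp / (tp + fp),           # positive predictive rate
        "npv": tn / (tn + fn),           # negative predictive rate
    }

# Assumed counts (not reported in the abstract): 31 patients with DVT, 47 without.
metrics = diagnostic_metrics(tp=21, fp=21, fn=10, tn=26)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

    With these assumed counts the rounded values agree with the abstract's 68%, 55%, 60%, 50% and 72%, which suggests the reported figures are internally consistent.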

    THE MEASUREMENT OF (NUCLEAR G-FACTOR OF UNIPOSITIVE SODIUM-23 ION)/(ELECTRONIC G-FACTOR OF SODIUM-23) BY OPTICAL PUMPING

    Single-Mode Projection Filters for Modal Parameter Identification for Flexible Structures

    Single-mode projection filters are developed for eigensystem parameter identification from both analytical results and test data. Explicit formulations of these projection filters are derived using the orthogonal matrices of the controllability and observability matrices in the general sense. A global minimum optimization algorithm is applied to update the filter parameters using the interval analysis method. The updated modal parameters represent the characteristics of the test data. To illustrate this new approach, a numerical simulation for the MAST beam structure is shown, using a one-dimensional global optimization algorithm to identify modal frequencies and damping. Another numerical simulation of a ten-mode structure is also presented, using a two-dimensional global optimization algorithm to illustrate the feasibility of the new method. The projection filters are practical for parallel processing implementation.

    Projection filters for modal parameter estimate for flexible structures

    Single-mode projection filters are developed for eigensystem parameter estimates from both analytical results and test data. Explicit formulations of these projection filters are derived using the pseudoinverse matrices of the controllability and observability matrices in general use. A global minimum optimization algorithm is developed to update the filter parameters using the interval analysis method. Modal parameters can be attracted and updated in the global sense within a specific region by passing the experimental data through the projection filters. To illustrate this method, a numerical example is shown, using a one-dimensional global optimization algorithm to estimate modal frequencies and damping.

    Beef import market shares in Taiwan: implications for Australia

    Market shares of major beef suppliers to Taiwan, including Australia, the United States and New Zealand, were estimated econometrically to determine their relative competitiveness. The analysis, based on monthly data from June 1990 to August 1997, showed that relative prices and consumer incomes were important factors influencing suppliers' market shares. Specifically, the demand for Australian beef responded little to an increase in price and negatively to an increase in consumer income. Furthermore, the growth in Taiwan beef consumption has slowed down and Australian beef suppliers need to re-assess the market potential and develop appropriate marketing strategies to maintain competitiveness. Keywords: International Relations/Trade, Livestock Production/Industries

    Informative Path Planning for Active Field Mapping under Localization Uncertainty

    Information gathering algorithms play a key role in unlocking the potential of robots for efficient data collection in a wide range of applications. However, most existing strategies neglect the fundamental problem of the robot pose uncertainty, which is an implicit requirement for creating robust, high-quality maps. To address this issue, we introduce an informative planning framework for active mapping that explicitly accounts for the pose uncertainty in both the mapping and planning tasks. Our strategy exploits a Gaussian Process (GP) model to capture a target environmental field given the uncertainty on its inputs. For planning, we formulate a new utility function that couples the localization and field mapping objectives in GP-based mapping scenarios in a principled way, without relying on any manually tuned parameters. Extensive simulations show that our approach outperforms existing strategies, with reductions in mean pose uncertainty and map error. We also present a proof of concept in an indoor temperature mapping scenario. Comment: 8 pages, 7 figures, submission (revised) to Robotics & Automation Letters (and IEEE International Conference on Robotics and Automation)