5,727 research outputs found

    Sample Efficient Policy Search for Optimal Stopping Domains

    Optimal stopping problems consider the question of deciding when to stop an observation-generating process in order to maximize a return. We examine the problem of simultaneously learning and planning in such domains, when data is collected directly from the environment. We propose GFSE, a simple and flexible model-free policy search method that reuses data for sample efficiency by leveraging problem structure. We bound the sample complexity of our approach to guarantee uniform convergence of policy value estimates, tightening existing PAC bounds to achieve logarithmic dependence on horizon length for our setting. We also examine the benefit of our method against prevalent model-based and model-free approaches on 3 domains taken from diverse fields. Comment: To appear in IJCAI-201
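    The abstract does not describe how GFSE reuses data, so the following is not GFSE. It is a minimal sketch of one kind of data reuse that optimal stopping structure can allow, under the assumption (ours, not the abstract's) that the decision to stop does not influence the observations themselves. Under that assumption, a single set of full-length trajectories can be replayed to estimate the value of many stopping policies. The names `evaluate_stopping_policy`, `should_stop`, and `reward` are hypothetical.

```python
def evaluate_stopping_policy(trajectories, should_stop, reward):
    """Estimate a stopping policy's mean return by replaying logged trajectories.

    Assumption (hypothetical, not taken from the abstract): observations do not
    depend on the stopping decision, so each full-length trajectory can be
    reused for every candidate policy.
    """
    total = 0.0
    for observations in trajectories:
        for t, obs in enumerate(observations):
            last_step = t == len(observations) - 1
            # The policy must stop at the horizon if it has not stopped earlier.
            if should_stop(t, obs) or last_step:
                total += reward(t, obs)
                break
    return total / len(trajectories)


# Toy data: two logged trajectories; the reward is the observation at stopping time.
logged = [[1, 3, 2], [2, 1, 5]]

def threshold_policy(c):
    # Stop as soon as the observation reaches the threshold c.
    return lambda t, obs: obs >= c

def observed_value(t, obs):
    return obs

values = {
    c: evaluate_stopping_policy(logged, threshold_policy(c), observed_value)
    for c in (2, 3, 10)
}
```

    The same two trajectories serve all three threshold policies, which is the point of the sketch: no new environment interaction is needed per candidate policy.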

    Hybridation of Bayesian networks and evolutionary algorithms for multi-objective optimization in an integrated product design and project management context

    Better integration of preliminary product design and project management processes at the early steps of system design is now a key industrial issue. The aim is therefore to move firms from the classical sequential approach (first product design, then project design and management) to new integrated approaches. In this paper, a model for integrated product/project optimization is first proposed, which takes into account decisions coming from the product and project managers simultaneously. However, the resulting model has significant underlying complexity, and a multi-objective optimization technique is required to provide managers with appropriate scenarios in a reasonable amount of time. The proposed approach is based on an original evolutionary algorithm called evolutionary algorithm oriented by knowledge (EAOK). This algorithm is based on the interaction between an adapted evolutionary algorithm (EA) and a model of knowledge (MoK) used to give relevant orientations during the search process. The evolutionary operators of the EA are modified to take these orientations into account. The MoK is based on the Bayesian network (BN) formalism and is built both from expert knowledge and from individuals generated by the EA. A learning process updates the probabilities of the BN from a set of selected individuals. At each cycle of the EA, the probabilities contained in the MoK are used to bias the new evolutionary operators. This method ensures both faster and effective optimization, and it also provides the decision maker with a graphical and interactive model of knowledge linked to the studied project. An experimental platform has been developed to test the algorithm, and a large campaign of tests makes it possible to compare different strategies as well as the benefits of this novel approach in comparison with a classical EA.

    Global Optimization for Future Gravitational Wave Detectors' Sites

    We consider the optimal site selection of future generations of gravitational wave detectors. Previously, Raffai et al. optimized a 2-detector network with a combined figure of merit. This optimization was extended to networks with more than two detectors in a limited way by first fixing the parameters of all other component detectors. In this work we present a more general optimization that allows the locations of all detectors to be chosen simultaneously. We follow the definition of Raffai et al. for the metric that defines the suitability of a given detector network. Given the locations of the component detectors in the network, we compute a measure of the network's ability to distinguish the polarization, constrain the sky localization and reconstruct the parameters of a gravitational wave source. We further define the 'flexibility index' of a possible site location by counting the number of multi-detector networks with a sufficiently high figure of merit that include that site location. We confirm the conclusion of Raffai et al. that, in terms of the flexibility index as defined in this work, Australia hosts the best candidate site to build a future generation gravitational wave detector. This conclusion is valid for either a 3-detector network or a 5-detector network. For a 3-detector network, site locations in Northern Europe display a flexibility index comparable to that of sites in Australia. However, for a 5-detector network, Australia is found to be a clearly better candidate than any other location. Comment: 30 pages, 23 figures, 2 table
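    The flexibility index described in this abstract is a count, so it can be illustrated with a short sketch. The code below assumes a finite list of candidate sites, a fixed network size, and a threshold on the figure of merit. The function `figure_of_merit` is a placeholder supplied by the caller; the toy version used here (a sum of made-up per-site scores) is purely hypothetical and is not the metric of Raffai et al.

```python
from itertools import combinations


def flexibility_index(sites, figure_of_merit, threshold, network_size):
    """Count, for each site, the qualifying networks that include it.

    A network qualifies when figure_of_merit(network) >= threshold.
    The figure-of-merit function is supplied by the caller (hypothetical here).
    """
    counts = {site: 0 for site in sites}
    for network in combinations(sites, network_size):
        if figure_of_merit(network) >= threshold:
            for site in network:
                counts[site] += 1
    return counts


# Toy example with made-up per-site scores (not real detector data).
toy_scores = {"A": 1, "B": 2, "C": 3, "D": 4}

def toy_figure_of_merit(network):
    # Placeholder metric: sum of the made-up site scores.
    return sum(toy_scores[site] for site in network)

toy_counts = flexibility_index(
    sites=list(toy_scores),
    figure_of_merit=toy_figure_of_merit,
    threshold=8,
    network_size=3,
)
```

    In this toy case only the networks (A, C, D) and (B, C, D) reach the threshold, so sites C and D each receive an index of 2 while A and B receive 1. Exhaustive enumeration grows combinatorially with the number of candidate sites, so a real study would likely need a coarser grid or a smarter search.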

    Non-linear carbon dioxide determination using infrared gas sensors and neural networks with Bayesian regularization

    This work presents the determination of carbon dioxide gas concentration using infrared gas sensors combined with Bayesian-regularized neural networks. An infrared sensor with a measuring range of 0~5% was used to measure carbon dioxide gas concentration within the range 0~15000 ppm. Neural networks were employed to model the nonlinear output of the sensor. The Bayesian strategy was used to regularize the training of the back-propagation neural network with a Levenberg-Marquardt (LM) algorithm. With Bayesian regularization (BR), the design of the network was adapted to the complexity of the application. The Levenberg-Marquardt algorithm under Bayesian regularization has better generalization capability and is more stable than the classical method. The results showed that the Bayesian-regularized neural network is a powerful tool for handling an infrared gas sensor with a large non-linear measuring range, and that it provides precise determination of carbon dioxide gas concentration. In this example, the optimal architecture of the network was one neuron each in the input and output layers and two neurons in the hidden layer. The network model gave a correlation coefficient of 0.9996 between targets and outputs. The prediction recoveries were within 99.9~100.0%.