61 research outputs found
Hybridizing Constraint Programming and Monte-Carlo Tree Search: Application to the Job Shop problem
Constraint Programming (CP) solvers classically explore the solution space using tree-search-based heuristics. Monte-Carlo Tree Search (MCTS), a tree-search-based method aimed at sequential decision making under uncertainty, simultaneously estimates the reward associated with the sub-trees and gradually biases the exploration toward the most promising regions. This paper examines the tight combination of MCTS and CP on the job shop problem (JSP). The contribution is twofold. First, a reward function compliant with the CP setting is proposed. Second, a biased MCTS node-selection rule based on this reward is proposed that is suitable in a multiple-restarts context. Its integration within the Gecode constraint solver is shown to compete with JSP-specific CP approaches on difficult JSP instances.
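For orientation, the generic idea of biasing exploration toward promising sub-trees can be sketched with the standard UCB1 selection rule. This is not the biased rule or the CP-specific reward proposed in the paper, which the abstract does not detail; the function name `ucb1_select` and the `(total_reward, visits)` representation of child nodes are assumptions made for illustration.

```python
import math

def ucb1_select(children, c=math.sqrt(2)):
    """Pick a child index with the standard UCB1 rule.

    children: list of (total_reward, visits) pairs (hypothetical layout).
    c: exploration constant; larger values favor rarely visited children.
    """
    total_visits = sum(visits for _, visits in children)
    best_index, best_score = None, float("-inf")
    for index, (total_reward, visits) in enumerate(children):
        if visits == 0:
            # Unvisited children are tried first.
            return index
        mean_reward = total_reward / visits                      # exploitation term
        bonus = c * math.sqrt(math.log(total_visits) / visits)   # exploration term
        score = mean_reward + bonus
        if score > best_score:
            best_index, best_score = index, score
    return best_index
```

With equal visit counts the rule prefers the child with the higher mean reward; a child with very few visits can still win through the exploration bonus.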
Parameter estimation of the kinetic α-Pinene isomerization model using the MCSfilter algorithm
This paper aims to illustrate the application of a derivative-free multistart algorithm with coordinate search filter, designated as the MCSFilter algorithm. The problem used in this study is the parameter estimation problem of the kinetic α-pinene isomerization model. This is a well-known nonlinear optimization problem (NLP) that has been investigated as a case study for performance testing of most derivative-based methods proposed in the literature. Since the MCSFilter algorithm features a stochastic component, it was run ten times to solve the NLP problem. The optimization problem was successfully solved in all the runs, and the optimal solution demonstrates that the MCSFilter provides a good-quality solution.
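A minimal sketch of the multistart-plus-coordinate-search idea follows. It is a simplification, not the MCSFilter algorithm: the filter mechanism for handling constraints is omitted, and the objective is a hypothetical two-parameter toy function rather than the α-pinene model. All names (`coordinate_search`, `multistart`, `toy_objective`) are assumptions made for illustration.

```python
import random

def coordinate_search(f, x0, step=1.0, tol=1e-6, max_iter=10_000):
    """Derivative-free local search: poll +/- step along each coordinate."""
    x, fx = list(x0), f(x0)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = x.copy()
                trial[i] += delta
                f_trial = f(trial)
                if f_trial < fx:
                    x, fx, improved = trial, f_trial, True
                    break
        if not improved:
            step *= 0.5  # shrink the poll step when no coordinate move helps
    return x, fx

def multistart(f, bounds, n_starts=10, seed=0):
    """Run the local search from random starting points; keep the best result."""
    rng = random.Random(seed)  # the random starts are the stochastic component
    best_x, best_f = None, float("inf")
    for _ in range(n_starts):
        x0 = [rng.uniform(lo, hi) for lo, hi in bounds]
        x, fx = coordinate_search(f, x0)
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f

def toy_objective(p):
    # Hypothetical stand-in for a parameter-estimation residual; minimum at (1, -2).
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2

best_x, best_f = multistart(toy_objective, [(-5.0, 5.0), (-5.0, 5.0)])
```

Because the starting points are random, repeating the run with different seeds mirrors the abstract's practice of solving the problem several times and comparing the results.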
Warm-Start AlphaZero Self-Play Search Enhancements
Recently, AlphaZero has achieved landmark results in deep reinforcement
learning, by providing a single self-play architecture that learned three
different games at a superhuman level. AlphaZero is a large and complicated
system with many parameters, and success requires much compute power and
fine-tuning. Reproducing results in other games is a challenge, and many
researchers are looking for ways to improve results while reducing
computational demands. AlphaZero's design is purely based on self-play and
makes no use of labeled expert data or domain-specific enhancements; it is
designed to learn from scratch. We propose a novel approach to deal with this
cold-start problem by employing simple search enhancements at the beginning
phase of self-play training, namely Rollout, Rapid Action Value Estimate (RAVE)
and dynamically weighted combinations of these with the neural network, and
Rolling Horizon Evolutionary Algorithms (RHEA). Our experiments indicate that
most of these enhancements improve the performance of their baseline player in
three different (small) board games, with RAVE-based variants playing
especially strongly.
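As an illustration of a dynamically weighted combination, the sketch below blends a RAVE estimate with a second value estimate using a weight that decays as a node is visited more often. The schedule β = sqrt(k / (3n + k)) is one commonly used choice from the RAVE literature; it is an assumption here and not necessarily the weighting used in this paper, as are the function names and the equivalence parameter `k`.

```python
import math

def rave_weight(visits, k=1000.0):
    """Weight given to the RAVE estimate; equals 1 at zero visits and decays toward 0."""
    return math.sqrt(k / (3.0 * visits + k))

def blended_value(q_other, q_rave, visits, k=1000.0):
    """Combine a RAVE estimate with another value estimate for the same move.

    q_other: e.g. a Monte-Carlo or network-based value (hypothetical input).
    q_rave:  the Rapid Action Value Estimate for the move.
    """
    beta = rave_weight(visits, k)
    return (1.0 - beta) * q_other + beta * q_rave
```

Early in the search the blended value follows the RAVE estimate, which accumulates statistics quickly; as the visit count grows, the other estimate takes over.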
Constraint-Based Modeling and Kinetic Analysis of the Smad Dependent TGF-β Signaling Pathway
Background
Investigation of the dynamics and regulation of the TGF-β signaling pathway is central to the understanding of complex cellular processes such as growth, apoptosis, and differentiation. In this study, we aim to use a systems biology approach to provide a dynamic analysis of this pathway.
Methodology/Principal Findings
We propose a constraint-based modeling method to build a comprehensive mathematical model of the Smad-dependent TGF-β signaling pathway by fitting the experimental data and incorporating qualitative constraints from the experimental analysis. The model generated by the constraint-based modeling method performs significantly better than the model obtained by fitting the quantitative data alone. The model agrees well with the experimental analysis of the TGF-β pathway, such as the time course of nuclear phosphorylated Smad, the subcellular location of Smad, and the signal response of Smad phosphorylation to different doses of TGF-β.
Conclusions/Significance
The simulation results indicate that the signal response to TGF-β is regulated by the balance between clathrin-dependent endocytosis and non-clathrin-mediated endocytosis. This model can be built upon as new, more precise experimental data emerge. The constraint-based modeling method can also be applied to quantitative modeling of other signaling pathways.