    Adaptive signal control using approximate dynamic programming

    This paper presents a concise summary of a study on an adaptive traffic signal controller for real-time operation. The controller is designed to achieve three operational objectives: first, it adopts a dual control principle to balance the influence of immediate and long-term costs; second, it switches signals without referring to a preset plan and is acyclic; third, it adjusts its parameters online to adapt to new environments. No existing operational controller offers all of these features. Although dynamic programming (DP) is the only exact solution for achieving these objectives, it is usually impractical for real-time operation because of its demands on computation and information. To circumvent these difficulties, we use approximate dynamic programming (ADP) in conjunction with online learning techniques. This approach substantially reduces the computational burden by replacing the exact value function of DP with a continuous linear approximation function, which is then updated progressively by online learning. Two online learning techniques, reinforcement learning and monotonicity approximation, are investigated. We find in computer simulation that the ADP controller leads to substantial savings in vehicle delays compared with optimised fixed-time plans. The implications of this study for traffic control are: the ADP controller meets all three operational objectives with competitive results and can be readily implemented at both isolated intersections and in traffic networks; the ADP algorithm is computationally efficient, and the ADP controller is an evolving system that requires minimal human intervention; and the ADP technique offers a flexible theoretical framework in which a range of functional forms and learning techniques can be further studied.
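    The core ADP idea above can be sketched in a few lines: replace the exact DP value function with a linear approximation V(s) ≈ w·φ(s) and update the weights online with a temporal-difference rule. The feature map and cost values below are hypothetical placeholders, not the paper's actual controller design.

```python
import numpy as np

def phi(state):
    """Hypothetical feature vector, e.g. queue lengths per approach."""
    return np.asarray(state, dtype=float)

def td_update(w, state, cost, next_state, alpha=0.05, gamma=0.95):
    """One reinforcement-learning step for the linear value function."""
    v = w @ phi(state)
    v_next = w @ phi(next_state)
    td_error = cost + gamma * v_next - v      # one-step Bellman residual
    return w + alpha * td_error * phi(state)  # gradient-style correction

w = np.zeros(2)
# Simulated transition: queues shrink after a good signal switch.
w = td_update(w, state=[4.0, 2.0], cost=6.0, next_state=[3.0, 1.0])
print(w)  # → [1.2 0.6]
```

    The weight vector is updated progressively from observed transitions, which is what lets the controller adapt without recomputing an exact DP solution.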

    Cover Tree Bayesian Reinforcement Learning

    This paper proposes an online tree-based Bayesian approach for reinforcement learning. For inference, we employ a generalised context tree model. This defines a distribution on multivariate Gaussian piecewise-linear models, which can be updated in closed form. The tree structure itself is constructed using the cover tree method, which remains efficient in high-dimensional spaces. We combine the model with Thompson sampling and approximate dynamic programming to obtain effective exploration policies in unknown environments. The flexibility and computational simplicity of the model render it suitable for many reinforcement learning problems in continuous state spaces. We demonstrate this in an experimental comparison with least-squares policy iteration.
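    The Thompson-sampling ingredient can be illustrated in isolation: maintain a posterior over model parameters, sample one model, and act greedily under that sample. Here a toy Gaussian posterior over the mean reward of two actions stands in for the paper's generalised context tree; everything below is an illustrative simplification, not the authors' algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Posterior mean and variance per action (conjugate Gaussian model).
mu = np.zeros(2)
var = np.ones(2)
counts = np.zeros(2)

def select_action():
    """Sample a reward model from the posterior; exploit the sample."""
    sampled = rng.normal(mu, np.sqrt(var))
    return int(np.argmax(sampled))

def update(action, reward, obs_var=1.0):
    """Standard Gaussian conjugate update for the chosen action."""
    precision = 1.0 / var[action] + 1.0 / obs_var
    mu[action] = (mu[action] / var[action] + reward / obs_var) / precision
    var[action] = 1.0 / precision
    counts[action] += 1

for _ in range(200):
    a = select_action()
    update(a, reward=rng.normal(1.0 if a == 1 else 0.0, 0.5))

print(counts)  # exploration should settle on the better action
```

    Because exploration is driven by posterior uncertainty rather than an explicit bonus, the same pattern extends naturally to richer models such as the tree-structured one used in the paper.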

    Development of Online Interactive Learning Media Using GeoGebra-Assisted Easyclass for Linear Programming Material

    This development research produces online interactive learning media using GeoGebra-assisted Easyclass for linear programming material. The purpose of this study is to describe the process and results of developing this media. The research used Thiagarajan's 4-D model and was conducted at MAN 2 Jember with 24 students of class XI IPA 4. The results showed a validity correlation of 0.917, so the media was declared valid in the very-high category. Practicality, analysed from a student response questionnaire, reached 99%, so the media was declared practical with excellent criteria. Effectiveness was analysed from students' test scores after using the media: 88% of students scored more than 75, which is categorised as very good. Based on the data analysis, the online interactive learning media using GeoGebra-assisted Easyclass for linear programming material meets the criteria of validity, practicality, and effectiveness, so it can be used in learning activities. Keywords: Development, Easyclass, GeoGebra, Linear Programming

    The Effect of Online Learning Media on PWPB Learning Outcomes of Class XII RPL at SMK N 1 Bukit Sundi

    This study aims to determine the effect of online learning (X) on learning outcomes (Y) in the web and mobile-device programming (PWPB) subject at SMK N 1 Bukit Sundi. The study uses a quantitative approach, collecting data through a questionnaire distributed to students. The population was all 40 class XII students majoring in RPL at SMK N 1 Bukit Sundi, and all 40 were sampled. Data were analysed by simple linear regression with IBM SPSS Statistics 26. The results show that online learning has a positive effect on learning outcomes in the PWPB subject at SMK N 1 Bukit Sundi. The fitted simple linear regression equation is Y = 33.258 + 0.701X. The t-count of 4.284 exceeds the t-table value of 2.024, and the p-value is less than 0.05, so H0 is rejected; thus, online learning has a positive effect on PWPB learning outcomes of class XII RPL. The coefficient of determination is R Square = 0.326, meaning that 32.6% of the variation in PWPB learning outcomes is explained by online learning.
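    The analysis reported above (simple linear regression, t-test on the slope, R²) can be reproduced from first principles. The toy data below are made up for illustration; the study's actual figures (Y = 33.258 + 0.701X, t = 4.284, R² = 0.326) came from SPSS on the real questionnaire data.

```python
import numpy as np

def simple_regression(x, y):
    """Fit Y = a + bX; return intercept, slope, R^2, and the
    t-statistic for H0: b = 0 (the 't-count')."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    b = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)  # slope
    a = y.mean() - b * x.mean()                          # intercept
    resid = y - (a + b * x)
    ss_res, ss_tot = (resid ** 2).sum(), ((y - y.mean()) ** 2).sum()
    r2 = 1.0 - ss_res / ss_tot
    se_b = np.sqrt(ss_res / (n - 2) / ((x - x.mean()) ** 2).sum())
    return a, b, r2, b / se_b

# Hypothetical scores, not the study's data.
x = np.array([1, 2, 3, 4, 5], float)
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1], float)
a, b, r2, t = simple_regression(x, y)
print(round(b, 3), round(r2, 3))  # → 0.99 0.989
```

    Comparing the returned t-statistic against the critical t-table value (or its p-value against 0.05) gives the same accept/reject decision reported in the abstract.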

    Approximate Dynamic Programming for Constrained Piecewise Affine Systems with Stability and Safety Guarantees

    Infinite-horizon optimal control of constrained piecewise affine (PWA) systems has been approximately addressed by hybrid model predictive control (MPC), which, however, has computational limitations in both offline design and online implementation. In this paper, we consider an alternative approach based on approximate dynamic programming (ADP), an important class of methods in reinforcement learning. We accommodate non-convex union-of-polyhedra state constraints and linear input constraints into ADP by designing PWA penalty functions. PWA function approximation is used, which allows for a mixed-integer encoding to implement ADP. The main advantage of the proposed ADP method is its online computational efficiency. In particular, we propose two control policies, which lead to solving either a smaller-scale mixed-integer linear program than conventional hybrid MPC or a single convex quadratic program, depending on whether the policy is determined implicitly online or computed explicitly offline. We characterize the stability and safety properties of the closed-loop systems, as well as the sub-optimality of the proposed policies, by quantifying the approximation errors of value functions and policies. We also develop an offline mixed-integer linear programming-based method to certify the reliability of the proposed method. Simulation results on an inverted pendulum with elastic walls and on an adaptive cruise control problem validate the control performance in terms of constraint satisfaction and CPU time.
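    The ADP control step above can be sketched in a toy setting: with a PWA value-function approximation V(x) = max_i (c_i·x + d_i) and PWA dynamics, the greedy input minimizes stage cost plus V at the successor state. The one-dimensional system, costs, and pieces below are invented for illustration, and a coarse input grid stands in for the paper's mixed-integer linear program.

```python
import numpy as np

# One-dimensional PWA dynamics: two affine modes split at x = 0.
def step(x, u):
    return 0.9 * x + u if x >= 0 else 0.5 * x + u

def value(x, pieces):
    """Max-of-affine PWA value approximation V(x) = max_i (c_i*x + d_i)."""
    return max(c * x + d for c, d in pieces)

def greedy_input(x, pieces, u_grid):
    """Greedy ADP policy: enumerate candidate inputs (a crude stand-in
    for the paper's mixed-integer encoding of the same minimization)."""
    costs = [abs(x) + 0.1 * abs(u) + value(step(x, u), pieces)
             for u in u_grid]
    return float(u_grid[int(np.argmin(costs))])

pieces = [(1.0, 0.0), (-1.0, 0.0)]   # V(x) = |x|
u_grid = np.linspace(-1.0, 1.0, 41)  # input constraint |u| <= 1
u = greedy_input(2.0, pieces, u_grid)
print(u)  # → -1.0 (push the state toward the origin, saturating u)
```

    The mixed-integer encoding in the paper performs this same one-step minimization exactly, by assigning one binary variable per affine piece and per dynamics mode instead of gridding the input.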

    OneNet: Enhancing Time Series Forecasting Models under Concept Drift by Online Ensembling

    Online updating of time series forecasting models aims to address the concept drift problem by efficiently updating forecasting models based on streaming data. Many algorithms are designed for online time series forecasting, with some exploiting cross-variable dependency while others assume independence among variables. Given that every data assumption has its own pros and cons in online time series modeling, we propose the Online ensembling Network (OneNet). It dynamically updates and combines two models, with one focusing on modeling dependency across the time dimension and the other on cross-variate dependency. Our method incorporates a reinforcement learning-based approach into the traditional online convex programming framework, allowing the two models to be linearly combined with dynamically adjusted weights. OneNet addresses the main shortcoming of classical online learning methods, which tend to be slow in adapting to concept drift. Empirical results show that OneNet reduces online forecasting error by more than 50% compared to the State-Of-The-Art (SOTA) method. The code is available at https://github.com/yfzhang114/OneNet. Comment: 32 pages, 11 figures, 37th Conference on Neural Information Processing Systems (NeurIPS 2023).
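    The ensembling idea can be illustrated with a classical baseline: re-weight two forecasters online as their streaming errors evolve. OneNet learns the combination weights with a reinforcement-learning component; the simpler exponentiated-gradient (Hedge-style) update below is not OneNet's method, and its full-memory weights illustrate exactly the slow adaptation to drift that OneNet targets.

```python
import numpy as np

def ensemble_forecast(preds_a, preds_b, targets, eta=2.0):
    """Combine two forecast streams with multiplicative-weights updates."""
    w = np.array([0.5, 0.5])
    combined = []
    for pa, pb, y in zip(preds_a, preds_b, targets):
        combined.append(w[0] * pa + w[1] * pb)
        losses = np.array([(pa - y) ** 2, (pb - y) ** 2])
        w = w * np.exp(-eta * losses)        # penalize the worse model
        w = w / w.sum()                      # keep weights on the simplex
    return np.array(combined), w

# Model A is accurate before a concept drift at t = 30; model B after.
t = np.arange(100)
y = np.where(t < 30, 1.0, 3.0)
pred_a = np.full(100, 1.0)                   # good before the drift
pred_b = np.full(100, 3.0)                   # good after the drift
yhat, w = ensemble_forecast(pred_a, pred_b, y)
print(w.round(3))                            # weight shifts toward model B
```

    Note that the full-memory update only overturns the pre-drift evidence after many post-drift steps; OneNet's learned weighting is designed to make this switch much faster.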