197 research outputs found

    Data-Driven Integral Reinforcement Learning for Continuous-Time Non-Zero-Sum Games

    This paper develops an integral value iteration (VI) method to efficiently find, online, the Nash equilibrium solution of two-player non-zero-sum (NZS) differential games for linear systems with partially unknown dynamics. To guarantee closed-loop stability about the Nash equilibrium, an explicit upper bound for the discount factor is given. To show the efficacy of the presented online model-free solution, the integral VI method is compared with the model-based off-line policy iteration method. Moreover, the integral VI algorithm is analyzed theoretically in detail with respect to three aspects: the positive definiteness of the updated cost functions, the stability of the closed-loop systems, and the conditions that guarantee monotone convergence. Finally, simulation results demonstrate the efficacy of the presented algorithms.

    Adaptive traffic signal control using approximate dynamic programming

    This thesis presents a study on an adaptive traffic signal controller for real-time operation. An approximate dynamic programming (ADP) algorithm is developed for controlling traffic signals at isolated intersections and in distributed traffic networks. This approach is derived from the premise that classic dynamic programming is computationally difficult to solve, and approximation is the second-best option for establishing sequential decision-making for complex processes. The proposed ADP algorithm substantially reduces the computational burden by using a linear approximation function to replace the exact value function of the dynamic programming solution. Machine-learning techniques are used to improve the approximation progressively. Not knowing the ideal response for the approximation to learn from, we use the paradigm of unsupervised learning, and reinforcement learning in particular. Temporal-difference learning and perturbation learning are investigated as appropriate candidates in the family of unsupervised learning. We find in computer simulation that the proposed method achieves a substantial reduction in vehicle delays in comparison with optimised fixed-time plans, and is competitive against other adaptive methods in computational efficiency and effectiveness in managing varying traffic. Our results show that substantial benefits can be gained by increasing the frequency at which the signal plans are revised. The proposed ADP algorithm is compatible with discrete-time systems at resolutions from 0.5 to 5 seconds per time step. This study demonstrates the readiness of the proposed approach for real-time operation at isolated intersections and its potential for distributed network control.
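The combination of a linear value-function approximation with temporal-difference learning described in this abstract can be illustrated with a minimal sketch. This is not the thesis's algorithm: the toy queue model (one vehicle discharged per step, a cost of one time step while the queue is non-empty), the feature vector `[1, q / max_queue]`, and every name and parameter below are assumptions made for illustration only.

```python
import random


def features(queue, max_queue):
    """Hypothetical feature vector: a bias term and the normalised queue length."""
    return [1.0, queue / max_queue]


def value(weights, queue, max_queue):
    """Linear approximation of the cost-to-go; an empty queue is terminal (value 0)."""
    if queue == 0:
        return 0.0
    return sum(w * x for w, x in zip(weights, features(queue, max_queue)))


def td0_time_to_clear(max_queue=5, episodes=2000, alpha=0.1, seed=0):
    """TD(0) on a toy queue that discharges one vehicle per step.

    Each step costs 1 (one time step of delay), so the true cost-to-go
    of a queue of length q is q, which the linear form can represent exactly.
    """
    rng = random.Random(seed)
    weights = [0.0, 0.0]
    for _ in range(episodes):
        queue = rng.randint(1, max_queue)  # random initial queue length
        while queue > 0:
            next_queue = queue - 1
            # Temporal-difference error: step cost + next estimate - current estimate.
            delta = 1.0 + value(weights, next_queue, max_queue) - value(weights, queue, max_queue)
            phi = features(queue, max_queue)
            weights = [w + alpha * delta * x for w, x in zip(weights, phi)]
            queue = next_queue
    return weights


if __name__ == "__main__":
    w = td0_time_to_clear()
    print("learned weights:", w)
    print("estimated steps to clear a queue of 3:", value(w, 3, 5))
```

In this toy setting the learned weights should approach `[0, max_queue]`, so that the estimated cost-to-go equals the queue length. A real signal controller would need a richer state, stochastic arrivals, and a decision step, none of which are modelled here.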

    Fuzzy EOQ Model with Trapezoidal and Triangular Functions Using Partial Backorder

    The fuzzy EOQ model is an EOQ model that can estimate costs from existing information. Trapezoidal fuzzy functions can be used to estimate these costs, and a trapezoidal membership function has several points that carry a membership value. The fuzzy TRC value (TR̃C) obtained from the trapezoidal model is higher than the usual TRC value of the EOQ model. This paper aims to determine the optimal amount of inventory in the company, namely the optimal Q and the optimal V; using the partial backorder model, the optimal Q and V give the optimal number of units each time an order is placed. Inventory is closely affected by the EOQ model when the fuzzy EOQ model with triangular and trapezoidal membership functions and partial backorder is used. The optimal Q and V values of the fuzzy models increase because the trapezoidal and triangular membership functions take different values depending on the requirements of each membership function. Therefore, a fuzzy model can solve the company's problems in estimating costs for the next term.
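The triangular and trapezoidal membership functions named in this abstract have a standard piecewise-linear form, sketched below. The function names, the parameter names `a`, `b`, `c`, `d`, and the cost figures in the usage lines are hypothetical and are not taken from the paper.

```python
def triangular(x, a, b, c):
    """Triangular membership: 0 outside (a, c), rising to 1 at the peak b.

    Assumes a < b < c.
    """
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def trapezoidal(x, a, b, c, d):
    """Trapezoidal membership: 0 outside (a, d), equal to 1 on the plateau [b, c].

    Assumes a < b <= c < d.
    """
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


if __name__ == "__main__":
    # Hypothetical fuzzy holding cost "about 100" and "between 90 and 110".
    print(triangular(90, 80, 100, 120))
    print(trapezoidal(95, 80, 90, 110, 120))
```

The trapezoid assigns full membership to a whole interval rather than to a single point, which is one way a cost estimate can be made less sensitive to the exact crisp value.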

    ACADEMIC HANDBOOK (UNDERGRADUATE) COLLEGE OF ENGINEERING (CoE)

    Construction of a zero-coupon yield curve for the Nairobi Securities Exchange and its application in pricing derivatives

    Thesis submitted in partial fulfillment of the requirements for the degree of PhD in Financial Mathematics at Strathmore University. Yield curves are used to forecast interest rates for different products when their risk parameters are known, to calibrate no-arbitrage term structure models, and (mostly by investors) to detect whether there is an arbitrage opportunity. Using yield curve information, investors have the opportunity to immunize/hedge their investment portfolios against financial risks if they have to make an investment with some determined time to maturity. Private sector firms look at yields of different maturities and then choose their borrowing strategy. The differences in yields between long and short maturities are an important indicator for the central bank to use in the monetary policy process. These differences may show the tightness of the government's monetary policy and can be monitored to predict recession in coming years. A lot of research has been done in yield curve modeling and, as we will see later in the thesis, most of the models developed had one major shortcoming: non-differentiability at the interpolating knot points. The aim of this thesis is to construct a zero-coupon yield curve for the Nairobi Securities Exchange, and to use the risk-free rates to price derivatives, with particular attention given to pricing coffee futures. This study looks into three methods of constructing yield curves: spline-based models, interpolation, and parametric models. We suggest an improvement in the interpolation methods used in the most celebrated spline-based model, monotonicity-preserving interpolation on r(t). We also use the operator form of numerical differentiation to estimate the forward rates at the knot points, where the spot curve is non-differentiable. In derivative pricing, dynamical processes (Itô processes) are reviewed, and geometric Brownian motion is included, together with its properties and applications.
    Conventional techniques used in the estimation of the drift and volatility parameters, such as historical techniques, are reviewed and discussed. We also use the Hough Transform, an artificial intelligence method, to detect market patterns and estimate the drift and volatility parameters simultaneously. We look at different ways of calculating derivative prices. For option pricing, we use different methods but apply Bellalah's models in the calculation of the coffee futures prices because they incorporate an incomplete information parameter.
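The historical technique for estimating the drift and volatility of a geometric Brownian motion can be sketched as follows. This is the textbook log-return estimator, not code from the thesis: the function name, the `dt` parameter, and the price series in the usage lines are assumptions made for illustration.

```python
import math
import statistics


def estimate_gbm_parameters(prices, dt=1.0):
    """Estimate GBM drift (mu) and volatility (sigma) from a price history.

    Under GBM, log returns over a step dt are normally distributed with
    mean (mu - sigma**2 / 2) * dt and variance sigma**2 * dt.
    `prices` must contain at least three positive observations.
    """
    log_returns = [math.log(later / earlier) for earlier, later in zip(prices, prices[1:])]
    mean_return = statistics.mean(log_returns)
    sigma = statistics.stdev(log_returns) / math.sqrt(dt)  # sample standard deviation
    mu = mean_return / dt + 0.5 * sigma ** 2
    return mu, sigma


if __name__ == "__main__":
    # Hypothetical daily closing prices; dt = 1/252 annualises the estimates.
    closes = [100.0, 101.2, 100.7, 102.3, 103.0, 102.1]
    mu, sigma = estimate_gbm_parameters(closes, dt=1.0 / 252.0)
    print("annualised drift:", mu)
    print("annualised volatility:", sigma)
```

Drift estimates obtained this way are typically far noisier than volatility estimates for short histories, which is one motivation for the alternative estimation methods the abstract mentions.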