5 research outputs found

    Reinforcement Learning through Supervision for Autonomous Agents

    Reinforcement Learning (RL) is a class of model-free learning control methods that can solve Markov Decision Process (MDP) problems. However, one difficulty in applying RL control is its slow convergence, especially in MDPs with continuous state spaces. In this paper, a modified RL structure is proposed to accelerate reinforcement learning control. This approach combines a supervision technique with the standard Q-learning algorithm. A priori information is provided to the RL agent either by directly integrating a human operator's commands (i.e., human advice) or by an optimal LQ controller, indicating preferred actions in particular situations. It is shown that the supervised RL agent converges much faster than the conventional Q-learning algorithm. Simulation results on the cart-pole balancing problem and on learning navigation tasks in an unknown grid world with obstacles illustrate the efficiency of the proposed method.
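The idea of mixing supervisor advice into Q-learning can be sketched as follows. This is a minimal illustration, not the paper's implementation: the 1-D corridor environment, the `advice` rule (always move right), the `advice_prob` parameter, and all hyperparameter values are assumptions made for the example.

```python
import random

def supervised_q_learning(n_states=6, episodes=200, advice_prob=0.5,
                          alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a 1-D corridor whose goal is the right-most state.

    With probability `advice_prob` the agent follows the supervisor's
    preferred action instead of its own epsilon-greedy choice.
    """
    rng = random.Random(seed)
    moves = (-1, +1)                      # action 0 = left, action 1 = right
    q = [[0.0, 0.0] for _ in range(n_states)]

    def advice(state):
        # Hypothetical supervisor: always recommends moving right.
        return 1

    for _ in range(episodes):
        s = 0
        for _ in range(1000):             # step cap per episode
            if rng.random() < advice_prob:
                a = advice(s)             # supervised action
            elif rng.random() < epsilon:
                a = rng.randrange(2)      # exploration
            else:
                a = max((0, 1), key=lambda i: q[s][i])  # greedy action
            s2 = min(max(s + moves[a], 0), n_states - 1)
            done = s2 == n_states - 1
            reward = 1.0 if done else 0.0
            target = reward if done else reward + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])   # standard Q-learning update
            s = s2
            if done:
                break
    return q

q_table = supervised_q_learning()
greedy_policy = [max((0, 1), key=lambda i: row[i]) for row in q_table[:-1]]
```

The update rule is unchanged from ordinary Q-learning; only the action selection is biased by the supervisor, so setting `advice_prob=0` recovers the unsupervised agent for comparison.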

    Phenotypic characterization and growth performance of sheep populations in Northeastern Algeria

    The present study aims to characterize the sheep population of Northeastern Algeria phenotypically, to determine the growth dynamics and performance of pre-weaned lambs in this area, and to assess the effect of altitude on growth performance. For this purpose, three experiments were performed. The first experiment was carried out on eight farms, in which a total of 160 ewes were subjected to 4 quantitative body measurements (body length, withers height, tail length and wool weight) and scored for 9 qualitative traits (head length, ear orientation, horn presence, neck length, wool extent, tail texture, eye shape, head color and wool color). The results indicate that the sheep population in these areas belongs to the Ouled Djellal breed, with some atypical phenotypes that point to uncontrolled crossbreeding. Body measurements show moderate coefficients of variation. Clear correlations were revealed between withers height and tail length (0.30) and between body length and wool weight (0.26). Hierarchical agglomerative clustering sub-divided the sheep population into three classes. The second experiment was conducted on two institutional farms, ITDAS and ITELV, in which 50 adult animals were subjected to 15 body measurements: withers height, back height, rump height, chest depth, head length, head width, ear length, ear width, chest width, rump width, body length, heart girth, cannon circumference, tail length and tail circumference. The results reveal that the studied Ouled Djellal sheep form a morphologically homogeneous population, with males superior in almost all body measurements. The correlation coefficients show that body weight can be estimated from heart girth alone or combined with other body measurements such as tail circumference, withers height, head length and cannon circumference. The third experiment evaluated the growth dynamics of pre-weaned Ouled Djellal lambs and the effect of altitude on their different growth phases. Seven sites were chosen for this study. A total of 49 lambs were weighed, and their average daily gain was calculated at different ages (birth, D 30, 60, 90 and 120). The results show that maximum growth occurs during the lamb's first month of life, at 200 g/day. Highly positive and significant correlations were recorded between D 90 and D 120 (0.94), D 60 and D 120 (0.88), D 60 and D 90 (0.87) and D 30 and D 90 (0.77). Concerning the altitude effect, lambs born in low-altitude regions show better growth performance than lambs born in high-altitude regions, which confirms that the Ouled Djellal breed is a typical breed of the steppe and the high plains.
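For reference, average daily gain between two weighings is the weight difference divided by the number of days elapsed. The short sketch below assumes weights recorded in kilograms on the weighing days listed in the abstract; the function name and the weight values are made up for illustration and are not data from the study.

```python
def average_daily_gain(weights_kg, days):
    """Return the average daily gain in g/day for each consecutive pair of weighings."""
    gains = []
    for i in range(1, len(days)):
        delta_g = 1000.0 * (weights_kg[i] - weights_kg[i - 1])  # kg -> g
        gains.append(delta_g / (days[i] - days[i - 1]))
    return gains

weighing_days = [0, 30, 60, 90, 120]              # birth, D 30, D 60, D 90, D 120
example_weights = [4.0, 9.4, 14.2, 18.4, 22.0]    # hypothetical weights in kg
adg = average_daily_gain(example_weights, weighing_days)
```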

    Modélisation et Analyse des comportements d’une société d’agents mobiles à l’aide des réseaux de Petri

    This thesis presents the design and the theoretical and practical validation of search techniques carried out by a society of robots pursuing evaders in various environments. The environment was modeled topologically as graphs of connected areas. The results of the proposed techniques, and the optimization of the number of mobile robots needed for a successful clearing operation, were validated using Petri nets. For the problem of robot navigation between areas, techniques based on potential field propagation were proposed to generate trajectories that are both safe (distant from obstacles) and fast in reaching the target. The proposed techniques were validated in simulation on the MobotSim software platform, in both designed and real environments, and in practice on a hardware platform composed of two POB-Bot robots and adapted environments. Keywords: robot society, pursuit-evasion, mobile robot, navigation, Petri net, graph theory.
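One common form of potential propagation on a grid map is a breadth-first wavefront expanded from the target, followed by descent along decreasing potential. The sketch below shows that simple variant only; it is not the thesis's method, it omits the obstacle-distance term that would make paths safer, and the grid, the function names, and the 4-connected neighbourhood are assumptions.

```python
from collections import deque

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # 4-connected moves

def propagate_potential(grid, goal):
    """Wavefront potential: number of steps to `goal`; None for obstacles or unreachable cells.

    `grid` holds 0 for free cells and 1 for obstacles; `goal` must be a free cell.
    """
    rows, cols = len(grid), len(grid[0])
    potential = [[None] * cols for _ in range(rows)]
    potential[goal[0]][goal[1]] = 0
    queue = deque([goal])
    while queue:
        r, c = queue.popleft()
        for dr, dc in NEIGHBOURS:
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and potential[nr][nc] is None):
                potential[nr][nc] = potential[r][c] + 1
                queue.append((nr, nc))
    return potential

def descend(potential, start):
    """Follow strictly decreasing potential from `start`; returns [] if the goal is unreachable."""
    if potential[start[0]][start[1]] is None:
        return []
    rows, cols = len(potential), len(potential[0])
    r, c = start
    path = [start]
    while potential[r][c] > 0:
        for dr, dc in NEIGHBOURS:
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and potential[nr][nc] == potential[r][c] - 1):
                r, c = nr, nc
                path.append((r, c))
                break
    return path

# Hypothetical map: 0 = free, 1 = obstacle.
example_grid = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
field = propagate_potential(example_grid, goal=(2, 0))
route = descend(field, start=(0, 0))
```

Because the wavefront assigns each reachable cell its step distance to the target, every such cell has a neighbour with a potential one lower, so the descent cannot get stuck in a local minimum.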

    Modelling Microcystis Cell Density in a Mediterranean Shallow Lake of Northeast Algeria (Oubeira Lake), Using Evolutionary and Classic Programming

    Caused by excess levels of nutrients and increased temperatures, freshwater cyanobacterial blooms have become a serious global issue. However, with the development of artificial intelligence and extreme learning machine methods, the forecasting of cyanobacterial blooms has become more feasible. We explored the use of multiple techniques, both statistical [Multiple Regression Model (MLR) and Support Vector Machine (SVM)] and evolutionary [Particle Swarm Optimization (PSO), Genetic Algorithm (GA), and Bird Swarm Algorithm (BSA)], to approximate models for the prediction of Microcystis density. The data set was collected from Oubeira Lake, a natural shallow Mediterranean lake in the northeast of Algeria. From the correlation analysis of the ten water variables monitored, six potential factors, including temperature, ammonium, nitrate, and ortho-phosphate, were selected. The performance indices showed that MLR and PSO provided the best results. PSO gave the best fitness, but all techniques performed well. BSA had better fitness but was very slow across generations. PSO was faster than the other techniques and passed BSA at generation 20. GA passed BSA a little later, at generation 50. The major contributions of our work not only focus on the modelling process itself, but also take into consideration the main factors affecting Microcystis blooms, by incorporating them in all applied models.
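As a rough illustration of how PSO can fit model coefficients, the sketch below minimizes the mean squared error of a one-predictor linear model. It is a generic textbook PSO, not the authors' configuration: the synthetic data, the single predictor, the search bounds, and the values of `w`, `c1`, and `c2` are all assumptions.

```python
import random

def pso_minimize(loss, dim, n_particles=30, generations=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best particle swarm optimization; returns (best_position, best_loss)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]                 # personal bests
    best_val = [loss(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]     # global best
    for _ in range(generations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = loss(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val

# Synthetic data generated from y = 2x + 1 (illustrative only, not lake measurements).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]

def mse(params):
    slope, intercept = params
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

fitted, fitted_loss = pso_minimize(mse, dim=2)
```

A multi-factor model would only change `dim` and the `mse` function; the swarm update itself stays the same.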