7,994 research outputs found

    Learning to automatically detect features for mobile robots using second-order Hidden Markov Models

    In this paper, we propose a new method based on Hidden Markov Models to interpret temporal sequences of sensor data from mobile robots and automatically detect features. Hidden Markov Models have long been used in pattern recognition, especially in speech recognition. Their main advantage over other methods (such as neural networks) is their ability to model noisy temporal signals of variable length. We show in this paper that this approach is well suited to the interpretation of temporal sequences of mobile-robot sensor data. We present two distinct experiments and their results: the first in an indoor environment, where a mobile robot learns to detect features such as open doors or T-intersections; the second in an outdoor environment, where a different mobile robot has to identify situations such as climbing a hill or crossing a rock.
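
    As a rough illustration of the underlying technique (not the authors' implementation), sensor sequences can be classified by training one HMM per feature class and choosing the class whose model assigns the highest likelihood; the paper's second-order dependencies can be emulated by a first-order model over state pairs, but the sketch below keeps a plain first-order forward pass for brevity. All model parameters, class names, and the toy observation sequence are assumptions.

```python
import numpy as np

def log_forward(obs, log_pi, log_A, log_B):
    """Log-likelihood of an observation sequence under one discrete HMM.

    obs    : 1-D array of observation symbol indices
    log_pi : (n_states,)            log initial-state probabilities
    log_A  : (n_states, n_states)   log transition probabilities
    log_B  : (n_states, n_symbols)  log emission probabilities
    """
    alpha = log_pi + log_B[:, obs[0]]                      # initialise
    for o in obs[1:]:
        # log-sum-exp over previous states, then emit the current symbol
        alpha = np.logaddexp.reduce(alpha[:, None] + log_A, axis=0) + log_B[:, o]
    return np.logaddexp.reduce(alpha)

def classify(obs, models):
    """Pick the feature class whose HMM gives the highest likelihood."""
    return max(models, key=lambda name: log_forward(obs, *models[name]))

# Toy models: 2 hidden states, 3 quantised sensor symbols per class (assumed).
rng = np.random.default_rng(0)
def random_hmm():
    pi = rng.dirichlet(np.ones(2))
    A = rng.dirichlet(np.ones(2), size=2)
    B = rng.dirichlet(np.ones(3), size=2)
    return np.log(pi), np.log(A), np.log(B)

models = {"open_door": random_hmm(), "t_intersection": random_hmm()}
sequence = np.array([0, 1, 1, 2, 0, 1])   # quantised range-sensor readings
print(classify(sequence, models))
```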

    Learning Dimensions: Lessons from Field Studies

    In this paper, we describe work to investigate the creation of engaging programming learning experiences. Background research informed the design of four fieldwork studies, involving a range of age groups, that explore how programming tasks can best be framed to motivate learners. Our empirical findings from these four studies, described here, contributed to the design of a set of programming "Learning Dimensions" (LDs). The LDs provide educators with insights to support key design decisions in the creation of engaging programming learning experiences. This paper describes the background to the identification of these LDs and how they can inform the design and delivery of highly engaging programming learning tasks. A web application has been authored to support educators in applying the LDs to their lesson design.

    Reinforcement Learning for Racecar Control

    This thesis investigates the use of reinforcement learning to learn to drive a racecar in the simulated environment of the Robot Automobile Racing Simulator. Real-life race driving is known to be difficult for humans, and expert human drivers use complex sequences of actions. There are a large number of variables, some of which change stochastically and all of which may affect the outcome. This makes driving a promising domain for testing and developing Machine Learning techniques that have the potential to be robust enough to work in the real world, so the principles of the algorithms from this work may be applicable to a range of problems. The investigation starts by finding a suitable data structure to represent the information learnt. This is tested using supervised learning. Reinforcement learning is added and roughly tuned, and the supervised learning is then removed. A simple tabular representation is found to be satisfactory; this avoids the difficulties of more complex methods and allows the investigation to concentrate on the essentials of learning. Various reward sources are tested, and a combination of three is found to produce the best performance. Exploration of the problem space is investigated. Results show that exploration is essential, but that controlling how much exploration is done is also important. The learning episodes turn out to need to be very long, and because of this the task is treated as continuous, using discounting to limit the size of the stored values. Eligibility traces are used successfully to make the learning more efficient. The tabular representation is made more compact by hashing and more accurate by using smaller buckets; this slows the learning but produces better driving. The improvement given by a rough form of generalisation indicates that replacing the tabular method with a function approximator is warranted. These results show that reinforcement learning can work within the Robot Automobile Racing Simulator, and they lay the foundations for building a more efficient and competitive agent.
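
    A minimal sketch of the kind of learner the abstract describes: tabular Q-learning with discounting, epsilon-greedy exploration, a hashed (dictionary-backed) value table, and replacing eligibility traces. This is not the thesis code; the hypothetical `env` wrapper, its step interface, and the hyperparameters are assumptions.

```python
import numpy as np
from collections import defaultdict

def q_lambda(env, n_actions, episodes=100, alpha=0.1, gamma=0.98,
             lam=0.9, epsilon=0.1):
    """Tabular Q-learning with replacing eligibility traces (illustrative only).

    `env` is a hypothetical simulator wrapper exposing reset() -> state and
    step(action) -> (next_state, reward, done); states are assumed to already
    be discretised into hashable buckets.
    """
    Q = defaultdict(lambda: np.zeros(n_actions))        # hashed tabular values
    for _ in range(episodes):
        traces = defaultdict(lambda: np.zeros(n_actions))
        state, done = env.reset(), False
        while not done:
            # epsilon-greedy exploration: essential, but kept limited
            if np.random.rand() < epsilon:
                action = np.random.randint(n_actions)
            else:
                action = int(np.argmax(Q[state]))
            next_state, reward, done = env.step(action)
            # one-step TD error with discounting (task treated as continuing)
            target = reward + (0.0 if done else gamma * np.max(Q[next_state]))
            delta = target - Q[state][action]
            traces[state][action] = 1.0                 # replacing trace
            # propagate the error backwards along the eligibility traces
            for s in list(traces):
                Q[s] += alpha * delta * traces[s]
                traces[s] *= gamma * lam
            state = next_state
    return Q
```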

    Mobile Platform with Dynamic Optimization of the Pattern in Education in Colleges Through the Perspective of Network Informatization

    The combination of mobile learning platforms and network informatization offers numerous benefits to learners, educators, and institutions. Learners can take control of their learning journey, accessing educational materials at their convenience and engaging in collaborative learning activities with peers from diverse backgrounds. This paper aims to explore the integration of mobile learning platforms and network informatization, examining their impact on educational practices, learner engagement, and the overall learning experience. Network informatization is assessed and monitored with Dynamic Programming Optimization (DPO) to compute the features in reverse osmosis in English education. The attributes and features of the English language are computed and estimated for periodic information updates within the system. The DPO process is implemented together with a Mamdani fuzzy set to estimate features of English education in colleges and universities. The processed information is updated in the mobile learning platform for the computation of features in the English language, and classification is performed with a deep learning model. Simulation analysis shows that the constructed model is effective for the estimation and computation of features and patterns in English language teaching in colleges and universities.
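
    The abstract names a Mamdani fuzzy set used alongside the DPO step but gives no formulation, so the following is only a generic illustration of Mamdani min-max inference with centroid defuzzification; the membership functions, rule base, and input names are invented for the example and are not taken from the paper.

```python
import numpy as np

def trimf(x, a, b, c):
    """Triangular membership function defined by the points a <= b <= c."""
    return np.maximum(np.minimum((x - a) / (b - a + 1e-9),
                                 (c - x) / (c - b + 1e-9)), 0.0)

def mamdani(engagement, accuracy):
    """Two inputs in [0, 1] -> one crisp 'feature relevance' score in [0, 1]."""
    out = np.linspace(0.0, 1.0, 101)          # discretised output universe

    # Rule 1: IF engagement is high AND accuracy is high THEN relevance is high
    w1 = min(trimf(engagement, 0.5, 1.0, 1.5), trimf(accuracy, 0.5, 1.0, 1.5))
    # Rule 2: IF engagement is low OR accuracy is low THEN relevance is low
    w2 = max(trimf(engagement, -0.5, 0.0, 0.5), trimf(accuracy, -0.5, 0.0, 0.5))

    # Mamdani min implication, max aggregation, centroid defuzzification
    agg = np.maximum(np.minimum(w1, trimf(out, 0.5, 1.0, 1.5)),
                     np.minimum(w2, trimf(out, -0.5, 0.0, 0.5)))
    return float((agg * out).sum() / (agg.sum() + 1e-9))

print(mamdani(engagement=0.8, accuracy=0.7))
```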

    Exploring the Limitations of Behavior Cloning for Autonomous Driving

    Driving requires reacting to a wide variety of complex environmental conditions and agent behaviors. Explicitly modeling each possible scenario is unrealistic. In contrast, imitation learning can, in theory, leverage data from large fleets of human-driven cars. Behavior cloning in particular has been successfully used to learn simple visuomotor policies end-to-end, but scaling to the full spectrum of driving behaviors remains an unsolved problem. In this paper, we propose a new benchmark to experimentally investigate the scalability and limitations of behavior cloning. We show that behavior cloning leads to state-of-the-art results, including in unseen environments, executing complex lateral and longitudinal maneuvers without these reactions being explicitly programmed. However, we confirm well-known limitations (due to dataset bias and overfitting), find new generalization issues (due to dynamic objects and the lack of a causal model), and observe training instability, all of which require further research before behavior cloning can graduate to real-world driving. The code of the studied behavior cloning approaches can be found at https://github.com/felipecode/coiltraine
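
    At its core, behavior cloning is plain supervised learning on logged (observation, expert action) pairs. The sketch below is a generic PyTorch illustration of that idea, not the CoILTraine code at the linked repository; the network shapes, loss, and data loader are assumptions.

```python
import torch
import torch.nn as nn

class VisuomotorPolicy(nn.Module):
    """Small convolutional policy: camera frame -> [steer, throttle, brake]."""
    def __init__(self, n_actions=3):
        super().__init__()
        self.encoder = nn.Sequential(            # e.g. 3x88x200 frame -> features
            nn.Conv2d(3, 32, 5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, 3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Sequential(nn.Linear(64, 128), nn.ReLU(),
                                  nn.Linear(128, n_actions))

    def forward(self, image):
        return self.head(self.encoder(image))

def train(policy, loader, epochs=10, lr=1e-4):
    """Regress the policy onto the human driver's recorded actions."""
    opt = torch.optim.Adam(policy.parameters(), lr=lr)
    for _ in range(epochs):
        for image, expert_action in loader:      # pairs logged from human driving
            loss = nn.functional.l1_loss(policy(image), expert_action)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return policy
```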

    Exploring Deep Reinforcement Learning Techniques for Autonomous Navigation

    This paper contains research into efficient autonomous navigation algorithms powered by deep reinforcement learning. These algorithms enable a mobile robot to perform waypoint tracking in an indoor environment. The robot does not have a map of the environment, nor does it know the locations of the waypoints. A reward function is used to encourage behaviors that lead the robot closer to the goal. This is an active area of research, encouraged by recent advances in neural networks applied to sequential decision making. The reinforcement learning algorithms use LiDAR and IMU sensors to navigate the unknown environment, estimating the robot’s current state and deciding what its next action should be. At each step, the action that is most likely to yield the maximum reward is sent to the robot in order to follow the sequential targets along the path to the final goal location. I use a low-fidelity custom simulator based on Dubins paths along with a high-fidelity 3D simulator, Gazebo, to train various policies. The Dubins simulator is written in Python and executes very quickly, while Gazebo requires more resources but is far more advanced. After training is complete, ROS is used to deploy the RL policy onto the physical robot and convert the action commands into linear and angular velocities that can be understood by the robot’s hardware and motors. The TurtleBot3 Burger is the robot used for evaluation in the real world. Often, there is a severe drop in performance between the simulator and the real world, so this is also monitored and factored into the evaluation. Dense and sparse reward functions are explored in order to mimic real-world scenarios where the reward is not known at every step. Finally, Deep Q-Learning, Trust Region Policy Optimization, and a new RL algorithm called Learning Online with Guidance Offline are implemented and tested throughout the course of the research.
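
    A small illustration of the dense versus sparse reward shaping the abstract compares for waypoint tracking. This is not the thesis's actual reward function; the thresholds, weights, and the use of the minimum LiDAR range as a collision proxy are assumptions.

```python
import numpy as np

GOAL_RADIUS = 0.2      # metres within which a waypoint counts as reached (assumed)
COLLISION_DIST = 0.15  # minimum LiDAR range treated as a crash (assumed)

def sparse_reward(position, waypoint, min_lidar_range):
    """Reward only at significant events: reaching the waypoint or crashing."""
    if min_lidar_range < COLLISION_DIST:
        return -10.0
    if np.linalg.norm(waypoint - position) < GOAL_RADIUS:
        return +10.0
    return 0.0

def dense_reward(position, prev_position, waypoint, min_lidar_range):
    """Shaped reward: progress toward the waypoint at every step."""
    if min_lidar_range < COLLISION_DIST:
        return -10.0
    progress = (np.linalg.norm(waypoint - prev_position)
                - np.linalg.norm(waypoint - position))
    bonus = 10.0 if np.linalg.norm(waypoint - position) < GOAL_RADIUS else 0.0
    return 5.0 * progress + bonus - 0.01          # small per-step time penalty
```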