11,803 research outputs found

    Traffic Light Control Using Deep Policy-Gradient and Value-Function Based Reinforcement Learning

    Recent advances in combining deep neural network architectures with reinforcement learning techniques have shown promising results in solving complex control problems with high-dimensional state and action spaces. Inspired by these successes, in this paper we build two kinds of reinforcement learning agents, deep policy-gradient and value-function based, which predict the best possible traffic signal for a traffic intersection. At each time step, these adaptive traffic light control agents receive a snapshot of the current state of a graphical traffic simulator and produce control signals. The policy-gradient based agent maps its observation directly to the control signal, whereas the value-function based agent first estimates values for all legal control signals and then selects the control action with the highest value. Our methods show promising results in a traffic network simulated in the SUMO traffic simulator, without suffering from instability issues during the training process.
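    The difference between the two agent types described in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the list-based inputs, and the idea that a network has already produced the values or logits are all assumptions made for illustration.

```python
import math
import random


def select_value_based(q_values, legal_actions):
    """Value-function agent: pick the legal action with the highest estimated value.

    q_values[a] is assumed to be the network's value estimate for signal a.
    """
    return max(legal_actions, key=lambda a: q_values[a])


def softmax(logits):
    """Turn raw policy outputs into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_policy_based(logits, rng):
    """Policy-gradient agent: sample a signal directly from the policy output."""
    probs = softmax(logits)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical network outputs for three candidate signal phases.
    print(select_value_based([0.1, 0.9, 0.5], legal_actions=[0, 2]))
    print(select_policy_based([0.2, 1.5, -0.3], rng))
```

    The value-based selection is deterministic given the estimates, while the policy-based selection samples from a distribution, which is one common way such agents explore during training.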

    Thirty Years of Machine Learning: The Road to Pareto-Optimal Wireless Networks

    Future wireless networks have substantial potential to support a broad range of complex, compelling applications in both military and civilian fields, where users are able to enjoy high-rate, low-latency, low-cost and reliable information services. Achieving this ambitious goal requires new radio techniques for adaptive learning and intelligent decision making because of the complex heterogeneous nature of the network structures and wireless services. Machine learning (ML) algorithms have had great success in supporting big data analytics, efficient parameter estimation and interactive decision making. Hence, in this article, we review the thirty-year history of ML by elaborating on supervised learning, unsupervised learning, reinforcement learning and deep learning. Furthermore, we investigate their employment in the compelling applications of wireless networks, including heterogeneous networks (HetNets), cognitive radios (CR), Internet of things (IoT), machine to machine networks (M2M), and so on. This article aims to assist readers in clarifying the motivation and methodology of the various ML algorithms, so as to invoke them for hitherto unexplored services as well as scenarios of future wireless networks. Comment: 46 pages, 22 fig

    A Scalable Model of Cerebellar Adaptive Timing and Sequencing: The Recurrent Slide and Latch (RSL) Model

    From the dawn of modern neural network theory, the mammalian cerebellum has been a favored object of mathematical modeling studies. Early studies focused on the fan-out, convergence, thresholding, and learned weighting of perceptual-motor signals within the cerebellar cortex. This led in the proposals of Albus (1971; 1975) and Marr (1969) to the still viable idea that the granule cell stage in the cerebellar cortex performs a sparse expansive recoding of the time-varying input vector. This recoding reveals and emphasizes combinations (of input state variables) in a distributed representation that serves as a basis for the learned, state-dependent control actions engendered by cerebellar outputs to movement related centers. Although well-grounded as such, this perspective seriously underestimates the intelligence of the cerebellar cortex. Context and state information arises asynchronously due to the heterogeneity of sources that contribute signals to compose the cerebellar input vector. These sources include radically different sensory systems - vision, kinesthesia, touch, balance and audition - as well as many stages of the motor output channel. To make optimal use of available signals, the cerebellum must be able to sift the evolving state representation for the most reliable predictors of the need for control actions, and to use those predictors even if they appear only transiently and well in advance of the optimal time for initiating the control action. Such a cerebellar adaptive timing competence has recently been experimentally verified (Perrett, Ruiz, & Mauk, 1993). This paper proposes a modification to prior population models for cerebellar adaptive timing and sequencing. Since it replaces a population with a single element, the proposed Recurrent Slide and Latch (RSL) model is in one sense maximally efficient, and therefore optimal from the perspective of scalability. Defense Advanced Research Projects Agency and the Office of Naval Research (N00014-92-J-1309, N00014-93-1-1364, N00014-95-1-0409)

    Epidemiological Prediction using Deep Learning

    Department of Mathematical Sciences. Accurate and real-time epidemic disease prediction plays a significant role in the health system and is of great importance for policy making, vaccine distribution and disease control. Since the SIR model of McKendrick and Kermack in the early 1900s, researchers have developed various mathematical models to forecast the spread of disease. Despite these attempts, epidemic prediction has remained an open scientific problem because current models lack flexibility or perform poorly. Owing to the temporal and spatial aspects of epidemiological data, the problem fits into the category of time-series forecasting. To capture both aspects of the data, this paper proposes a combination of recent deep learning models and applies it to ILI (influenza-like illness) data in the United States. Specifically, a graph convolutional network (GCN) model is used to capture the geographical features of the U.S. regions and a gated recurrent unit (GRU) model is used to capture the temporal dynamics of ILI. The results were compared with deep learning models proposed by other researchers, demonstrating that the proposed model outperforms the previous methods.
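    The GCN-plus-GRU combination mentioned in this abstract can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions (one scalar feature per region, scalar weights, the symmetric-normalization form of graph convolution, and one common GRU gating convention); the function and parameter names are hypothetical and are not taken from the work itself.

```python
import math


def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for an adjacency matrix given as nested lists."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def graph_conv(norm_adj, features, weight):
    """One graph-convolution step with ReLU: mix each region's value with its neighbours'."""
    n = len(features)
    return [
        max(0.0, weight * sum(norm_adj[i][j] * features[j] for j in range(n)))
        for i in range(n)
    ]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(x, h, p):
    """One scalar GRU update; p holds hypothetical scalar weights and biases."""
    z = sigmoid(p["wz"] * x + p["uz"] * h + p["bz"])  # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h + p["br"])  # reset gate
    h_tilde = math.tanh(p["wh"] * x + p["uh"] * (r * h) + p["bh"])  # candidate state
    return (1.0 - z) * h + z * h_tilde


if __name__ == "__main__":
    # Two neighbouring regions and three weeks of made-up ILI rates.
    norm_adj = normalized_adjacency([[0, 1], [1, 0]])
    params = {k: 0.1 for k in ("wz", "uz", "bz", "wr", "ur", "br", "wh", "uh", "bh")}
    hidden = [0.0, 0.0]
    for week in ([2.0, 4.0], [2.5, 3.5], [3.0, 3.0]):
        spatial = graph_conv(norm_adj, week, weight=1.0)  # spatial mixing
        hidden = [gru_step(x, h, params) for x, h in zip(spatial, hidden)]  # temporal update
    print(hidden)
```

    Here the graph step shares information between neighbouring regions at each week, and the recurrent step carries each region's state forward in time, which is the division of labour the abstract attributes to the GCN and the GRU.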