Traffic Light Control Using Deep Policy-Gradient and Value-Function Based Reinforcement Learning
Recent advances in combining deep neural network architectures with
reinforcement learning techniques have shown promising results in
solving complex control problems with high-dimensional state and action spaces.
Inspired by these successes, in this paper, we build two kinds of reinforcement
learning algorithms: deep policy-gradient and value-function based agents which
can predict the best possible traffic signal for a traffic intersection. At
each time step, these adaptive traffic light control agents receive a snapshot
of the current state of a graphical traffic simulator and produce control
signals. The policy-gradient based agent maps its observation directly to the
control signal, whereas the value-function based agent first estimates values
for all legal control signals and then selects the control
action with the highest value. Our methods show promising results in a traffic
network simulated in the SUMO traffic simulator, without suffering from
instability issues during the training process.
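The difference between the two agents described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the example numbers, and the assumption that the policy network outputs one logit per signal phase are all hypothetical, and the neural networks themselves are omitted.

```python
import math
import random


def softmax(logits):
    """Turn raw policy outputs into action probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_gradient_action(logits, rng):
    """Policy-style agent: sample a signal index directly from the policy output."""
    probs = softmax(logits)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def value_based_action(q_values, legal_actions):
    """Value-style agent: score every legal signal, then take the highest value."""
    return max(legal_actions, key=lambda a: q_values[a])


# Hypothetical network outputs for an intersection with three signal phases.
rng = random.Random(0)
sampled = policy_gradient_action([0.2, 1.5, -0.3], rng)
greedy = value_based_action([0.1, 0.9, 0.5], legal_actions=[0, 2])
```

In the second call, phase 1 has the highest estimated value but is not legal, so the greedy choice falls to phase 2.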
Thirty Years of Machine Learning: The Road to Pareto-Optimal Wireless Networks
Future wireless networks have substantial potential in terms of supporting
a broad range of complex compelling applications both in military and civilian
fields, where the users are able to enjoy high-rate, low-latency, low-cost and
reliable information services. Achieving this ambitious goal requires new radio
techniques for adaptive learning and intelligent decision making because of the
complex heterogeneous nature of the network structures and wireless services.
Machine learning (ML) algorithms have had great success in supporting big data
analytics, efficient parameter estimation and interactive decision making.
Hence, in this article, we review the thirty-year history of ML by elaborating
on supervised learning, unsupervised learning, reinforcement learning and deep
learning. Furthermore, we investigate their employment in the compelling
applications of wireless networks, including heterogeneous networks (HetNets),
cognitive radios (CR), Internet of things (IoT), machine to machine networks
(M2M), and so on. This article aims to assist readers in clarifying the
motivation and methodology of the various ML algorithms, so as to invoke them
for hitherto unexplored services as well as scenarios of future wireless
networks.
Comment: 46 pages, 22 fig
A Scalable Model of Cerebellar Adaptive Timing and Sequencing: The Recurrent Slide and Latch (RSL) Model
From the dawn of modern neural network theory, the mammalian cerebellum has been a favored object of mathematical modeling studies. Early studies focused on the fan-out, convergence, thresholding, and learned weighting of perceptual-motor signals within the cerebellar cortex. This led in the proposals of Albus (1971; 1975) and Marr (1969) to the still viable idea that the granule cell stage in the cerebellar cortex performs a sparse expansive recoding of the time-varying input vector. This recoding reveals and emphasizes combinations (of input state variables) in a distributed representation that serves as a basis for the learned, state-dependent control actions engendered by cerebellar outputs to movement-related centers. Although well-grounded as such, this perspective seriously underestimates the intelligence of the cerebellar cortex. Context and state information arises asynchronously due to the heterogeneity of sources that contribute signals to compose the cerebellar input vector. These sources include radically different sensory systems - vision, kinesthesia, touch, balance and audition - as well as many stages of the motor output channel. To make optimal use of available signals, the cerebellum must be able to sift the evolving state representation for the most reliable predictors of the need for control actions, and to use those predictors even if they appear only transiently and well in advance of the optimal time for initiating the control action. Such a cerebellar adaptive timing competence has recently been experimentally verified (Perrett, Ruiz, & Mauk, 1993). This paper proposes a modification to prior population models for cerebellar adaptive timing and sequencing.
Since it replaces a population with a single element, the proposed Recurrent Slide and Latch (RSL) model is in one sense maximally efficient, and therefore optimal from the perspective of scalability.
Defense Advanced Research Projects Agency and the Office of Naval Research (N00014-92-J-1309, N00014-93-1-1364, N00014-95-1-0409)
Epidemiological Prediction using Deep Learning
Department of Mathematical Sciences
Accurate and real-time epidemic disease prediction plays a significant role in the health system and is of great importance for policy making, vaccine distribution and disease control. Since the SIR model of McKendrick and Kermack in the early 1900s, researchers have developed various mathematical models to forecast the spread of disease. Despite these attempts, however, epidemic prediction remains an open scientific problem because current models lack flexibility or perform poorly. Owing to the temporal and spatial aspects of epidemiological data, the problem fits into the category of time-series forecasting. To capture both aspects of the data, this paper proposes a combination of recent deep learning
models and applies it to ILI (influenza-like illness) data in the United States. Specifically, a graph convolutional network (GCN) is used to capture the geographical features of the U.S. regions and a gated recurrent unit (GRU) is used to capture the temporal dynamics of ILI. The results were compared with deep learning models proposed by other researchers, demonstrating that the proposed model outperforms the previous methods.
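The combination of a spatial step and a temporal step described in this abstract might be sketched as follows. This is an illustrative toy under strong assumptions, not the paper's architecture: each region carries a single scalar feature, the weights in `conv_weight` and `gru_params` are hypothetical hand-picked numbers rather than trained parameters, and the adjacency matrix is invented.

```python
import math


def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 of a square adjacency matrix."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def graph_conv(norm_adj, x, weight):
    """One graph-convolution step on scalar node features, followed by ReLU."""
    return [max(0.0, weight * sum(a * v for a, v in zip(row, x))) for row in norm_adj]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gru_step(h, x, p):
    """Scalar GRU update for one region; p holds hypothetical weights."""
    z = sigmoid(p["wz"] * x + p["uz"] * h)                # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h)                # reset gate
    h_tilde = math.tanh(p["wh"] * x + p["uh"] * (r * h))  # candidate state
    return (1.0 - z) * h + z * h_tilde


def encode_series(adj, series, conv_weight, gru_params):
    """Mix neighboring regions at each time step, then update each region's GRU state."""
    norm_adj = normalize_adjacency(adj)
    h = [0.0] * len(adj)
    for x_t in series:
        mixed = graph_conv(norm_adj, x_t, conv_weight)
        h = [gru_step(h_i, m_i, gru_params) for h_i, m_i in zip(h, mixed)]
    return h


# Three hypothetical regions in a chain (0-1-2) and two weeks of made-up ILI rates.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
series = [[0.10, 0.30, 0.20], [0.15, 0.35, 0.25]]
params = {"wz": 0.5, "uz": 0.1, "wr": 0.4, "ur": 0.2, "wh": 0.9, "uh": 0.3}
states = encode_series(adj, series, conv_weight=1.0, gru_params=params)
```

A real model would use vector-valued features, learned weight matrices, and an output layer that maps the final states to forecasts; the sketch only shows how the spatial mixing feeds the recurrent update.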