Deep Reinforcement Learning for Supply Chain Synchronization
Supply chain synchronization can prevent the "bullwhip effect" and significantly mitigate ripple effects caused by operational failures. This paper demonstrates how deep reinforcement learning agents based on the proximal policy optimization algorithm can synchronize inbound and outbound flows if end-to-end visibility is provided. The paper concludes that the proposed solution has the potential to perform adaptive control in complex supply chains. Furthermore, the proposed approach is general, task-unspecific, and adaptive in the sense that prior knowledge about the system is not required.
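The proximal policy optimization algorithm the abstract refers to is built around a clipped surrogate objective. A minimal numpy sketch of that objective is shown below; the function name and the sample ratios/advantages are illustrative, not taken from the paper:

```python
import numpy as np

def ppo_clip_loss(ratio, advantage, eps=0.2):
    """PPO clipped surrogate loss for a batch of transitions.

    ratio     : pi_new(a|s) / pi_old(a|s) for each sampled action
    advantage : estimated advantage of each sampled action
    eps       : clipping range (0.2 is the value used in the PPO paper)
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1 - eps, 1 + eps) * advantage
    # PPO maximizes the minimum of the two terms; the loss is its negative.
    return -np.mean(np.minimum(unclipped, clipped))

# Toy batch: the first ratio (1.5) exceeds 1 + eps and is clipped to 1.2.
loss = ppo_clip_loss(np.array([1.5, 0.9]), np.array([1.0, -1.0]))
```

The clipping keeps each policy update close to the behavior policy, which is what makes PPO stable enough for the kind of adaptive flow control the paper describes.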
An agent-based dynamic information network for supply chain management
One of the main research issues in supply chain management is improving the global efficiency of supply chains. However, improvement efforts often fail because supply chains are complex, are subject to frequent change, and collaboration and information sharing among their members are often infeasible. This paper presents a practical collaboration framework for supply chain management wherein multi-agent systems form dynamic information networks and coordinate their production and order planning according to synchronized estimates of market demand. In the framework, agents employ an iterative relaxation contract net protocol to find the most desirable suppliers by using data envelopment analysis. Furthermore, the chain of buyers and suppliers, from the end markets to raw material suppliers, forms dynamic information networks for synchronized planning. This paper presents an agent-based dynamic information network for supply chain management and discusses the associated pros and cons.
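Data envelopment analysis, which the agents above use to rank suppliers, generally requires solving a linear program per supplier over multiple inputs and outputs. The degenerate single-input, single-output case reduces to comparing output-per-input ratios, which is enough to illustrate the idea; the supplier names and figures below are hypothetical:

```python
def dea_efficiency(suppliers):
    """Single-input/single-output efficiency scores, normalized so that
    the best supplier scores 1.0.  This is only the one-dimensional
    special case of DEA; the general method solves an LP per unit.
    """
    ratios = {name: out / inp for name, (inp, out) in suppliers.items()}
    best = max(ratios.values())
    return {name: r / best for name, r in ratios.items()}

# Hypothetical suppliers: (procurement cost as input, delivered units as output)
scores = dea_efficiency({"A": (10, 50), "B": (8, 48), "C": (12, 48)})
```

A buyer agent in the contract net protocol could then award the contract to the supplier whose normalized score is 1.0, here supplier B.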
From supply chains to demand networks. Agents in retailing: the electrical bazaar
A paradigm shift is taking place in logistics. The focus is changing from operational effectiveness to adaptation. Supply chains will develop into networks that adapt to consumer demand in almost real time. Time to market, capacity for adaptation, and enrichment of the customer experience seem to be the key elements of this new paradigm. In this environment, emerging technologies such as RFID (Radio Frequency ID), intelligent products, and the Internet are triggering a reconsideration of methods, procedures, and goals. We present a multiagent system framework specialized in retail that addresses these changes with the use of rational agents and takes advantage of the new market opportunities. As in an old bazaar, agents that can learn, cooperate, exploit gossip, and distinguish collaborators from competitors adapt and react to a changing environment better than any other structure. Keywords: Supply Chains, Distributed Artificial Intelligence, Multiagent System.
Lifting the Veil: Unlocking the Power of Depth in Q-learning
With the help of massive data and rich computational resources, deep Q-learning has been widely used in operations research and management science and has contributed to great success in numerous applications, including recommender systems, supply chains, games, and robotic manipulation. However, the success of deep Q-learning lacks solid theoretical verification and interpretability. The aim of this paper is to theoretically verify the power of depth in deep Q-learning. Within the framework of statistical learning theory, we rigorously prove that deep Q-learning outperforms its traditional version by demonstrating its good generalization error bound. Our results reveal that the main reason for the success of deep Q-learning is the excellent performance of deep neural networks (deep nets) in capturing the special properties of rewards, namely spatial sparseness and piecewise constancy, rather than their large capacities. In this paper, we make fundamental contributions to the field of reinforcement learning by answering the following three questions: Why does deep Q-learning perform so well? When does deep Q-learning perform better than traditional Q-learning? How many samples are required to achieve a specific prediction accuracy for deep Q-learning? Our theoretical assertions are verified by applying deep Q-learning in the well-known beer game in supply chain management and in a simulated recommender system.
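The "traditional version" the paper compares against is tabular Q-learning, whose per-transition update is shown below; deep Q-learning replaces the table with a neural network fit by regression on the same bootstrapped target. The state/action sizes and hyperparameters here are toy values for illustration:

```python
import numpy as np

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.95):
    """One tabular Q-learning update:
        Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    """
    target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (target - Q[s, a])
    return Q

Q = np.zeros((3, 2))                      # 3 states, 2 actions (toy sizes)
Q = q_update(Q, s=0, a=1, r=1.0, s_next=1)
# target = 1.0 + 0.95 * 0 = 1.0, so Q[0, 1] moves from 0 toward 1 by alpha
```

The paper's point is that when the reward landscape is spatially sparse and piecewise constant, a deep net generalizes across states far better than this table can.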
Biological learning and artificial intelligence
It was once taken for granted that learning in animals and man could be explained with a simple set of general learning rules, but over the last hundred years a substantial amount of evidence has accumulated that points in a quite different direction. In animal learning theory, the laws of learning are no longer considered general. Instead, it has been necessary to explain behaviour in terms of a large set of interacting learning mechanisms and innate behaviours. Artificial intelligence is now on the edge of making the transition from general theories to a view of intelligence based on an amalgam of interacting systems. In the light of the evidence from animal learning theory, such a transition is highly desirable.
Online Learning for Offloading and Autoscaling in Energy Harvesting Mobile Edge Computing
Mobile edge computing (a.k.a. fog computing) has recently emerged to enable in-situ processing of delay-sensitive applications at the edge of mobile networks. Providing grid power in support of mobile edge computing, however, is costly and even infeasible (in certain rugged or under-developed areas), thus mandating on-site renewable energy as a major or even sole power supply in increasingly many scenarios. Nonetheless, the high intermittency and unpredictability of renewable energy make it very challenging to deliver a high quality of service to users in energy harvesting mobile edge computing systems. In this paper, we address the challenge of incorporating renewables into mobile edge computing and propose an efficient reinforcement learning-based resource management algorithm, which learns on the fly the optimal policy of dynamic workload offloading (to the centralized cloud) and edge server provisioning to minimize the long-term system cost (including both service delay and operational cost). Our online learning algorithm uses a decomposition of (offline) value iteration and (online) reinforcement learning, thus achieving a significant improvement in learning rate and run-time performance compared to standard reinforcement learning algorithms such as Q-learning. We prove the convergence of the proposed algorithm and analytically show that the learned policy has a simple monotone structure amenable to practical implementation. Our simulation results validate the efficacy of our algorithm, which significantly improves edge computing performance compared to fixed or myopic optimization schemes and conventional reinforcement learning algorithms.
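A monotone policy of the kind the abstract proves optimal can be implemented as a simple threshold rule: offload once the local workload exceeds a threshold that depends on the harvested energy available. The sketch below is an illustration of that structure only; the threshold values and discretization are hypothetical, not taken from the paper:

```python
def offload_decision(workload, battery, thresholds):
    """Monotone threshold policy: offload to the cloud once the local
    workload exceeds a battery-dependent threshold.  A scarcer battery
    (lower index) gets a lower threshold, i.e. the system offloads sooner.
    """
    return workload > thresholds[battery]

# Hypothetical thresholds indexed by discretized battery level 0..2
thresholds = [2, 4, 6]        # lower battery level -> offload sooner
busy = offload_decision(5, battery=1, thresholds=thresholds)   # 5 > 4
idle = offload_decision(3, battery=2, thresholds=thresholds)   # 3 > 6 is false
```

Such a policy is trivial to store and evaluate on an edge server, which is what makes the proven monotone structure "amenable to practical implementation."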