    Perseus: Randomized Point-based Value Iteration for POMDPs

    Partially observable Markov decision processes (POMDPs) form an attractive and principled framework for agent planning under uncertainty. Point-based approximate techniques for POMDPs compute a policy based on a finite set of points collected in advance from the agent's belief space. We present a randomized point-based value iteration algorithm called Perseus. The algorithm performs approximate value backup stages, ensuring that in each backup stage the value of each point in the belief set is improved; the key observation is that a single backup may improve the value of many belief points. Contrary to other point-based methods, Perseus backs up only a (randomly selected) subset of points in the belief set, sufficient for improving the value of each belief point in the set. We show how the same idea can be extended to dealing with continuous action spaces. Experimental results show the potential of Perseus in large-scale POMDP problems.
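
The backup stage described in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the two-state model (`T`, `O`, `R`), the belief grid, the discount `gamma`, and all function names are hypothetical, and the value function is represented as a list of alpha vectors. Each stage samples a belief at random from the points not yet improved, backs it up, and then drops every point whose value the new vector set already matches or exceeds.

```python
import random

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def value(b, alphas):
    # Value of belief b under a value function given as a set of alpha vectors
    return max(dot(b, a) for a in alphas)

def backup(b, alphas, T, O, R, gamma):
    # Point-based Bellman backup at belief b; returns a single alpha vector
    n = len(b)
    best = None
    for a in range(len(T)):
        g_a = list(R[a])
        for o in range(len(O[a][0])):
            # Project each alpha vector back through (action a, observation o)
            projected = [[sum(T[a][s][s2] * O[a][s2][o] * alpha[s2] for s2 in range(n))
                          for s in range(n)] for alpha in alphas]
            g_ao = max(projected, key=lambda g: dot(b, g))
            g_a = [x + gamma * y for x, y in zip(g_a, g_ao)]
        if best is None or dot(b, g_a) > dot(b, best):
            best = g_a
    return best

def perseus_stage(beliefs, alphas, T, O, R, gamma, rng):
    new_alphas, todo = [], list(beliefs)
    while todo:
        b = rng.choice(todo)  # back up a randomly selected, not-yet-improved point
        alpha = backup(b, alphas, T, O, R, gamma)
        if dot(b, alpha) < value(b, alphas):
            # The backup did not help at b: keep the best old vector for b instead
            alpha = max(alphas, key=lambda a: dot(b, a))
        new_alphas.append(alpha)
        # Keep only the points whose value has not yet reached its old level
        todo = [p for p in todo if value(p, new_alphas) < value(p, alphas)]
    return new_alphas

# Hypothetical toy model: T[a][s][s2], O[a][s2][o], R[a][s]
T = [[[0.9, 0.1], [0.1, 0.9]], [[0.5, 0.5], [0.5, 0.5]]]
O = [[[0.8, 0.2], [0.2, 0.8]], [[0.5, 0.5], [0.5, 0.5]]]
R = [[0.0, 0.0], [1.0, -1.0]]
gamma = 0.9
beliefs = [[i / 10, 1 - i / 10] for i in range(11)]
# Initial value function: one vector at the worst reward repeated forever
initial = [[min(min(r) for r in R) / (1 - gamma)] * 2]

rng = random.Random(0)
history = [initial]
for _ in range(5):
    history.append(perseus_stage(beliefs, history[-1], T, O, R, gamma, rng))
```

Because one alpha vector can raise the value of many belief points at once, a stage typically ends with fewer vectors than there are points in the belief set, which is the source of the savings the abstract points to.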

    Pricing and Resource Allocation via Game Theory for a Small-Cell Video Caching System

    Evidence indicates that downloading on-demand videos accounts for a dramatic increase in data traffic over cellular networks. Caching popular videos in the storage of small-cell base stations (SBS), namely, small-cell caching, is an efficient technology for reducing the transmission latency whilst mitigating the redundant transmissions of popular videos over back-haul channels. In this paper, we consider a commercialized small-cell caching system consisting of a network service provider (NSP), several video retailers (VR), and mobile users (MU). The NSP leases its SBSs to the VRs for the purpose of making profits, and the VRs, after storing popular videos in the rented SBSs, can provide faster local video transmissions to the MUs, thereby gaining more profits. We conceive this system within the framework of a Stackelberg game by treating the SBSs as a specific type of resource. We first model the MUs and SBSs as two independent Poisson point processes, and derive, via stochastic geometry theory, the probability of the specific event that an MU obtains the video of its choice directly from the memory of an SBS. Then, based on the probability derived, we formulate a Stackelberg game to jointly maximize the average profit of both the NSP and the VRs. Also, we investigate the Stackelberg equilibrium by solving a non-convex optimization problem. With the aid of this game-theoretic framework, we shed light on the relationship between four important factors: the optimal pricing of leasing an SBS, the allocation of SBSs among the VRs, the storage size of the SBSs, and the popularity distribution of the VRs. Monte-Carlo simulations show that our stochastic geometry-based analytical results closely match the empirical ones. Numerical results are also provided for quantifying the proposed game-theoretic framework by showing its efficiency on pricing and resource allocation.
    Comment: Accepted to appear in IEEE Journal on Selected Areas in Communications, special issue on Video Distribution over Future Interne
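
To illustrate the kind of check this abstract describes (a stochastic-geometry formula compared against Monte-Carlo simulation), here is a small Python sketch. It deliberately uses a much simpler event than the paper's: the probability that a user at the origin has at least one SBS within a given radius, when SBSs form a homogeneous Poisson point process. For that event the closed form is the standard void-probability result, 1 - exp(-λπr²). The density, radius, window size, and all function names are assumptions chosen for illustration; the paper's own event also involves video popularity and cache contents, which are not modelled here.

```python
import math
import random

def sample_poisson(mean, rng):
    # Knuth's multiplication method; adequate for the moderate means used here
    limit, k, p = math.exp(-mean), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def coverage_probability(density, radius, half_width, trials, rng):
    # Fraction of trials in which some SBS lies within `radius` of the origin.
    # The square window [-half_width, half_width]^2 must contain the disc.
    area = (2 * half_width) ** 2
    hits = 0
    for _ in range(trials):
        n = sample_poisson(density * area, rng)
        # Given the count, the points of a homogeneous PPP are i.i.d. uniform
        if any(math.hypot(rng.uniform(-half_width, half_width),
                          rng.uniform(-half_width, half_width)) <= radius
               for _ in range(n)):
            hits += 1
    return hits / trials

rng = random.Random(1)
density, radius = 1.0, 0.5  # hypothetical SBS density and service radius
estimate = coverage_probability(density, radius, half_width=3.0, trials=4000, rng=rng)
analytic = 1.0 - math.exp(-density * math.pi * radius ** 2)
print(f"simulated {estimate:.3f} vs analytic {analytic:.3f}")
```

Placing the user at the origin loses no generality in this simplified setting, since the SBS process is homogeneous and independent of the user locations.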

    Challenges Using the Linux Network Stack for Real-Time Communication

    Starting in the early 2000s, human-in-the-loop (HITL) simulation groups at NASA and the Air Force Research Lab began using the Linux network stack for some real-time communication. More recently, SpaceX has adopted Ethernet as the primary bus technology for its Falcon launch vehicles and Dragon capsules. As the Linux network stack makes its way from ground facilities to flight-critical systems, it is necessary to recognize that the network stack is optimized for communication over the open Internet, which cannot provide latency guarantees. The Internet protocols and their implementation in the Linux network stack contain numerous design decisions that favor throughput over determinism and latency. These decisions often require workarounds in the application or customization of the stack to maintain a high probability of low latency on closed networks, especially if the network must be fault tolerant to single-event upsets.
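
As one concrete illustration of an application-level workaround of the kind this abstract mentions, the Python sketch below disables Nagle's algorithm on a TCP socket. Nagle's algorithm is a well-known example of a default that trades latency for throughput by coalescing small writes, but the abstract does not name it, so its choice here is an assumption, as is the helper name `make_low_latency_tcp_socket`. Setting this option does not by itself provide any latency guarantee.

```python
import socket

def make_low_latency_tcp_socket():
    # Hypothetical helper: a TCP socket with small-write coalescing turned off
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY disables Nagle's algorithm, so small messages are sent
    # immediately instead of being held back to fill larger segments
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock

sock = make_low_latency_tcp_socket()
# Read the option back to confirm the kernel accepted it
nodelay_enabled = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
sock.close()
```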