
    Online Reinforcement Learning for Dynamic Multimedia Systems

    In our previous work, we proposed a systematic cross-layer framework for dynamic multimedia systems, which allows each layer to make autonomous and foresighted decisions that maximize the system's long-term performance, while meeting the application's real-time delay constraints. The proposed solution solved the cross-layer optimization offline, under the assumption that the multimedia system's probabilistic dynamics were known a priori. In practice, however, these dynamics are unknown a priori and therefore must be learned online. In this paper, we address this problem by allowing the multimedia system layers to learn, through repeated interactions with each other, to autonomously optimize the system's long-term performance at run-time. We propose two reinforcement learning algorithms for optimizing the system under different design constraints: the first algorithm solves the cross-layer optimization in a centralized manner, and the second solves it in a decentralized manner. We analyze both algorithms in terms of their required computation, memory, and inter-layer communication overheads. After noting that the proposed reinforcement learning algorithms learn too slowly, we introduce a complementary accelerated learning algorithm that exploits partial knowledge about the system's dynamics in order to dramatically improve the system's performance. In our experiments, we demonstrate that decentralized learning can perform as well as centralized learning, while enabling the layers to act autonomously. Additionally, we show that existing application-independent reinforcement learning algorithms, and existing myopic learning algorithms deployed in multimedia systems, perform significantly worse than our proposed application-aware and foresighted learning methods. Comment: 35 pages, 11 figures, 10 tables
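    The abstract does not reproduce the paper's algorithms, so the following is only a minimal sketch of the kind of online, foresighted (discounted) learning it describes: a tabular Q-learning agent choosing joint cross-layer actions. All names and the state/action encoding are hypothetical placeholders, not the paper's actual formulation.

```python
import random
from collections import defaultdict

class CrossLayerQLearner:
    """Tabular Q-learning over joint cross-layer actions (illustrative only)."""

    def __init__(self, actions, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.q = defaultdict(float)  # Q(s, a) table, defaults to 0
        self.actions = actions       # hypothetical joint cross-layer action set
        self.alpha = alpha           # learning rate
        self.gamma = gamma           # discount factor: the "foresighted" part
        self.epsilon = epsilon       # exploration rate

    def act(self, state):
        # Epsilon-greedy selection over the joint action set.
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # One-step temporal-difference update toward the long-term return.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_error = reward + self.gamma * best_next - self.q[(state, action)]
        self.q[(state, action)] += self.alpha * td_error
```

    A centralized learner would hold one such table over the joint cross-layer action space, while a decentralized variant, as the abstract contrasts, would give each layer its own learner over its local actions; this sketch does not capture the paper's accelerated, partial-knowledge variant.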

    Minimizing Estimation Error Variance Using a Weighted Sum of Samples From the Soil Moisture Active Passive (SMAP) Satellite

    The National Aeronautics and Space Administration's (NASA) Soil Moisture Active Passive (SMAP) is the latest passive remote sensing satellite operating in the protected L-band spectrum from 1.400 to 1.427 GHz. SMAP provides global-scale soil moisture images through point-wise passive scanning of the Earth's thermal radiation. SMAP takes multiple samples in frequency and time from each antenna footprint to increase the likelihood of capturing samples free of radio-frequency interference (RFI). SMAP's current RFI detection and mitigation algorithm excludes samples detected to be RFI-contaminated and averages the remaining samples. This approach, however, can be less effective in harsh RFI environments, where RFI contamination is present in all or a large fraction of the samples. In this paper, we investigate a bias-free weighted-sum-of-samples estimator, whose weights can be computed from the RFI's statistical properties.
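    A bias-free weighted sum with weights chosen to minimize estimation error variance is, under standard assumptions, the classical minimum-variance unbiased linear estimator. The sketch below shows that construction, assuming each sample shares the RFI-free mean and that the per-sample noise covariance is known; the paper's actual weight design based on the RFI's statistics may differ, and all numbers are hypothetical.

```python
import numpy as np

def min_variance_weights(cov):
    """Weights w minimizing w^T cov w subject to sum(w) = 1 (unbiasedness)."""
    ones = np.ones(cov.shape[0])
    cinv_ones = np.linalg.solve(cov, ones)  # cov^{-1} @ 1
    return cinv_ones / (ones @ cinv_ones)   # normalize so the weights sum to 1

def weighted_estimate(samples, cov):
    return min_variance_weights(cov) @ samples

# Hypothetical example: four footprint samples, one heavily RFI-contaminated.
cov = np.diag([1.0, 1.0, 1.0, 25.0])              # per-sample noise variances
samples = np.array([210.3, 209.8, 210.5, 260.0])  # brightness temperatures (K)
print(weighted_estimate(samples, cov))  # contaminated sample is down-weighted,
                                        # not simply discarded as in the
                                        # current detect-and-exclude approach
```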

    Fast Reinforcement Learning for Energy-Efficient Wireless Communications

    We consider the problem of energy-efficient point-to-point transmission of delay-sensitive data (e.g., multimedia data) over a fading channel. Existing research on this topic utilizes either physical-layer centric solutions, namely power control and adaptive modulation and coding (AMC), or system-level solutions based on dynamic power management (DPM); however, there is currently no rigorous and unified framework for simultaneously utilizing both physical-layer centric and system-level techniques to achieve the minimum possible energy consumption, under delay constraints, in the presence of stochastic and a priori unknown traffic and channel conditions. In this report, we propose such a framework. We formulate the stochastic optimization problem as a Markov decision process (MDP) and solve it online using reinforcement learning. The advantages of the proposed online method are that (i) it does not require a priori knowledge of the traffic arrival and channel statistics to determine the jointly optimal power-control, AMC, and DPM policies; (ii) it exploits partial information about the system, so that less information needs to be learned than when using conventional reinforcement learning algorithms; and (iii) it obviates the need for action exploration, which severely limits the adaptation speed and run-time performance of conventional reinforcement learning algorithms. Our results show that the proposed learning algorithms can converge up to two orders of magnitude faster than a state-of-the-art learning algorithm for physical-layer power control and up to three orders of magnitude faster than conventional reinforcement learning algorithms.
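    Advantages (ii) and (iii) are the hallmarks of post-decision-state (PDS) learning: the known part of the dynamics (e.g., the power cost of a transmission action and the resulting buffer drain) is computed exactly inside the greedy minimization, so no epsilon-greedy exploration is needed, and only the unknown part (traffic arrivals, channel transitions) is learned from observed transitions. The sketch below is one generic way to set that up, with hypothetical, heavily simplified state and cost definitions rather than the report's actual formulation.

```python
import numpy as np

GAMMA, ALPHA = 0.98, 0.05
B, H = 10, 4                # buffer capacity (packets), number of channel states
U = np.zeros((B + 1, H))    # post-decision-state (PDS) value function

def greedy_action(b, h, power_cost):
    # Known dynamics: transmitting a packets in channel state h costs
    # power_cost(a, h) and drains the buffer to the PDS b - a. Because this
    # part is known a priori, we minimize it exactly -- no exploration needed.
    return min(range(b + 1), key=lambda a: power_cost(a, h) + U[b - a, h])

def pds_update(b_pds, h, arrivals, h_next, power_cost, holding_cost):
    # Unknown dynamics: arrivals and the channel transition are only observed,
    # so their expectation is learned by stochastic approximation.
    b_next = min(b_pds + arrivals, B)
    a_next = greedy_action(b_next, h_next, power_cost)
    target = holding_cost(b_next) + GAMMA * (
        power_cost(a_next, h_next) + U[b_next - a_next, h_next])
    U[b_pds, h] += ALPHA * (target - U[b_pds, h])
```

    Here the only learned object is U; the policy is always the greedy one, which is why adaptation can be far faster than with conventional Q-learning that must explore every state-action pair.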

    Formation of the postmitotic nuclear envelope from extended ER cisternae precedes nuclear pore assembly

    During mitosis, the nuclear envelope merges with the endoplasmic reticulum (ER), and nuclear pore complexes are disassembled. In a current model for reassembly after mitosis, the nuclear envelope forms by a reshaping of ER tubules. For the assembly of pores, two major models have been proposed. In the insertion model, nuclear pore complexes are embedded in the nuclear envelope after their formation. In the prepore model, nucleoporins assemble on the chromatin as an intermediate nuclear pore complex before nuclear envelope formation. Using live-cell imaging and electron microscope tomography, we find that the postmitotic assembly of the nuclear envelope primarily originates from ER cisternae. Moreover, nuclear pore complexes assemble only on the already formed nuclear envelope. Indeed, all the chromatin-associated Nup107–160 complexes are in single units instead of assembled prepores. We therefore propose that the postmitotic nuclear envelope assembles directly from ER cisternae, followed by membrane-dependent insertion of nuclear pore complexes.