
    Performance Dynamics and Termination Errors in Reinforcement Learning: A Unifying Perspective

    In reinforcement learning, a decision must be made at some point as to whether it is worthwhile to carry on with the learning process or to terminate it. In many such situations, stochastic elements govern the occurrence of rewards, with positive rewards randomly interleaved with negative ones. For most practical learners, the learning is considered useful if the number of positive rewards always exceeds the number of negative ones. A situation that often calls for terminating learning is when the number of negative rewards exceeds the number of positive rewards. While this seems reasonable, the error of premature termination, whereby learning is terminated and declared a failure even though the positive rewards would eventually far outnumber the negative ones, can be significant. In this paper, we use combinatorial analysis to study the probability of wrongly terminating a reinforcement learning activity, an error that undermines the effectiveness of an optimal policy, and we show that this probability can be quite high. Whilst we demonstrate mathematically that such errors can never be eliminated, we propose some practical mechanisms that can effectively reduce them. Simulation experiments have been carried out, and their results are in close agreement with our theoretical findings.
    Comment: Short Paper in AIKE 201
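
    The premature-termination error described in this abstract can be illustrated with a small simulation. The sketch below is not the paper's combinatorial analysis; it assumes independent rewards that are positive with a fixed probability p > 0.5, and the function name and parameters are hypothetical. It stops a run as soon as negative rewards outnumber positive ones and compares the observed stopping rate with the classical biased-random-walk result (1 - p) / p.

```python
import random

def premature_termination_rate(p, horizon=500, trials=10000, seed=0):
    """Estimate how often a favourable learner (p > 0.5) is stopped by the naive rule.

    Assumption: rewards are i.i.d., +1 with probability p and -1 otherwise.
    """
    rng = random.Random(seed)
    stopped = 0
    for _ in range(trials):
        balance = 0  # positive rewards minus negative rewards so far
        for _ in range(horizon):
            balance += 1 if rng.random() < p else -1
            if balance < 0:  # negatives now exceed positives: naive rule terminates
                stopped += 1
                break
    return stopped / trials

p = 0.6
estimate = premature_termination_rate(p)
closed_form = (1 - p) / p  # probability that a biased walk ever reaches -1
print(f"simulated: {estimate:.3f}, closed form: {closed_form:.3f}")
```

    Under these assumptions, a learner whose rewards are positive 60% of the time is still wrongly stopped in roughly two out of three runs, which is consistent with the abstract's claim that the error can be quite high.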

    Comparing two financial crises: the case of Hong Kong real estate markets

    Hong Kong is no stranger to bubbles or crises. During the Asian Financial Crisis (AFC), the Hong Kong housing price index dropped more than 50% in less than a year. The same market then experienced the Internet Bubble, the SARS attack, and recently the Global Financial Crisis (GFC). This paper attempts to provide some “stylized facts” of the real estate markets and the macroeconomy, follows the event-study methodology to examine whether the markets behaved differently in the AFC and the GFC, and discusses the possible linkage to the change in government policies (“learning effect”) and the flow of Chinese consumers and investors to Hong Kong (“China factor”).
    Keywords: regime switching, structural change, small open economy, bounded rationality, banking policy

    Energy Efficient User Association and Power Allocation in Millimeter Wave Based Ultra Dense Networks with Energy Harvesting Base Stations

    Millimeter wave (mmWave) communication technologies have recently emerged as an attractive solution to meet the exponentially increasing demand on mobile data traffic. Moreover, ultra dense networks (UDNs) combined with mmWave technology are expected to increase both energy efficiency and spectral efficiency. In this paper, user association and power allocation in mmWave-based UDNs are considered with attention to load balance constraints, energy harvesting by base stations, user quality of service requirements, energy efficiency, and cross-tier interference limits. The joint user association and power optimization problem is modeled as a mixed-integer programming problem, which is then transformed into a convex optimization problem by relaxing the user association indicator and solved by Lagrangian dual decomposition. An iterative gradient user association and power allocation algorithm is proposed and shown to converge rapidly to an optimal point. The complexity of the proposed algorithm is analyzed, and the effectiveness of the proposed scheme compared with existing methods is verified by simulations.
    Comment: to appear, IEEE Journal on Selected Areas in Communications, 201

    Stochastic Reinforcement Learning

    In reinforcement learning episodes, the rewards and punishments are often non-deterministic, and there are invariably stochastic elements governing the underlying situation. Such stochastic elements are often numerous and cannot be known in advance, and they tend to obscure the underlying reward and punishment patterns. Indeed, if stochastic elements were absent, the same outcome would occur every time and the learning problems involved could be greatly simplified. In addition, in most practical situations, the cost of an observation to receive either a reward or a punishment can be significant, and one would wish to arrive at the correct learning conclusion at minimum cost. In this paper, we present a stochastic approach to reinforcement learning which explicitly models the variability present in the learning environment and the cost of observation. Criteria and rules for learning success are quantitatively analyzed, and probabilities of exceeding the observation cost bounds are also obtained.
    Comment: AIKE 201

    Microcrystalline silicon growth for heterojunction solar cells

    Microcrystalline Si (m-Si) films with a 1.7 eV energy bandgap and a crystal size of several hundred Å were e-beam evaporated on single crystalline Si (c-Si) to form either a heterojunction with the substrate or a window layer on a single crystalline p-n junction (heteroface structure). The goal was to enhance Voc by such uses of the larger-bandgap m-Si, with the intriguing prospect of forming heterostructures with exact lattice match on each layer. The heterojunction structure was affected by interface and shunting problems, and the best Voc achieved was only 482 mV, well below that of single crystal Si homojunctions. The heteroface structure showed promise for some of the samples with the p m-Si/p-n structure (the complementary structure did not show any improvement). Although several runs with different deposition conditions were performed, the results were inconsistent. Any Voc enhancement obtained was too small to compensate for the current loss due to the extra absorption and poor carrier transport properties of the m-Si film.

    Silicon solar cell process development, fabrication, and analysis

    Two large cast ingots were evaluated. Solar cell performance versus substrate position within the ingots was measured, and the results are presented. Dendritic web samples were analyzed in terms of structural defects, and efforts were made to correlate the data with the performance of solar cells made from the webs.

    Silicon solar cell process development, fabrication and analysis

    Solar cells were fabricated from EFG ribbons, dendritic webs, ingots cast by the heat exchanger method (HEM), and ingots cast by the ubiquitous crystallization process (UCP). Baseline and other process variations were applied to fabricate solar cells. EFG ribbons grown in a carbon-containing gas atmosphere showed significant improvement in silicon quality. Baseline solar cells from dendritic webs of various runs indicated that the quality of the webs under investigation was not as good as that of conventional CZ silicon, showing an average minority carrier diffusion length of about 60 µm versus 120 µm for CZ wafers. Detailed evaluation of large ingots cast by HEM showed ingot reproducibility problems from run to run and uniformity problems of sheet quality within an ingot. Initial evaluation of the wafers prepared from the polycrystalline ingots cast by UCP suggested that the quality of the wafers from this process is considerably lower than that of conventional CZ wafers. Overall performance was relatively uniform, except for a few cells which showed shunting problems caused by inclusions.

    On the capacities of bipartite Hamiltonians and unitary gates

    We consider interactions as bidirectional channels. We investigate the capacities for interaction Hamiltonians and nonlocal unitary gates to generate entanglement and transmit classical information. We give analytic expressions for the entanglement generating capacity and entanglement-assisted one-way classical communication capacity of interactions, and show that these quantities are additive, so that the asymptotic capacities equal the corresponding 1-shot capacities. We give general bounds on other capacities, discuss some examples, and conclude with some open questions.
    Comment: V3: extensively rewritten. V4: a mistaken reference to a conjecture by Kraus and Cirac [quant-ph/0011050] removed and a mistake in the order of authors in Ref. [53] correcte