Performance Dynamics and Termination Errors in Reinforcement Learning: A Unifying Perspective
In reinforcement learning, a decision needs to be made at some point as to
whether it is worthwhile to carry on with the learning process or to terminate
it. In many such situations, stochastic elements are often present which govern
the occurrence of rewards, with the sequential occurrences of positive rewards
randomly interleaved with negative rewards. For most practical learners, the
learning is considered useful if the number of positive rewards always exceeds
the negative ones. A situation that often calls for learning termination is
when the number of negative rewards exceeds the number of positive rewards.
However, while this seems reasonable, the error of premature termination,
whereby learning is terminated and declared a failure even though the
positive rewards would eventually far outnumber the negative ones, can be
significant. In this paper, using combinatorial analysis we study the error
probability in wrongly terminating a reinforcement learning activity which
undermines the effectiveness of an optimal policy, and we show that the
resultant error can be quite high. Whilst we demonstrate mathematically that
such errors can never be eliminated, we propose some practical mechanisms that
can effectively reduce such errors. Simulation experiments have been carried
out, the results of which are in close agreement with our theoretical findings.
Comment: Short Paper in AIKE 201
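As a rough illustration of the premature-termination error described in this abstract, the following minimal sketch simulates the rule "stop as soon as negative rewards outnumber positive ones". The reward model is an assumption, not taken from the paper: each reward is independently positive with probability `p` and negative otherwise. The function name `premature_termination_rate` and all parameter values are hypothetical. Under that assumed model, the standard gambler's-ruin result gives the probability of ever triggering the rule as (1 - p) / p when p > 1/2, which the simulation is compared against.

```python
import random

def premature_termination_rate(p, horizon, trials, seed=0):
    """Estimate how often the rule 'stop once negatives exceed positives' fires.

    Assumed model (not from the paper): i.i.d. rewards, positive with
    probability p, negative with probability 1 - p.
    """
    rng = random.Random(seed)
    stopped = 0
    for _ in range(trials):
        balance = 0  # positive rewards minus negative rewards so far
        for _ in range(horizon):
            balance += 1 if rng.random() < p else -1
            if balance < 0:  # negatives now outnumber positives: terminate
                stopped += 1
                break
    return stopped / trials

p = 0.6  # positives are more likely, so learning would succeed in the long run
estimate = premature_termination_rate(p, horizon=200, trials=20000)
closed_form = (1 - p) / p  # gambler's-ruin probability of ever falling below zero
print(estimate, closed_form)
```

Even with positive rewards favoured 60/40, this assumed model terminates roughly two runs in three, which is consistent with the abstract's claim that the error can be quite high.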
Comparing two financial crises: the case of Hong Kong real estate markets
Hong Kong is no stranger to bubbles or crises. During the Asian Financial Crisis (AFC), the Hong Kong housing price index dropped more than 50% in less than a year. The same market then experienced the Internet Bubble, the SARS attack, and recently the Global Financial Crisis (GFC). This paper attempts to provide some “stylized facts” of the real estate markets and the macroeconomy, follows the event-study methodology to examine whether the markets behaved differently in the AFC and GFC, and discusses the possible linkage to the change in government policies (“learning effect”) and the flow of Chinese consumers and investors to Hong Kong (“China factor”).
Keywords: regime switching, structural change, small open economy, bounded rationality, banking policy
Energy Efficient User Association and Power Allocation in Millimeter Wave Based Ultra Dense Networks with Energy Harvesting Base Stations
Millimeter wave (mmWave) communication technologies have recently emerged as
an attractive solution to meet the exponentially increasing demand on mobile
data traffic. Moreover, ultra dense networks (UDNs) combined with mmWave
technology are expected to increase both energy efficiency and spectral
efficiency. In this paper, user association and power allocation in mmWave
based UDNs are considered with attention to load balance constraints, energy
harvesting by base stations, user quality of service requirements, energy
efficiency, and cross-tier interference limits. The joint user association and
power optimization problem is modeled as a mixed-integer programming problem,
which is then transformed into a convex optimization problem by relaxing the
user association indicator and solved by Lagrangian dual decomposition. An
iterative gradient user association and power allocation algorithm is proposed
and shown to converge rapidly to an optimal point. The complexity of the
proposed algorithm is analyzed and the effectiveness of the proposed scheme
compared with existing methods is verified by simulations.
Comment: to appear, IEEE Journal on Selected Areas in Communications, 201
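The abstract names Lagrangian dual decomposition with an iterative gradient update but gives no formulas. The sketch below is therefore not the paper's algorithm; it is a toy stand-in that shows the general pattern on a much simpler problem: maximize the sum of log(1 + g_i * p_i) subject to a single total-power budget. The objective, the function name `dual_ascent_power_allocation`, the channel gains, the step size, and the iteration count are all assumptions chosen for illustration.

```python
def dual_ascent_power_allocation(gains, total_power, step=0.05, iters=500):
    """Toy dual (sub)gradient ascent for a single power-budget constraint.

    Assumed problem (not from the paper):
        maximize  sum_i log(1 + g_i * p_i)
        subject to sum_i p_i <= total_power,  p_i >= 0
    """
    lam = 1.0  # Lagrange multiplier, i.e. the "price" of power
    powers = [0.0] * len(gains)
    for _ in range(iters):
        # Decomposed per-link subproblem: maximize log(1 + g*p) - lam*p over p >= 0,
        # whose closed-form solution is the water-filling expression below.
        powers = [max(0.0, 1.0 / lam - 1.0 / g) for g in gains]
        # Dual update: raise the price if the budget is exceeded, lower it otherwise.
        lam = max(1e-9, lam + step * (sum(powers) - total_power))
    return powers, lam

powers, lam = dual_ascent_power_allocation([1.0, 0.5, 2.0], total_power=3.0)
print(powers, lam)
```

For a fixed multiplier each link is solved independently, which is what makes the decomposition attractive; only the scalar price is coordinated across links. In this toy case the result can be checked against the water-filling solution, where every active link satisfies p_i = 1/lam - 1/g_i and the powers sum to the budget.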
Stochastic Reinforcement Learning
In reinforcement learning episodes, the rewards and punishments are often
non-deterministic, and there are invariably stochastic elements governing the
underlying situation. Such stochastic elements are often numerous and cannot be
known in advance, and they have a tendency to obscure the underlying reward
and punishment patterns. Indeed, if stochastic elements were absent, the same
outcome would occur every time and the learning problems involved could be
greatly simplified. In addition, in most practical situations, the cost of an
observation to receive either a reward or punishment can be significant, and
one would wish to arrive at the correct learning conclusion by incurring
minimum cost. In this paper, we present a stochastic approach to reinforcement
learning which explicitly models the variability present in the learning
environment and the cost of observation. Criteria and rules for learning
success are quantitatively analyzed, and probabilities of exceeding the
observation cost bounds are also obtained.
Comment: AIKE 201
Microcrystalline silicon growth for heterojunction solar cells
Microcrystalline Si (m-Si) films with a 1.7 eV energy bandgap and a crystal size of several hundred Å were e-beam evaporated on single crystalline Si (c-Si) to form either a heterojunction with the substrate or a window layer on a single crystalline p-n junction (heteroface structure). The goal was to enhance Voc by such uses of the larger-bandgap m-Si, with the intriguing prospect of forming heterostructures with exact lattice match on each layer. The heterojunction structure was affected by interface and shunting problems, and the best Voc achieved was only 482 mV, well below that of single crystal Si homojunctions. The heteroface structure showed promise for some of the samples with the p m-Si/p-n structure (the complementary structure did not show any improvement). Although several runs were performed with different deposition conditions, the results were inconsistent. Any Voc enhancement obtained was too small to compensate for the current loss due to the extra absorption and poor carrier transport properties of the m-Si film.
Silicon solar cell process development, fabrication, and analysis
Two large cast ingots were evaluated. Solar cell performance versus substrate position within the ingots was obtained and the results are presented. Dendritic web samples were analyzed in terms of structural defects, and efforts were made to correlate the data with the performance of solar cells made from the webs.
Silicon solar cell process development, fabrication and analysis
Solar cells were fabricated from EFG ribbons, dendritic webs, cast ingots made by the heat exchanger method (HEM), and cast ingots made by the ubiquitous crystallization process (UCP). Baseline and other process variations were applied to fabricate solar cells. EFG ribbons grown in a carbon-containing gas atmosphere showed significant improvement in silicon quality. Baseline solar cells from dendritic webs of various runs indicated that the quality of the webs under investigation was not as good as that of conventional CZ silicon, showing an average minority carrier diffusion length of about 60 µm versus 120 µm for CZ wafers. Detailed evaluation of large cast ingots made by HEM showed ingot reproducibility problems from run to run and uniformity problems of sheet quality within an ingot. Initial evaluation of the wafers prepared from the cast polycrystalline ingots made by UCP suggested that the quality of the wafers from this process is considerably lower than that of conventional CZ wafers. Overall performance was relatively uniform, except for a few cells which showed shunting problems caused by inclusions.
On the capacities of bipartite Hamiltonians and unitary gates
We consider interactions as bidirectional channels. We investigate the
capacities for interaction Hamiltonians and nonlocal unitary gates to generate
entanglement and transmit classical information. We give analytic expressions
for the entanglement generating capacity and entanglement-assisted one-way
classical communication capacity of interactions, and show that these
quantities are additive, so that the asymptotic capacities equal the
corresponding 1-shot capacities. We give general bounds on other capacities,
discuss some examples, and conclude with some open questions.
Comment: V3: extensively rewritten. V4: a mistaken reference to a conjecture
by Kraus and Cirac [quant-ph/0011050] removed and a mistake in the order of
authors in Ref. [53] corrected