1,505 research outputs found
Hardware-Efficient Scalable Reinforcement Learning Systems
Reinforcement Learning (RL) is a machine learning discipline in which an agent learns by interacting with its environment. In this paradigm, the agent is required to perceive its state and take actions accordingly. Upon taking each action, a numerical reward is provided by the environment. The goal of the agent is thus to maximize the aggregate rewards it receives over time. Over the past two decades, a large variety of algorithms have been proposed to select actions in order to explore the environment and gradually construct an effective strategy that maximizes the rewards. These RL techniques have been successfully applied to numerous real-world, complex applications including board games and motor control tasks.
Almost all RL algorithms involve the estimation of a value function, which indicates how good it is for the agent to be in a given state, in terms of the total expected reward in the long run. Alternatively, the value function may reflect the impact of taking a particular action at a given state. The most fundamental approach for constructing such a value function consists of updating a table that contains a value for each state (or each state-action pair). However, this approach is impractical for large-scale problems, in which the state and/or action spaces are large. In order to deal with such problems, it is necessary to exploit the generalization capabilities of non-linear function approximators, such as artificial neural networks.
This dissertation focuses on practical methodologies for solving reinforcement learning problems with large state and/or action spaces. In particular, the work addresses scenarios in which an agent does not have full knowledge of its state, but rather receives partial information about its environment via sensory-based observations. In order to address such intricate problems, novel solutions for both tabular and function-approximation based RL frameworks are proposed. A resource-efficient recurrent neural network algorithm is presented, which exploits adaptive step-size techniques to improve learning characteristics. Moreover, a consolidated actor-critic network is introduced, which omits the modeling redundancy found in typical actor-critic systems. Pivotal concerns are the scalability and speed of the learning algorithms, for which we devise architectures that map efficiently to hardware. As a result, a high degree of parallelism can be achieved. Simulation results on relevant testbench problems demonstrate the solid performance of the proposed solutions.
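The tabular approach mentioned in this abstract (one stored value per state-action pair, updated after each step) can be illustrated with a minimal sketch. It uses standard Q-learning on a hypothetical five-state chain; the environment, the function name `q_learning_chain`, and all parameter values are assumptions made for illustration and are not taken from the dissertation.

```python
import random

def q_learning_chain(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
                     epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D chain.

    The agent starts at state 0 and receives reward 1 on reaching the
    right-most state, which ends the episode. Actions: 0 = left, 1 = right.
    """
    rng = random.Random(seed)
    # The table holds one value per (state, action) pair.
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        while s != goal:
            # Epsilon-greedy selection; exact ties are broken at random.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s_next = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s_next == goal else 0.0
            # The terminal state contributes no future value.
            target = r if s_next == goal else r + gamma * max(q[s_next])
            # Move the table entry a fraction alpha toward the target.
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q

if __name__ == "__main__":
    for state, (left, right) in enumerate(q_learning_chain()):
        print(state, round(left, 3), round(right, 3))
```

The table grows with the number of states times the number of actions, which is why the abstract turns to function approximators for large problems.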
Effects of different overwintering conditions on spring phenology of the seasonal leaves of different woodland strawberry (Fragaria vesca) genotypes
Woodland strawberry (Fragaria vesca) is a perennial herb in the Rosaceae family with dimorphic leaves, summer and winter leaves, adapted to a seasonal climate. Woodland strawberry produces a new set of leaves in spring that are photosynthetically active throughout the summer season (summer leaves); these leaves senesce in autumn, when they are replaced by a new set of leaves (winter leaves). The winter leaves retain photosynthetic capacity under the snow cover throughout the winter season, which prolongs the photosynthetic period of the species. As the global climate warms, the thickness of winter snow is decreasing, which can affect the overwintering and spring phenology of plants. This thesis focuses on the springtime ecophysiology and phenology of the senescing winter leaves and the formation of new summer leaves in woodland strawberry genotypes of different European origin. The 15 genotypes of woodland strawberry are from Iceland, Italy and Norway, and they originate from environments that are geographically separated from each other, so the populations are genetically distinct. In this study, these genotypes were kept at two different overwintering sites: a coastal site in the Åland Islands with mild temperatures, and a continental site in Lammi with a persistent snow cover. According to the results, all 15 genotypes showed earlier development of the summer leaves and earlier senescence of the winter leaves when overwintered at Åland than when overwintered at Lammi. Specifically, the dates of summer leaf formation, flowering and stolon production were advanced, and the dates of winter leaf senescence were also advanced. Another important finding is that the first summer leaves produced in spring began to senesce shortly after they were fully developed and were replaced by later-formed summer leaves. The chlorophyll fluorescence values of the different leaf types were also lower at the Åland site.
Therefore, it can be concluded that overwintering conditions have an effect on the subsequent phenological development in spring. In the context of global climate change, the spring development of woodland strawberry will be earlier, and the senescence of winter leaves will also be earlier.
Stress-strain analysis of Aikou rockfill dam with asphalt-concrete core
The Aikou rockfill dam with an asphalt-concrete core is situated in a karst area in Chongqing City, China. In order to study the operating conditions of the rockfill dam, especially those of the asphalt-concrete core, the Duncan model is adopted to compute the stress and strain of both the rockfill dam and the asphalt-concrete core after karst grouting and other treatments. The results indicate that the complicated stress and deformation of both the dam body and the core are within reasonable ranges. It is shown that the structural design and foundation treatment of the dam are feasible and can be used as a reference for other similar projects.
Strong Optical and UV Intermediate-Width Emission Lines in the Quasar SDSS J232444.80-094600.3: Dust-Free and Intermediate-Density Gas at the Skin of Dusty Torus?
Emission lines from the broad emission line region (BELR) and the narrow emission line region (NELR) of active galactic nuclei (AGNs) are extensively studied. However, emission lines are rarely detected between these two regions. We present a detailed analysis of the quasar SDSS J232444.80-094600.3 (SDSS J2324-0946), which is remarkable for its strong intermediate-width emission lines (IELs) with FWHM 1800 km/s. The IEL component is present in different emission lines, including the permitted lines Lyα λ1216 and C IV λ1549, the semiforbidden line C III] λ1909, and the forbidden lines [O III] λλ4959, 5007. With the aid of photo-ionization models, we found that the IELs are produced by gas with a hydrogen density of […], a distance to the central ionizing source of […] pc, a covering factor (CF) of 6%, and a dust-to-gas ratio of […] times that of the SMC. We suggest that the strong IELs of this quasar are produced by nearly dust-free, intermediate-density gas located at the skin of the dusty torus. Such strong IELs, serving as a useful diagnostic, can provide an avenue to study the properties of gas between the BELR and the NELR.
Chinese real estate market: the biggest bubble in history?
Master's in Development and International Cooperation. In this context, the present study analyzes the actual state of the Chinese real estate market, using what happened in the USA in 2008 as a point of comparison, in order to identify the most likely future developments. To this end, a theoretical framework for the topic was established, and data such as house sale prices and the interest rates on bank loans for home purchases were collected in order to determine whether there is a bubble in the Chinese real estate market. In addition, we examine the measures taken by the Chinese government to avoid a real estate crisis that could turn into a financial crisis spreading to the whole economy and jeopardizing the pace of economic development that China has recorded in recent years.
An Optimized Method for Terrain Reconstruction Based on Descent Images
An optimization method is proposed to perform high-accuracy terrain reconstruction of the landing area of Chang'e III. First, feature matching is conducted using geometric model constraints. Then, the initial terrain is obtained and the initial normal vector of each point is solved on the basis of the initial terrain. By varying the vector around the initial normal vector in small steps, a set of new vectors is obtained. By combining these vectors with the directions of the light and the camera, equations are set up on the basis of a surface reflection model. Then, a series of gray values is derived by solving the equations. The new optimized vector is recorded when the obtained gray value is closest to that of the corresponding pixel. Finally, the optimized terrain is obtained after iteration of the vector field. Experiments were conducted using laboratory images and descent images of Chang'e III. The results showed that the proposed method performed better than the classical feature matching method. It can provide a reference for terrain reconstruction of the landing area in subsequent lunar exploration missions.
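The per-point normal refinement described in this abstract (perturb the initial normal in small steps, predict a gray value for each candidate, keep the candidate closest to the observed pixel) might look roughly like the sketch below. The abstract does not name its surface reflection model, so a Lambertian model is assumed here purely for illustration; under that assumption the camera direction drops out. The function names, step size, and search radius are likewise hypothetical.

```python
import math

def normalize(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def lambert(normal, light, albedo=1.0):
    """Predicted gray value under an assumed Lambertian reflection model."""
    return albedo * max(0.0, sum(a * b for a, b in zip(normal, light)))

def refine_normal(n0, light, observed, step=0.01, radius=5, albedo=1.0):
    """Search small perturbations of n0 and keep the best-matching normal.

    n0       initial normal from the coarse terrain
    light    direction toward the light source
    observed gray value of the corresponding image pixel
    """
    n0, light = normalize(n0), normalize(light)
    best, best_err = n0, abs(lambert(n0, light, albedo) - observed)
    for i in range(-radius, radius + 1):
        for j in range(-radius, radius + 1):
            # Perturb the x and y components in small steps, then renormalize.
            cand = normalize((n0[0] + i * step, n0[1] + j * step, n0[2]))
            err = abs(lambert(cand, light, albedo) - observed)
            if err < best_err:
                best, best_err = cand, err
    return best

if __name__ == "__main__":
    light = normalize((0.3, 0.2, 1.0))
    observed = lambert(normalize((0.03, -0.02, 1.0)), light)
    print(refine_normal((0.0, 0.0, 1.0), light, observed))
```

A single gray value does not determine a normal uniquely, so several candidates can match equally well; the iteration over the whole vector field mentioned in the abstract is not reproduced in this sketch.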