7 research outputs found

    Learning Complex Policy Distribution with CEM Guided Adversarial Hypernetwork

    No full text
    Cross-Entropy Method (CEM) is a gradient-free direct policy search method that offers greater stability and is insensitive to hyperparameter tuning. CEM is similar to population-based evolutionary methods but, rather than using a population, it uses a distribution over candidate solutions (policies in our case). Usually, a natural exponential family distribution such as a multivariate Gaussian is used to parameterize the policy distribution. Using a multivariate Gaussian limits the quality of CEM policies because the search becomes confined to a less representative subspace. We address this drawback by using an adversarially trained hypernetwork, enabling a richer and more complex representation of the policy distribution. To achieve better training stability and faster convergence, we use a multivariate Gaussian CEM policy to guide our adversarial training process. Experiments demonstrate that our approach outperforms state-of-the-art CEM-based methods by 15.8% in terms of rewards while achieving faster convergence. Results also show that our approach is less sensitive to hyperparameters than other deep-RL methods such as REINFORCE, DDPG and DQN. Interactive Intelligence
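
    The Gaussian CEM baseline described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes a diagonal Gaussian, a toy reward in place of an RL environment, and hypothetical names and settings (`cem_maximize`, `toy_reward`, population size, elite count).

```python
import random
import statistics


def toy_reward(params):
    """Hypothetical stand-in for an episode return; peaks at (1.0, -0.5)."""
    target = (1.0, -0.5)
    return -sum((p - t) ** 2 for p, t in zip(params, target))


def cem_maximize(reward, dim, iterations=60, population=64, n_elite=12,
                 init_std=2.0, seed=0):
    """Diagonal-Gaussian CEM: sample candidates, keep elites, refit the Gaussian."""
    rng = random.Random(seed)
    mean = [0.0] * dim
    std = [init_std] * dim
    for _ in range(iterations):
        # Draw candidate parameter vectors from the current search distribution.
        samples = [[rng.gauss(m, s) for m, s in zip(mean, std)]
                   for _ in range(population)]
        # Rank candidates by reward, best first, and keep the elite set.
        samples.sort(key=reward, reverse=True)
        elites = samples[:n_elite]
        # Refit mean and standard deviation to the elites, per dimension.
        for d in range(dim):
            column = [e[d] for e in elites]
            mean[d] = statistics.mean(column)
            std[d] = statistics.pstdev(column) + 1e-3  # small floor keeps sampling valid
    return mean


if __name__ == "__main__":
    print(cem_maximize(toy_reward, dim=2))
```

    Because the search distribution is a single Gaussian, every candidate is drawn around one mean, which is the restriction the abstract proposes to relax with a hypernetwork.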

    Teacher-apprentices RL (TARL): leveraging complex policy distribution through generative adversarial hypernetwork in reinforcement learning

    No full text
    Typically, a Reinforcement Learning (RL) algorithm focuses on learning a single deployable policy as the end product. Depending on the initialization method and seed randomization, learning a single policy can lead to convergence to different local optima across different runs, especially when the algorithm is sensitive to hyperparameter tuning. Motivated by the capability of Generative Adversarial Networks (GANs) to learn complex data manifolds, the adversarial training procedure can instead be used to learn a population of good-performing policies. We extend the teacher-student methodology from Knowledge Distillation, as applied to typical deep neural network prediction tasks, to the RL paradigm. Instead of learning a single compressed student network, an adversarially trained generative model (hypernetwork) is learned to output the network weights of a population of good-performing policy networks, representing a school of apprentices. Our proposed framework, named Teacher-Apprentices RL (TARL), is modular and can be used in conjunction with many existing RL algorithms. We illustrate the performance gain and improved robustness by combining TARL with various types of RL algorithms, including direct policy search Cross-Entropy Method, Q-learning, Actor-Critic, and policy gradient-based methods. Green Open Access added to TU Delft Institutional Repository ‘You share, we take care!’ – Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public. Interactive Intelligence

    Identification and degradation of structural extracellular polymeric substances in waste activated sludge via a polygalacturonate-degrading consortium

    No full text
    By maintaining the cell integrity of waste activated sludge (WAS), structural extracellular polymeric substances (St-EPS) resist WAS anaerobic fermentation. This study investigates the occurrence of polygalacturonate in WAS St-EPS by combining chemical and metagenomic analyses that identify ∼22% of the bacteria, including Ferruginibacter and Zoogloea, as associated with polygalacturonate production using the key enzyme EC 5.1.3.6. A highly active polygalacturonate-degrading consortium (GDC) was enriched, and the potential of this GDC for degrading St-EPS and promoting methane production from WAS was investigated. The percentage of St-EPS degradation increased from 47.6% to 85.2% after inoculation with the GDC. Methane production was also increased by up to 2.3 times over a control group, with WAS destruction increasing from 11.5% to 28.4%. Zeta potential and rheological behavior confirmed the positive effect that the GDC has on WAS fermentation. The major genus in the GDC was identified as Clostridium (17.1%). Extracellular pectate lyases (EC 4.2.2.2 and 4.2.2.9), excluding polygalacturonase (EC 3.2.1.15), were observed in the metagenome of the GDC and most likely play a core role in St-EPS hydrolysis. Dosing with the GDC provides a good biological method for St-EPS degradation and thereby enhances the conversion of WAS to methane. Green Open Access added to TU Delft Institutional Repository ‘You share, we take care!’ – Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public. BT/Environmental Biotechnology

    Modelling of oil spill trajectory for 2011 Penglai 19-3 coastal drilling field, China

    No full text
    An oil particle trajectory model was developed and applied to the 2011 Penglai 19-3 subsurface oil spill in the Chinese Bohai Sea. The three-dimensional model simulated ocean current fields and used meteorological data from the local measurement station to drift the spilled oil. In this model, the movement of the particles, taken as the sum of deterministic advection and random diffusion, was determined using a Lagrangian algorithm. The simulation fitted well with observations of actual oil sightings, which showed that oil particles spread southeast/eastward to the Bohai Strait, China. This estimation agreed with actual official combat activities near the spill site during June-July 2011. (C) 2015 Published by Elsevier Inc.
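
    The particle update described in this abstract, deterministic advection plus random diffusion, is commonly written as a random-walk step. The sketch below is only an illustration of that idea in two dimensions: the uniform current, the diffusivity, the time step, and all function names are assumptions, not values or code from the paper.

```python
import math
import random


def uniform_current(x, y):
    """Hypothetical velocity field in m/s (a steady southeastward drift)."""
    return 0.1, -0.05


def step_particles(particles, velocity, diffusivity, dt, rng):
    """Advance (x, y) particles by one time step of length dt seconds."""
    # Standard random-walk scaling for a constant diffusivity K: sqrt(2 * K * dt).
    sigma = math.sqrt(2.0 * diffusivity * dt)
    moved = []
    for x, y in particles:
        u, v = velocity(x, y)
        # Deterministic advection plus a Gaussian random displacement.
        new_x = x + u * dt + sigma * rng.gauss(0.0, 1.0)
        new_y = y + v * dt + sigma * rng.gauss(0.0, 1.0)
        moved.append((new_x, new_y))
    return moved


def simulate(n_particles, n_steps, dt, diffusivity, velocity=uniform_current, seed=0):
    """Release all particles at the origin and track them for n_steps."""
    rng = random.Random(seed)
    particles = [(0.0, 0.0)] * n_particles
    for _ in range(n_steps):
        particles = step_particles(particles, velocity, diffusivity, dt, rng)
    return particles


if __name__ == "__main__":
    cloud = simulate(n_particles=1000, n_steps=60, dt=60.0, diffusivity=1.0)
    mean_x = sum(p[0] for p in cloud) / len(cloud)
    mean_y = sum(p[1] for p in cloud) / len(cloud)
    print(mean_x, mean_y)
```

    With the diffusivity set to zero the particles follow the current exactly, which is a convenient check before adding the random term.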

    Water use strategies for two dominant tree species in pure and mixed plantations of the semiarid Chinese Loess Plateau

    No full text
    Understanding the water sources and physiological responses to soil moisture pulses for plantation species, particularly in mixed plantations, is essential to assess the water use strategy and vegetation restoration in semiarid regions. We used hydrogen stable isotopes in plant and soil water to determine the potential water sources for Pinus tabuliformis and Hippophae rhamnoides in both pure and mixed plantations in the semiarid Chinese Loess Plateau during the vigorous growing season (June-August) in 2016. Stomatal conductance (g_c), midday leaf water potential (Ψ_m) and photosynthetic rate (P_r) were measured concurrently to analyse the physiological response. P. tabuliformis in the pure plantation depended largely on shallow and middle soil layers regardless of precipitation amount, permitting this species to maintain stable Ψ_m at the expense of P_r through stomatal control. In contrast, H. rhamnoides in the pure plantation shifted its water source from shallow to deep soil layers following decreases in precipitation, allowing this species to maintain stable g_c and P_r at the expense of Ψ_m. Thus, P. tabuliformis and H. rhamnoides displayed isohydric and anisohydric behaviour, respectively. Additionally, both species in mixed plantations largely absorbed water from shallow soil layers and shifted to deep soil layers when precipitation decreased. Mixed afforestation significantly reduced (p < .05) P_r for P. tabuliformis and Ψ_m for H. rhamnoides. Although these species adopted contrasting physiological responses, the major proportion of water resources was competitively obtained from similar soil depths, indicating that their mixed afforestation requires further investigation.

    Sap flow characteristics and responses to summer rainfall for Pinus tabulaeformis and Hippophae rhamnoides in the Loess hilly region of China

    No full text
    As a major driving element of the structure and function of arid and semiarid ecosystems, rainfall is the essential factor limiting plant biological processes. To clarify the characteristics of transpiration and responses to summer rainfall, sap flow density (F_d) of Pinus tabulaeformis and Hippophae rhamnoides was monitored using thermal dissipation probes. In addition, midday leaf water potential (Ψ_m) and leaf stomatal conductance (G_s) were analyzed to determine water use strategies. The results indicated that the diurnal variation in the normalized F_d values exhibited a single-peak curve for P. tabulaeformis, while H. rhamnoides showed multiple peaks. The normalized F_d for P. tabulaeformis remained relatively stable regardless of rainfall events. However, there was a significant increase in the normalized F_d for H. rhamnoides in response to rainfall in June and August (p < .05), although no significant differences were observed in July. The normalized F_d values for P. tabulaeformis and H. rhamnoides fitted well with the derived variable of transpiration, an integrated index calculated from the vapor pressure deficit and solar radiation (R_s), using an exponential saturation function. The differences in fitting coefficients suggested that H. rhamnoides showed more sensitivity to summer rainfall (p < .01) than P. tabulaeformis. Furthermore, during the study period, P. tabulaeformis reduced G_s as soil water decreased, maintaining a relatively constant Ψ_m, while H. rhamnoides allowed large fluctuations in Ψ_m to maintain G_s. Therefore, P. tabulaeformis and H. rhamnoides should be considered isohydric and anisohydric species, respectively. More consideration should be given to H. rhamnoides in afforestation activities and local plantation management in the context of frequent seasonal drought in the loess hilly region.
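
    Fitting an exponential saturation function of the kind mentioned in this abstract can be sketched as below. The functional form a * (1 - exp(-b * x)) is an assumed, commonly used version; the abstract does not give the exact equation or how the transpiration variable x is computed, so x is taken as given here. The function names, the grid over b, and the data are all hypothetical.

```python
import math


def saturation(x, a, b):
    """Assumed exponential saturation form: rises from 0 toward the plateau a."""
    return a * (1.0 - math.exp(-b * x))


def fit_saturation(xs, ys, b_grid=None):
    """Least-squares fit of (a, b) using a grid over b and a closed-form a."""
    if b_grid is None:
        b_grid = [0.01 * k for k in range(1, 1001)]  # candidate b values, 0.01 to 10
    best = None
    for b in b_grid:
        basis = [1.0 - math.exp(-b * x) for x in xs]
        denom = sum(v * v for v in basis)
        if denom == 0.0:
            continue
        # For a fixed b the model is linear in a, so the optimal a is a ratio of sums.
        a = sum(y * v for y, v in zip(ys, basis)) / denom
        sse = sum((y - a * v) ** 2 for y, v in zip(ys, basis))
        if best is None or sse < best[0]:
            best = (sse, a, b)
    _, a, b = best
    return a, b


if __name__ == "__main__":
    # Synthetic, noise-free data generated from known coefficients.
    xs = [0.5 * k for k in range(1, 21)]
    ys = [saturation(x, 1.0, 0.5) for x in xs]
    print(fit_saturation(xs, ys))
```

    Comparing the fitted coefficients between species, or between periods before and after rainfall, is one way to express the differences in sensitivity that the abstract reports.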

    Characteristics of dissolved organic matter (DOM) and relationship with dissolved mercury in Xiaoqing River-Laizhou Bay estuary, Bohai Sea, China

    No full text
    Because of its heterogeneous properties, dissolved organic matter (DOM) is known to control the environmental fate of a variety of organic pollutants and trace metals in aquatic systems. Here we report absorptive and fluorescence properties of DOM, together with concentrations of dissolved mercury (Hg), along the Xiaoqing River-Laizhou Bay estuary system located in the Bohai Sea of China. A mixing model with two end-members, terrestrial and aquatic DOM, demonstrated that terrestrial signatures decreased significantly from the river into the estuary. Quasi-conservative mixing behavior of DOM sources suggests that the variations in the average DOM composition were governed by physical processes (e.g., dilution) rather than by new production and/or degradation processes. In contrast to some previous studies of river-estuary systems, the Xiaoqing River-Laizhou Bay estuary system displayed a non-significant correlation between DOM and Hg quantities. Based on this and the variation of Hg concentration along the salinity gradient, we concluded that Hg showed non-conservative mixing behavior between the suggested end-member sources. Thus, rather than mixing, Hg concentration variations seemed to be controlled by biogeochemical processes. (C) 2016 Elsevier Ltd. All rights reserved.
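
    The distinction between conservative and non-conservative mixing drawn in this abstract can be illustrated with a two end-member mixing line. The sketch assumes salinity as the conservative tracer and uses made-up end-member values and hypothetical function names; it is not the authors' model.

```python
def river_fraction(salinity, s_river, s_sea):
    """Fraction of river water in a sample, from salinity alone."""
    return (s_sea - salinity) / (s_sea - s_river)


def conservative_mix(salinity, c_river, c_sea, s_river, s_sea):
    """Concentration expected if the two end-members only mix (pure dilution)."""
    f = river_fraction(salinity, s_river, s_sea)
    return f * c_river + (1.0 - f) * c_sea


def mixing_anomaly(observed, salinity, c_river, c_sea, s_river, s_sea):
    """Observed minus expected; values far from zero suggest non-conservative behavior."""
    return observed - conservative_mix(salinity, c_river, c_sea, s_river, s_sea)


if __name__ == "__main__":
    # Hypothetical end-members: river (salinity 0, conc. 10) and sea (salinity 30, conc. 2).
    print(conservative_mix(15.0, c_river=10.0, c_sea=2.0, s_river=0.0, s_sea=30.0))
    print(mixing_anomaly(8.0, 15.0, c_river=10.0, c_sea=2.0, s_river=0.0, s_sea=30.0))
```

    Samples that fall near the mixing line are consistent with dilution alone, while a positive or negative anomaly points to addition or removal along the estuary.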