8 research outputs found

    Improving Deep Reinforcement Learning Using Graph Convolution and Visual Domain Transfer

    Recent developments in Deep Reinforcement Learning (DRL) have shown tremendous progress in robotics control, Atari games, board games such as Go, etc. However, model-free DRL still has limited use cases due to its poor sampling efficiency and generalization on a variety of tasks. In this thesis, two particular drawbacks of DRL are investigated: 1) the poor generalization abilities of model-free DRL, more specifically, how to generalize an agent's policy to unseen environments and generalize to task performance on different data representations (e.g. image based or graph based); and 2) the reality gap issue in DRL, that is, how to effectively transfer a policy learned in a simulator to the real world. This thesis makes several novel contributions to the field of DRL, which are outlined sequentially in the following. Among these contributions is the generalized value iteration network (GVIN) algorithm, which is an end-to-end neural network planning module extending the work of Value Iteration Networks (VIN). GVIN emulates the value iteration algorithm by using a novel graph convolution operator, which enables GVIN to learn and plan on irregular spatial graphs. Additionally, this thesis proposes three novel, differentiable kernels as graph convolution operators and shows that the embedding-based kernel achieves the best performance. Furthermore, an improvement upon traditional n-step Q-learning that stabilizes training for VIN and GVIN is demonstrated. Additionally, the equivalence between GVIN and graph neural networks is outlined, and it is shown that GVIN can be further extended to address both control and inference problems. The final subject which falls under the graph domain that is studied in this thesis is graph embeddings. Specifically, this work studies a general graph embedding framework, GEM-F, that unifies most of the previous graph embedding algorithms.
Based on the contributions made during the analysis of GEM-F, a novel algorithm called WarpMap is proposed, which outperforms DeepWalk and node2vec in unsupervised learning settings. The aforementioned reality gap in DRL prohibits a significant portion of research from reaching the real-world setting. The latter part of this work studies and analyzes domain transfer techniques in an effort to bridge this gap. Typically, domain transfer in RL consists of representation transfer and policy transfer. In this work, the focus is on representation transfer for vision-based applications, more specifically, on aligning the feature representation from the source domain to the target domain in an unsupervised fashion. In this approach, a linear mapping function is considered to fuse modules that are trained in different domains. Two improved adversarial learning methods are proposed to enhance the training quality of the mapping function. Finally, the thesis demonstrates the effectiveness of domain alignment among different weather conditions in the CARLA autonomous driving simulator.
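
The abstract above says GVIN emulates the value iteration algorithm on irregular graphs. As a rough illustration of that underlying algorithm only, here is a minimal sketch of classical tabular value iteration on a small directed graph. It is not the thesis's network: GVIN learns its graph convolution kernels, whereas this sketch uses fixed, hand-written transitions. The graph, the rewards, and the names `value_iteration` and `greedy_policy` are hypothetical.

```python
def value_iteration(neighbors, rewards, gamma=0.9, tol=1e-8, max_iters=1000):
    """Classical value iteration on a graph with deterministic moves.

    neighbors: dict mapping node -> list of nodes reachable in one step
    rewards:   dict mapping node -> reward received on entering that node
    """
    values = {node: 0.0 for node in neighbors}
    for _ in range(max_iters):
        new_values = {}
        for node, nbrs in neighbors.items():
            if not nbrs:
                # Terminal node: nothing more can be collected from here.
                new_values[node] = 0.0
            else:
                # Bellman backup: best one-step reward plus discounted value.
                new_values[node] = max(rewards[n] + gamma * values[n] for n in nbrs)
        delta = max(abs(new_values[n] - values[n]) for n in neighbors)
        values = new_values
        if delta < tol:
            break
    return values


def greedy_policy(neighbors, rewards, values, gamma=0.9):
    """Pick, for every non-terminal node, the neighbor with the best backup."""
    return {
        node: max(nbrs, key=lambda n: rewards[n] + gamma * values[n])
        for node, nbrs in neighbors.items()
        if nbrs
    }


# Hypothetical irregular graph: two routes from A to the goal G.
neighbors = {"A": ["B", "C"], "B": ["D"], "C": ["G"], "D": ["G"], "G": []}
rewards = {"A": 0.0, "B": 0.0, "C": 0.0, "D": 0.0, "G": 1.0}

values = value_iteration(neighbors, rewards)
policy = greedy_policy(neighbors, rewards, values)
```

In this toy graph the shorter route through C is preferred because the goal reward is discounted once less than on the route through B and D.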

    SYSTEM-ON-CHIP ARCHITECTURES FOR SIGNAL PROCESSING AND COMMUNICATIONS

    System-On-Chip (SOC) is one of the most popular Computer Aided Design (CAD) methodologies in electronic system design. In this research, SOC design is investigated in two different types of applications: i) low cost and power efficient applications; and ii) high performance computing applications. To explore low cost and power efficient design, a microcontroller based wireless medical system is investigated. Two wireless communication protocols for medical applications and patient monitoring are analyzed. In addition, the ZigBee stack developed by TI and a medical amplifier are discussed. For high performance SOC applications, implementations of several matrix operations are examined. An improved fixed-point hardware design of QR decomposition is introduced and optimized for Xilinx FPGAs. A Givens Rotation algorithm is implemented using a folded systolic array and the CORDIC algorithm. This approach is highly suitable for high-speed FPGAs or ASIC designs. It is found that the Xilinx XC5VLX110T FPGA is capable of running the QR decomposition at 246 MHz. M.S. in Electrical Engineering, May 201
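
To make the Givens rotation approach to QR decomposition mentioned above more concrete, the following is a minimal software sketch. It is an assumption-laden illustration, not the thesis's design: it uses floating-point arithmetic and `math.hypot` where the hardware uses fixed-point CORDIC iterations, and it runs sequentially rather than on a folded systolic array. The function name `givens_qr` and the test matrix are hypothetical.

```python
import math


def givens_qr(a):
    """QR decomposition of an m x n matrix (list of lists) via Givens rotations.

    Returns (q, r) where q is m x m orthogonal, r is m x n upper triangular,
    and q times r reproduces a up to rounding error.
    """
    m, n = len(a), len(a[0])
    r = [row[:] for row in a]
    q = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    for col in range(n):
        # Zero the entries below the diagonal, working from the bottom up.
        for row in range(m - 1, col, -1):
            x, y = r[row - 1][col], r[row][col]
            if y == 0.0:
                continue  # already zero, no rotation needed
            h = math.hypot(x, y)
            c, s = x / h, y / h
            # Rotate the two adjacent rows of r so that r[row][col] becomes 0.
            for k in range(n):
                top, bot = r[row - 1][k], r[row][k]
                r[row - 1][k] = c * top + s * bot
                r[row][k] = -s * top + c * bot
            # Accumulate the transposed rotation into the columns of q.
            for k in range(m):
                left, right = q[k][row - 1], q[k][row]
                q[k][row - 1] = c * left + s * right
                q[k][row] = -s * left + c * right
    return q, r


# Hypothetical 3 x 3 example matrix.
a = [[12.0, -51.0, 4.0], [6.0, 167.0, -68.0], [-4.0, 24.0, -41.0]]
q, r = givens_qr(a)
```

Each rotation touches only two rows, which is the property that makes the method attractive for array-style hardware; in a CORDIC-based design the cosine and sine would not be computed explicitly as they are here.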

    Efficient Iodine Removal by Porous Biochar-Confined Nano-Cu2O/Cu0: Rapid and Selective Adsorption of Iodide and Iodate Ions

    Iodine is a nuclide of crucial concern in radioactive waste management. Nanomaterials selectively adsorb iodine from water; however, the efficient application of nanomaterials in engineering still needs to be developed for radioactive wastewater deiodination. Artemia egg shells possess large surface groups and connecting pores, providing a new biomaterial to remove contaminants. Based on the Artemia egg shell-derived biochar (AES biochar) and in situ precipitation and reduction of cuprous, we synthesized a novel nanocomposite, namely porous biochar-confined nano-Cu2O/Cu0 (C-Cu). The characterization of C-Cu confirmed that the nano-Cu2O/Cu0 was dispersed in the pores of AES biochar, serving in the efficient and selective adsorption of iodide and iodate ions from water. Iodide ion removal by C-Cu, equilibrated for 40 min, showed high efficiency over the wide pH range of 4 to 10. C-Cu showed remarkable selectivity towards both iodide and iodate ions against competing anions (Cl⁻, NO₃⁻, SO₄²⁻) at high concentrations. The applicability of C-Cu was demonstrated by a packed column test that treated 1279 bed volumes (BV) of effluent. The rapid and selective removal of iodide and iodate ions from water is attributed to nanoparticles confined on the AES biochar and pore-facilitated mass transfer. Combining the advantages of the porous biochar and nano-Cu2O/Cu0, the use of C-Cu offers a promising method of iodine removal from water in engineering applications.

    Learning To Navigate The Synthetically Accessible Chemical Space Using Reinforcement Learning.

    Over the last decade, there has been significant progress in the field of machine learning for de novo drug design, particularly in deep generative models. However, current generative approaches exhibit a significant challenge as they do not ensure that the proposed molecular structures can be feasibly synthesized nor do they provide the synthesis routes of the proposed small molecules, thereby seriously limiting their practical applicability. In this work, we propose a novel forward synthesis framework powered by reinforcement learning (RL) for de novo drug design, Policy Gradient for Forward Synthesis (PGFS), that addresses this challenge by embedding the concept of synthetic accessibility directly into the de novo drug design system. In this setup, the agent learns to navigate through the immense synthetically accessible chemical space by subjecting commercially available small molecule building blocks to valid chemical reactions at every time step of the iterative virtual multi-step synthesis process. The proposed environment for drug discovery provides a highly challenging test-bed for RL algorithms owing to the large state space and high-dimensional continuous action space with hierarchical actions. PGFS achieves state-of-the-art performance in generating structures with high QED and penalized clogP. Moreover, we validate PGFS in an in-silico proof-of-concept associated with three HIV targets. Finally, we describe how the end-to-end training conceptualized in this study represents an important paradigm in radically expanding the synthesizable chemical space and automating the drug discovery process.
    Comment: added the statistics of top-100 compounds; used logP metric with scaled components; added values of the initial reactants to the box plots; some values in tables are recalculated due to the inconsistent environments on different machines; corresponding benchmarks were rerun with the requirements on GitHub; no significant changes in the results; corrected figures in the Appendix.

    Complete and Resilient Documentation for Operational Medical Environments Leveraging Mobile Hands-free Technology in a Systems Approach: Experimental Study

    BACKGROUND: Prehospitalization documentation is a challenging task and prone to loss of information, as paramedics operate under disruptive environments requiring their constant attention to the patients. OBJECTIVE: The aim of this study is to develop a mobile platform for hands-free prehospitalization documentation to assist first responders in operational medical environments by aggregating all existing solutions for noise resiliency and domain adaptation. METHODS: The platform was built to extract meaningful medical information from real-time audio streaming at the point of injury and transmit complete documentation to a field hospital prior to patient arrival. To this end, state-of-the-art automatic speech recognition (ASR) solutions with the following modular improvements were thoroughly explored: noise-resilient ASR, multi-style training, customized lexicon, and speech enhancement. The development of the platform was strictly guided by qualitative research and simulation-based evaluation to address the relevant challenges through progressive improvements at every process step of the end-to-end solution. The primary performance metrics included medical word error rate (WER) in machine-transcribed text output and an F1 score calculated by comparing the autogenerated documentation to manual documentation by physicians. RESULTS: A total of 15,139 individual words necessary for completing the documentation were identified from all conversations that occurred during the physician-supervised simulation drills. The baseline model presented a suboptimal performance with a WER of 69.85% and an F1 score of 0.611. The noise-resilient ASR, multi-style training, and customized lexicon improved the overall performance; the finalized platform achieved a medical WER of 33.3% and an F1 score of 0.81 when compared to manual documentation.
Speech enhancement degraded performance: the medical WER increased from 33.3% to 46.33% and the corresponding F1 score decreased from 0.81 to 0.78. All changes in performance were statistically significant (P<.001). CONCLUSIONS: This study presented a fully functional mobile platform for hands-free prehospitalization documentation in operational medical environments and lessons learned from its implementation.
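
For readers unfamiliar with the two metrics reported above, here is a minimal sketch of how a word error rate and an F1 score are commonly computed. It assumes the standard definitions (word-level edit distance divided by reference length, and the harmonic mean of precision and recall). The abstract does not state how the study restricted WER to medical words or how it matched documentation items, so those details are not modeled. The function names and sample sentences are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance (substitutions, insertions, deletions)
    divided by the number of words in the reference transcript."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # prev[j] holds the edit distance between the processed part of ref
    # and the first j words of hyp.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def f1_score(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall, written in count form."""
    denominator = 2 * true_positives + false_positives + false_negatives
    if denominator == 0:
        return 0.0
    return 2 * true_positives / denominator


# Hypothetical transcripts: one substituted word out of four.
wer = word_error_rate("patient has chest pain", "patient has chess pain")
# Hypothetical counts of correctly and incorrectly extracted items.
f1 = f1_score(true_positives=8, false_positives=2, false_negatives=2)
```

Note that lower is better for WER while higher is better for F1, which is why the two figures move in opposite directions in the results above.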