144 research outputs found

    Q-Ensemble for Offline RL: Don't Scale the Ensemble, Scale the Batch Size

    Training large neural networks is known to be time-consuming, with training taking days or even weeks. To address this problem, large-batch optimization was introduced. This approach demonstrated that scaling mini-batch sizes with appropriate learning rate adjustments can speed up the training process by orders of magnitude. While long training time was not typically a major issue for model-free deep offline RL algorithms, recently introduced Q-ensemble methods that achieve state-of-the-art performance have made this issue more relevant by notably extending the training duration. In this work, we demonstrate how this class of methods can benefit from large-batch optimization, which is commonly overlooked by the deep offline RL community. We show that scaling the mini-batch size and naively adjusting the learning rate allows for (1) a reduced size of the Q-ensemble, (2) stronger penalization of out-of-distribution actions, and (3) improved convergence time, effectively shortening training duration by 3-4x on average. Comment: Accepted at 3rd Offline Reinforcement Learning Workshop at Neural Information Processing Systems, 202
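
The abstract above does not say which learning rate adjustment is used, so the following is only a minimal sketch of two common heuristics for large-batch training (linear and square-root scaling). The function name, the rule names, and the example values are assumptions for illustration, not the authors' implementation.

```python
import math


def scale_learning_rate(base_lr: float, base_batch: int, new_batch: int,
                        rule: str = "sqrt") -> float:
    """Adjust a learning rate when the mini-batch size changes.

    Hypothetical helper: "linear" multiplies the rate by the batch ratio,
    "sqrt" multiplies it by the square root of that ratio.
    """
    if base_batch <= 0 or new_batch <= 0:
        raise ValueError("batch sizes must be positive")
    ratio = new_batch / base_batch
    if rule == "linear":
        return base_lr * ratio
    if rule == "sqrt":
        return base_lr * math.sqrt(ratio)
    raise ValueError(f"unknown rule: {rule}")


# Example with assumed values: growing the batch from 256 to 1024 samples.
lr_sqrt = scale_learning_rate(3e-4, 256, 1024, rule="sqrt")
lr_linear = scale_learning_rate(3e-4, 256, 1024, rule="linear")
```

Which rule works best depends on the optimizer and the task, so either value would normally be treated as a starting point for tuning.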

    Let Offline RL Flow: Training Conservative Agents in the Latent Space of Normalizing Flows

    Offline reinforcement learning aims to train a policy on a pre-recorded and fixed dataset without any additional environment interactions. There are two major challenges in this setting: (1) extrapolation error caused by approximating the value of state-action pairs not well-covered by the training data and (2) distributional shift between behavior and inference policies. One way to tackle these problems is to induce conservatism, i.e., keeping the learned policies closer to the behavioral ones. To achieve this, we build upon recent works on learning policies in latent action spaces and use a special form of Normalizing Flows for constructing a generative model, which we use as a conservative action encoder. This Normalizing Flows action encoder is pre-trained in a supervised manner on the offline dataset, and then an additional policy model (a controller in the latent space) is trained via reinforcement learning. This approach avoids querying actions outside of the training dataset and therefore does not require additional regularization for out-of-dataset actions. We evaluate our method on various locomotion and navigation tasks, demonstrating that our approach outperforms recently proposed algorithms with generative action models on a large portion of datasets. Comment: Accepted at 3rd Offline Reinforcement Learning Workshop at Neural Information Processing Systems, 202

    CORL: Research-oriented Deep Offline Reinforcement Learning Library

    CORL is an open-source library that provides thoroughly benchmarked single-file implementations of both deep offline and offline-to-online reinforcement learning algorithms. It emphasizes a simple development experience with a straightforward codebase and a modern analysis tracking tool. In CORL, we isolate each method's implementation in a separate single file, making performance-relevant details easier to recognize. Additionally, an experiment tracking feature is available to help log metrics, hyperparameters, dependencies, and more to the cloud. Finally, we have ensured the reliability of the implementations by benchmarking on commonly employed D4RL datasets, providing a transparent source of results that can be reused for robust evaluation tools such as performance profiles, probability of improvement, or expected online performance. Comment: Conference on Neural Information Processing Systems (NeurIPS 2023) Track on Datasets and Benchmarks. Source code at https://github.com/corl-team/COR

    Seed-effect modeling improves the consistency of genome-wide loss-of-function screens and identifies synthetic lethal vulnerabilities in cancer cells

    Background: Genome-wide loss-of-function profiling is widely used for systematic identification of genetic dependencies in cancer cells; however, the poor reproducibility of RNA interference (RNAi) screens has been a major concern due to frequent off-target effects. Currently, a detailed understanding of the key factors contributing to the sub-optimal consistency is still lacking, especially on how to improve the reliability of future RNAi screens by controlling for factors that determine their off-target propensity. Methods: We performed a systematic, quantitative analysis of the consistency between two genome-wide shRNA screens conducted on a compendium of cancer cell lines, and also compared several gene summarization methods for inferring gene essentiality from shRNA-level data. We then devised novel concepts of seed essentiality and shRNA family, based on seed region sequences of shRNAs, to study in depth the contribution of seed-mediated off-target effects to the consistency of the two screens. We further investigated two seed-sequence properties, seed pairing stability and target abundance, in terms of their capability to minimize the off-target effects in post-screening data analysis. Finally, we applied this novel methodology to identify genetic interactions and synthetic lethal partners of cancer drivers, and confirmed differential essentiality phenotypes by detailed CRISPR/Cas9 experiments. Results: Using the novel concepts of seed essentiality and shRNA family, we demonstrate how genome-wide loss-of-function profiling of a common set of cancer cell lines can actually be made fairly reproducible when considering seed-mediated off-target effects. Importantly, by excluding shRNAs having a higher propensity for off-target effects, based on their seed-sequence properties, one can remove noise from the genome-wide shRNA datasets. As a translational application case, we demonstrate enhanced reproducibility of genetic interaction partners of common cancer drivers, as well as identify novel synthetic lethal partners of a major oncogenic driver, PIK3CA, supported by a complementary CRISPR/Cas9 experiment. Conclusions: We provide practical guidelines for improved design and analysis of genome-wide loss-of-function profiling and demonstrate how this novel strategy can be applied towards improved mapping of genetic dependencies of cancer cells to aid development of targeted anticancer treatments. Peer reviewe
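
To make the "shRNA family" idea above concrete, here is a rough sketch that groups shRNAs by a shared seed sequence and averages their scores per family. The abstract does not give the exact definitions, so the seed window (guide positions 2-8), the use of a plain mean as a "seed essentiality" score, and all names and sequences below are assumptions for illustration only.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict

# Assumed seed definition: nucleotides 2-8 of the guide strand (0-based slice 1:8).
SEED_START, SEED_END = 1, 8


def seed_of(guide: str) -> str:
    """Return the assumed seed region of a guide sequence."""
    return guide[SEED_START:SEED_END].upper()


def seed_essentiality(scores: Dict[str, float]) -> Dict[str, float]:
    """Average shRNA-level scores over all shRNAs that share a seed.

    `scores` maps a guide sequence to its depletion score; each group of
    guides with an identical seed is treated as one hypothetical family.
    """
    families = defaultdict(list)
    for guide, score in scores.items():
        families[seed_of(guide)].append(score)
    return {seed: mean(values) for seed, values in families.items()}


# Toy input with made-up sequences and scores.
toy_scores = {
    "AACGTACGTTTT": -1.0,
    "GACGTACGCCCC": -3.0,  # same seed as the first guide
    "ATTTTTTTGGGG": 0.5,
}
family_scores = seed_essentiality(toy_scores)
```

A strongly negative family score for guides that target different genes would hint at a seed-mediated off-target effect rather than a true gene dependency.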

    Thermocatalytic conversion of petroleum paraffin in the presence of tungsten carbide powders

    Russia ranks third in the world in reserves of heavy oil feedstock. The development of light and medium oil deposits makes it inevitable that heavy and residual petroleum feedstocks will be brought into processing to meet the growing demand for petroleum products. The depth of oil processing can be increased in various ways, one of which is the use of new, efficient catalysts that resist corrosion, poisoning and coking. Tungsten carbide meets these requirements and is a promising starting compound for the production of cracking catalysts for heavy oil feedstocks. To investigate its catalytic activity, the influence of tungsten carbide and its calcination temperature on the composition and yield of petroleum paraffin cracking products was studied, and the optimum treatment temperature of tungsten carbide was determined. A WC sample calcined at 420°C showed high catalytic activity. Physicochemical methods were used to study the properties of the tungsten carbide samples, as well as the composition and properties of the paraffin cracking products obtained in the presence of the catalysts.

    Investigation of Massive Catalyst based on Molybdenum Disulphide by Simultaneous Thermal Analysis and Mass Spectrometry Methods

    The paper presents the results of experimental studies of massive sulfide catalysts by simultaneous thermal analysis and mass spectrometry (STA/MS). The STA/MS methods are found to be quite informative for testing catalyst systems based on MoS2 and useful for identifying reference features that could be used to predict their activity. It is also shown that the defect structure of molybdenum disulfide formed during mechanical activation is reflected in the DSC curves.

    The concept of modular design of cast iron pistons for diesel internal combustion engines

    Development, manufacturing and marketing of ICE pistons require an increase in the quality of their design and technological preparation. In this paper, the concept of computer-aided modular design of ICE pistons is considered for the first time as a system that integrates all spheres of activity, from accounting for market demand to delivering the product to customers. All major work in this system is grouped into individual modules that run simultaneously along three lines: organizational, design and technological. Further improvement of the pistons is possible through the development of CAD theory and is aimed at forming the scientific basis for integrated design. The modular design system made it possible to design original monolithic cast iron and composite pistons with diameters of 120 and 88 mm. The proposed system yields universal solutions not only for ICE pistons but also for other multielement machine parts.

    Processing of heavy residual feedstock on Mo/Al[2]O[3]-catalytic systems obtained using polyoxomolybdate compounds

    The need for new, efficient catalysts for deeper oil refining is growing against the background of stricter quality requirements for motor fuels, the deteriorating quality of crude oil supplied for processing, and an increasing share of secondary-process distillates used in the production of commercial petroleum products. In this work, alumina-based catalytic systems were synthesized using polyoxomolybdate compounds. The morphology, structure and phase composition of the synthesized catalytic systems were studied by scanning electron microscopy, microelement analysis, X-ray phase analysis, X-ray diffraction and electron spectroscopy. It has been established that the Mo/Al[2]O[3] system is active in the thermal catalytic conversion of heavy residual feedstock.