1,071 research outputs found

Katakomba: Tools and Benchmarks for Data-Driven NetHack

Author: Kolesnikov Sergey
Kurenkov Vladislav
Nikulin Alexander
Tarasov Denis
Publication date: 26/10/2023

NetHack is known as the frontier of reinforcement learning research where learning-based methods still need to catch up to rule-based solutions. One of the promising directions for a breakthrough is using pre-collected datasets similar to recent developments in robotics, recommender systems, and more under the umbrella of offline reinforcement learning (ORL). Recently, a large-scale NetHack dataset was released; while it was a necessary step forward, it has yet to gain wide adoption in the ORL community. In this work, we argue that there are three major obstacles for adoption: resource-wise, implementation-wise, and benchmark-wise. To address them, we develop an open-source library that provides workflow fundamentals familiar to the ORL community: pre-defined D4RL-style tasks, uncluttered baseline implementations, and reliable evaluation tools with accompanying configs and logs synced to the cloud.

Comment: Neural Information Processing Systems (NeurIPS 2023) Track on Datasets and Benchmarks. Source code at https://github.com/corl-team/katakomb

arXiv.org e-Print Archive

CORL: Research-oriented Deep Offline Reinforcement Learning Library

Author: Akimov Dmitry
Kolesnikov Sergey
Kurenkov Vladislav
Nikulin Alexander
Tarasov Denis
Publication date: 26/10/2023

CORL is an open-source library that provides thoroughly benchmarked single-file implementations of both deep offline and offline-to-online reinforcement learning algorithms. It emphasizes a simple development experience with a straightforward codebase and a modern analysis tracking tool. In CORL, we isolate method implementations into separate single files, making performance-relevant details easier to recognize. Additionally, an experiment tracking feature is available to help log metrics, hyperparameters, dependencies, and more to the cloud. Finally, we have ensured the reliability of the implementations by benchmarking on commonly employed D4RL datasets, providing a transparent source of results that can be reused for robust evaluation tools such as performance profiles, probability of improvement, or expected online performance.

Comment: Conference on Neural Information Processing Systems (NeurIPS 2023) Track on Datasets and Benchmarks. Source code at https://github.com/corl-team/COR

arXiv.org e-Print Archive

Q-Ensemble for Offline RL: Don't Scale the Ensemble, Scale the Batch Size

Author: Akimov Dmitry
Kolesnikov Sergey
Kurenkov Vladislav
Nikulin Alexander
Tarasov Denis
Publication date: 20/11/2022

Training large neural networks is known to be time-consuming, with the learning duration taking days or even weeks. To address this problem, large-batch optimization was introduced. This approach demonstrated that scaling mini-batch sizes with appropriate learning rate adjustments can speed up the training process by orders of magnitude. While long training time was not typically a major issue for model-free deep offline RL algorithms, recently introduced Q-ensemble methods achieving state-of-the-art performance made this issue more relevant, notably extending the training duration. In this work, we demonstrate how this class of methods can benefit from large-batch optimization, which is commonly overlooked by the deep offline RL community. We show that scaling the mini-batch size and naively adjusting the learning rate allows for (1) a reduced size of the Q-ensemble, (2) stronger penalization of out-of-distribution actions, and (3) improved convergence time, effectively shortening training duration by 3-4x on average.

Comment: Accepted at 3rd Offline Reinforcement Learning Workshop at Neural Information Processing Systems, 202

arXiv.org e-Print Archive

Let Offline RL Flow: Training Conservative Agents in the Latent Space of Normalizing Flows

Author: Akimov Dmitriy
Kolesnikov Sergey
Kurenkov Vladislav
Nikulin Alexander
Tarasov Denis
Publication date: 20/11/2022

Offline reinforcement learning aims to train a policy on a pre-recorded and fixed dataset without any additional environment interactions. There are two major challenges in this setting: (1) extrapolation error caused by approximating the value of state-action pairs not well-covered by the training data and (2) distributional shift between behavior and inference policies. One way to tackle these problems is to induce conservatism, i.e., keeping the learned policies closer to the behavioral ones. To achieve this, we build upon recent works on learning policies in latent action spaces and use a special form of Normalizing Flows for constructing a generative model, which we use as a conservative action encoder. This Normalizing Flows action encoder is pre-trained in a supervised manner on the offline dataset, and then an additional policy model (a controller in the latent space) is trained via reinforcement learning. This approach avoids querying actions outside of the training dataset and therefore does not require additional regularization for out-of-dataset actions. We evaluate our method on various locomotion and navigation tasks, demonstrating that our approach outperforms recently proposed algorithms with generative action models on a large portion of datasets.

Comment: Accepted at 3rd Offline Reinforcement Learning Workshop at Neural Information Processing Systems, 202

arXiv.org e-Print Archive

Testing of the homogeneity of marginal distributions in copula models

Author: Mikhail Nikulin
Sergey Malov
Vilijandas Bagdonavičius
Publication venue: Elsevier BV
Publication date: 01/01/2008

Comptes Rendus Mathématique

Numérisation de Documents Anciens Mathématiques

In Vitro and in Silico Liver Models: Current Trends, Challenges and Opportunities

Author: Baranova Ancha
Drapkina Oxana
Gazaryan Irina
Nikulin Sergey
Poloznikov Andrey
Shkurnikov Maxim
Tonevitsky Alexander
Publication venue: Touro Scholar
Publication date: 01/07/2018

The most common drug development failures originate from either bioavailability problems or unexpected toxic effects. The culprit is often the liver, which is responsible for the biotransformation of a majority of xenobiotics. The liver may be modeled using liver-on-a-chip devices, which may include established cell lines, primary human cells, and stem cell-derived hepatocyte-like cells. The choice of biological material, along with its processing and maintenance, greatly influences both the device performance and the resultant toxicity predictions. Impediments to the development of liver-on-a-chip technology include problems with the standardization of cells, limitations imposed by culturing, and the necessity to develop more complicated fluidic contours. Fortunately, recent breakthroughs in the development of cell-based reporters, including ones with a fluorescent label, permit monitoring of the behavior of the cells embedded in liver-on-a-chip devices. Finally, a set of computational approaches has been developed to model both particular toxic responses and the homeostasis of the human liver as a whole; these approaches pave the way to enhancing the in silico stage of assessment for potential toxicity.

The Touro College and University System

Modélisation analytique de l'essai caractéristique d'emboutissage du godet (Analytical modelling of the characteristic cup deep-drawing test)

Author: AYADI Zoubir
BETTEMBOURG Jean-Paul
NAZAROV Roman
NIKULIN Sergey
NIVOIT Michel
Publication venue: AFM, Maison de la Mécanique, 39/41 rue Louis Blanc - 92400 Courbevoie
Publication date: 27/08/2007

Cup deep drawing holds a special place among characteristic forming tests. This test makes it possible to study material work hardening, the effect of friction, fracture, and wrinkling, and to construct forming limit curves. Developing an analytical model is worthwhile because it can quickly provide information on the strain and stress fields during drawing and show the influence of the parameters. We first carry out a comparative analysis of analytical models from the literature. We then propose an approach based, among other things, on the assumption that the clamping stresses induced in the cup flange by the blank holder are homogeneous. Recall that most other works assume that the sheet thickness remains constant in this zone. The results of the different approaches are compared with experiments, which makes it possible to discuss the validity of the different assumptions.

I-Revues

Activity and stability of PtCo/C electrocatalysts for alcohol oxidation

Author: Aleksey Y. Nikulin
Dmitry D. Mauer
N. Vasilyevich Toporkov
Sergey V. Belenov
Publication venue: Voronezh State University
Publication date: 01/03/2023

This study considers the liquid-phase synthesis of PtCo/C catalysts based on CoOx/C composite carriers with different mass fractions of metals and Pt:Co ratios. The purpose of the article is to study the activity of PtCo/C electrocatalysts of various compositions in the oxidation reactions of methanol and ethanol and to compare their characteristics with those of their commercial PtRu/C and Pt/C analogues. PtCo/C catalysts were synthesised with Pt:Co ratios of 1:1 and 3:1. The specific active surface of the obtained PtCo/C materials was determined, and their activity in the oxidation reactions of methanol and ethanol and their resistance to poisoning by intermediate products of alcohol oxidation were studied. The structural and electrochemical characteristics of the obtained PtCo/C catalysts were studied by X-ray diffraction, cyclic voltammetry, and chronoamperometry. It was found that PtCo/C materials with a mass fraction of platinum close to 20% are the most active and stable as compared to their commercial PtRu/C and Pt/C analogues. The presented results show that PtCo/C catalysts are a promising material for direct alcohol fuel cells.

Directory of Open Access Journals

Establishing of local population, population dynamics and current abundance of Steller sea lion (Eumetopias jubatus) in the Commander Islands

Author: Evgeny G. Mamaev
Olga A. Belonovich
Sergey D. Ryazanov
Sergey V. Fomin
Victor S. Nikulin
Vladimir N. Burkanov
Publication venue: FSBSI TINRO Center
Publication date: 01/03/2014

The time course of the establishment of a local population of Steller sea lions in the Commander Islands, its population dynamics, and its current abundance were studied using literature published since the 1930s and the authors' observations conducted during the breeding seasons of 2008-2011. The local population of Steller sea lions began to form in the early 1960s, when mature females first began to populate the islands, and was fully established in the early 1990s. The whole process of development of the Commander Islands Steller sea lion sub-population took about three decades. The abundance of adult and juvenile sea lions fluctuated highly in 1991-2011 without any statistically significant trend, but the number of pups had a pronounced negative slope, mostly due to three sharp declines in pup production in 2000, 2009, and 2011. A total of about 700 animals of age 1+ inhabit the islands during the breeding season, and about 200 pups are born annually at the present time. This total number of Steller sea lions is close to the mean value for the period after the 1990s. Nevertheless, occasional sharp declines in pup production cause some concern, insofar as they could lead to the extinction of the Steller sea lion sub-population in this area, as occurred in the middle of the 19th century.

Directory of Open Access Journals

Migration of the Individuals

Author: Ermak Mikhail Yu.
Ismailova Larisa Yu.
Kholodov Victor A.
Kosikov Sergey V.
Nikulin Ilya A.
Parfenova Irina A.
Petrov Vasiliy D.
Wolfengagen Viacheslav E.
Publication venue: The Author(s). Published by Elsevier B.V.
Publication date: 31/12/2016

The individuals are modeled by the elements of variable domains. A primitive frame to detect the migration of individuals from domain to domain is proposed. The supporting computational model is based on a separation of individuals into actual, possible, and virtual ones. As shown, this leads to the adoption of a stage-by-stage cognition model with a pair of evolvents to capture the dynamics of the domains: a two-dimensional model. The first evolvent reflects the generation of individuals in a domain and the beginning and cancellation of their existence in a domain. The second evolvent reflects shifts in the properties of the individuals. This unified data model is expected to have applications to a wide range of models in computer science and information technology.

Elsevier - Publisher Connector