Search CORE

52 research outputs found

Sim-to-Real Transfer of Robotic Control with Dynamics Randomization

Author: Abbeel Pieter
Andrychowicz Marcin
Peng Xue Bin
Zaremba Wojciech
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 02/03/2018

Simulations are attractive environments for training agents as they provide an abundant source of data and alleviate certain safety concerns during the training process. But the behaviours developed by agents in simulation are often specific to the characteristics of the simulator. Due to modeling error, strategies that are successful in simulation may not transfer to their real-world counterparts. In this paper, we demonstrate a simple method to bridge this "reality gap". By randomizing the dynamics of the simulator during training, we are able to develop policies that are capable of adapting to very different dynamics, including ones that differ significantly from the dynamics on which the policies were trained. This adaptivity enables the policies to generalize to the dynamics of the real world without any training on the physical system. Our approach is demonstrated on an object pushing task using a robotic arm. Despite being trained exclusively in simulation, our policies are able to maintain a similar level of performance when deployed on a real robot, reliably moving an object to a desired location from random initial configurations. We explore the impact of various design decisions and show that the resulting policies are robust to significant calibration error.
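The core idea in this abstract, resampling the simulator's dynamics for every training episode, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the 1D point-mass "pushing" model, the parameter names and ranges, and the fixed PD controller standing in for a learned policy are hypothetical and are not taken from the paper, which trains its policies with reinforcement learning.

```python
import random
from dataclasses import dataclass


@dataclass
class Dynamics:
    mass: float      # kg (hypothetical parameter)
    friction: float  # viscous damping coefficient (hypothetical parameter)
    dt: float        # integration time step in seconds (hypothetical parameter)


# Hypothetical randomization ranges, chosen only for this toy example.
RANGES = {"mass": (0.25, 4.0), "friction": (0.1, 3.0), "dt": (0.01, 0.05)}


def sample_dynamics(rng):
    """Draw one set of dynamics parameters uniformly from RANGES."""
    return Dynamics(**{name: rng.uniform(lo, hi) for name, (lo, hi) in RANGES.items()})


def rollout(policy, dyn, target, steps=200):
    """Simulate a 1D point mass pushed toward `target`; return the final distance."""
    pos, vel = 0.0, 0.0
    for _ in range(steps):
        force = policy(target - pos, vel)
        acc = (force - dyn.friction * vel) / dyn.mass
        vel += acc * dyn.dt
        pos += vel * dyn.dt
    return abs(target - pos)


def pd_policy(error, vel, kp=2.0, kd=2.0):
    """Fixed PD controller used as a stand-in for a trained policy."""
    return kp * error - kd * vel


def evaluate(policy, episodes, seed=0):
    """Average final error over episodes, with fresh dynamics in each episode."""
    rng = random.Random(seed)
    errors = []
    for _ in range(episodes):
        dyn = sample_dynamics(rng)  # new dynamics every episode
        errors.append(rollout(policy, dyn, target=1.0))
    return sum(errors) / len(errors)


mean_error = evaluate(pd_policy, episodes=20)
```

In a training setup, the call to `rollout` would be replaced by data collection for a policy update, so that the policy only ever sees a distribution of dynamics rather than a single simulator configuration.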

arXiv.org e-Print Archive

Crossref

Overcoming Exploration in Reinforcement Learning with Demonstrations

Author: Abbeel Pieter
Andrychowicz Marcin
McGrew Bob
Nair Ashvin
Zaremba Wojciech
Publication date: 25/02/2018

Exploration in environments with sparse rewards has been a persistent problem in reinforcement learning (RL). Many tasks are natural to specify with a sparse reward, and manually shaping a reward function can result in suboptimal performance. However, finding a non-zero reward is exponentially more difficult with increasing task horizon or action dimensionality. This puts many real-world tasks out of practical reach of RL methods. In this work, we use demonstrations to overcome the exploration problem and successfully learn to perform long-horizon, multi-step robotics tasks with continuous control such as stacking blocks with a robot arm. Our method, which builds on top of Deep Deterministic Policy Gradients and Hindsight Experience Replay, provides an order of magnitude of speedup over RL on simulated robotics tasks. It is simple to implement and makes only the additional assumption that we can collect a small set of demonstrations. Furthermore, our method is able to solve tasks not solvable by either RL or behavior cloning alone, and often ends up outperforming the demonstrator policy. Comment: 8 pages, ICRA 201

arXiv.org e-Print Archive

Crossref

Asymmetric Actor Critic for Image-Based Robot Learning

Author: Abbeel Pieter
Andrychowicz Marcin
Pinto Lerrel
Welinder Peter
Zaremba Wojciech
Publication date: 17/10/2017

Deep reinforcement learning (RL) has proven a powerful technique in many sequential decision-making domains. However, robotics poses many challenges for RL; most notably, training on a physical system can be expensive and dangerous, which has sparked significant interest in learning control policies using a physics simulator. While several recent works have shown promising results in transferring policies trained in simulation to the real world, they often do not fully utilize the advantage of working with a simulator. In this work, we exploit the full state observability in the simulator to train better policies which take as input only partial observations (RGBD images). We do this by employing an actor-critic training algorithm in which the critic is trained on full states while the actor (or policy) gets rendered images as input. We show experimentally on a range of simulated tasks that using these asymmetric inputs significantly improves performance. Finally, we combine this method with domain randomization and show real robot experiments for several tasks like picking, pushing, and moving a block. We achieve this simulation to real world transfer without training on any real world data. Comment: Videos of experiments can be found at http://www.goo.gl/b57WT

arXiv.org e-Print Archive

Crossref

Expression of proteins associated with therapy resistance in rhabdomyosarcoma and neuroblastoma tumour cells

Author: Pituch-Noworolska Anna
Wieczorek Agnieszka
Zaremba Marcin
Publication date: 01/01/2009

The activity of multidrug resistance (MDR) proteins in tumour cells is associated with an increased resistance to therapy and in consequence with a decreased effectiveness of chemotherapy. The majority of MDR molecules belong to a family of ABC (ATP binding cassette) transporters. Neuroblastoma (NBL) and rhabdomyosarcoma (RMS) are common solid tumours of childhood. The response to therapy is better in NBL, worse in RMS, but still unsatisfactory despite surgery and aggressive chemotherapy. Immunohistochemical staining for p-gp (p-glycoprotein), MRP1 (multidrug resistance associated protein 1), BCRP (breast cancer resistance protein) and LRP (lung resistance protein) expression was performed in primary tumour sections of NBL (10 cases) and RMS (10 cases). Different patterns of MDR expression were noted in NBL and RMS. In NBL, MRP1 was expressed in all studied tumours, p-gp and BCRP in only 3 out of 10 tumours, and LRP in 4 cases. The combination of more than one protein was noted in the majority of NBL tumours. In RMS, the expression of 3 or 4 MDR proteins was noted in 9 cases. The high expression of an MDR protein profile in RMS suggests various mechanisms acting simultaneously, which might explain chemotherapy resistance and a low percentage of long-time survival in this tumour.

Jagiellonian University Repository

Domain Randomization and Generative Models for Robotic Grasping

Author: Abbeel Pieter
Andrychowicz Marcin
Biewald Lukas
Duan Rocky
Handa Ankur
Kumar Vikash
McGrew Bob
Schneider Jonas
Tobin Joshua
Welinder Peter
Zaremba Wojciech
Publication date: 03/04/2018

Deep learning-based robotic grasping has made significant progress thanks to algorithmic improvements and increased data availability. However, state-of-the-art models are often trained on as few as hundreds or thousands of unique object instances, and as a result generalization can be a challenge. In this work, we explore a novel data generation pipeline for training a deep neural network to perform grasp planning that applies the idea of domain randomization to object synthesis. We generate millions of unique, unrealistic procedurally generated objects, and train a deep neural network to perform grasp planning on these objects. Since the distribution of successful grasps for a given object can be highly multimodal, we propose an autoregressive grasp planning model that maps sensor inputs of a scene to a probability distribution over possible grasps. This model allows us to sample grasps efficiently at test time (or avoid sampling entirely). We evaluate our model architecture and data generation pipeline in simulation and the real world. We find we can achieve a >90% success rate on previously unseen realistic objects at test time in simulation despite having only been trained on random objects. We also demonstrate an 80% success rate on real-world grasp attempts despite having only been trained on random simulated objects. Comment: 8 pages, 11 figures. Submitted to 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2018)

arXiv.org e-Print Archive

Crossref

“PI OF THE SKY” OFF-LINE EXPERIMENT WITH GLORIA

Author: Cwiek Arkadiusz
Cwiok Mikołaj
Majcher Ariel
Mankiewicz Lech
Zaremba Marcin
Zarnecki Aleksander F.
Publication venue: 'Czech Technical University in Prague - Central Library'
Publication date: 01/06/2014

GLORIA is the first free and open-access network of robotic telescopes in the world. Based on the Web 2.0 environment, amateur and professional users can do research in astronomy by observing with a robotic telescope, and/or analyzing data acquired with GLORIA, or from other free access databases. The GLORIA project develops free standards, protocols and tools for controlling robotic telescopes and related instrumentation, for scheduling observations in the telescope network, and for conducting so-called off-line experiments based on the analysis of astronomical data. This contribution summarizes the implementation and results from the first research-level off-line demonstrator experiment implemented in GLORIA, which was based on data collected with the “Pi of the Sky” telescope in Chile.

Directory of Open Access Journals

CTU Open Journal Systems (Czech Technical University, Prague / České vysoké učení technické v Praze)

Evaluation of the In vitro cytotoxic activity of caffeic acid derivatives and liposomal formulation against pancreatic cancer cell lines

Author: Cybulski Marcin
Gubernator Jerzy
Jaromin Anna
Sidoryk Katarzyna
Zagórska Agnieszka
Zaremba-Czogalla Magdalena
Publication venue: 'MDPI AG'
Publication date: 01/01/2020

Pancreatic cancer belongs to the most aggressive group of cancers, with very poor prognosis. Therefore, there is an important need to find more potent drugs that could deliver an improved therapeutic approach. In the current study we searched for selective and effective caffeic acid derivatives. For this purpose, we analyzed twelve compounds and evaluated their in vitro cytotoxic activity against two human pancreatic cancer cell lines, along with a control, normal fibroblast cell line, by the classic MTT assay. Six out of twelve tested caffeic acid derivatives showed a desirable effect. To improve the therapeutic efficacy of such active compounds, we developed a formulation where caffeic acid derivative (7) was encapsulated into liposomes composed of soybean phosphatidylcholine and DSPE-PEG2000. Subsequently, we analyzed the properties of this formulation in terms of basic physical parameters (such as size, zeta potential, stability at 4 °C and morphology), hemolytic and cytotoxic activity and cellular uptake. Overall, the liposomal formulation was found to be stable, non-hemolytic and had activity against pancreatic cancer cells (IC50 19.44 µM and 24.3 µM, towards AsPC1 and BxPC3 cells, respectively) with less toxicity against normal fibroblasts. This could represent a promising alternative to currently available treatment options.

Multidisciplinary Digital Publishing Institute

Jagiellonian University Repository

DEAD TIME MEASUREMENT BY THE TWO-SOURCE METHOD: OPTIMIZATION OF THE MEASUREMENT TIME DIVISION

Author: Brzeski Piotr
Domański Grzegorz
Dziewiecki Michał
Konarzewski Bogumił
Kurjata Robert
Marzec Janusz
Rychter Andrzej
Smolik Waldemar
Szabatin Roman
Zaremba Krzysztof
Ziembicki Marcin
Publication venue: 'Index Copernicus'
Publication date: 01/01/2018

The article presents an analysis of dead time measurement by the two-source method for a non-paralyzable detector. It determines the optimal division of the count-rate measurement time between the single-source and two-source measurements. The results can be used to optimize dead time measurement in systems that count photons or particles.
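For orientation, the two-source method itself can be sketched in a few lines. This is not the article's optimization of the measurement time division; it is only the standard closed-form estimate for a non-paralyzable detector, assuming zero background and exactly known count rates. With observed rates m1, m2 (each source alone) and m12 (both together), requiring the true rates to add up gives a quadratic in the dead time τ, whose physical root is used below. The function names and numeric values are hypothetical.

```python
import math


def observed_rate(true_rate, tau):
    """Non-paralyzable detector model: m = n / (1 + n * tau)."""
    return true_rate / (1.0 + true_rate * tau)


def dead_time_two_source(m1, m2, m12):
    """Estimate the dead time from observed rates, assuming zero background.

    Solves m1*m2*m12*tau^2 - 2*m1*m2*tau + (m1 + m2 - m12) = 0
    and returns the smaller (physical) root.
    """
    radicand = m1 * m2 * (m12 - m1) * (m12 - m2)
    return (m1 * m2 - math.sqrt(radicand)) / (m1 * m2 * m12)


# Synthetic check with made-up numbers: generate observed rates from a
# known dead time, then recover it.
tau_true = 1e-6          # seconds
n1, n2 = 1.0e4, 2.0e4    # true count rates in counts per second
m1 = observed_rate(n1, tau_true)
m2 = observed_rate(n2, tau_true)
m12 = observed_rate(n1 + n2, tau_true)
tau_est = dead_time_two_source(m1, m2, m12)
```

In a real measurement each rate carries counting statistics that depend on how long it was measured, which is why the split of total measurement time between the single-source and two-source runs affects the uncertainty of the estimate.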

Biblioteka Nauki - repozytorium artykułów

Crossref

Lublin University of Technology Journals