Learning Real-world Autonomous Navigation by Self-Supervised Environment Synthesis
Machine learning approaches have recently enabled autonomous navigation for mobile robots in a data-driven manner. Since most existing learning-based navigation systems are trained with data generated in artificially created training environments, robots deployed at scale in the real world will inevitably encounter unseen scenarios that fall outside the training distribution and therefore suffer poor real-world performance. On the other hand, training directly in the real world is generally unsafe and inefficient. To address this issue, we introduce Self-supervised Environment Synthesis (SES), in which autonomous mobile robots, after real-world deployments that meet safety and efficiency requirements, use their deployment experience to reconstruct navigation scenarios and synthesize representative training environments in simulation. Training in these synthesized environments improves future real-world performance. The effectiveness of SES at synthesizing representative simulation environments and improving real-world navigation performance is evaluated via a large-scale deployment in a high-fidelity, realistic simulator and a small-scale deployment on a physical robot.
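The deploy-reconstruct-synthesize-retrain cycle described above can be summarized in a short sketch. The following Python snippet is purely illustrative: all function names and the perturbation-based synthesis are hypothetical stand-ins, not the authors' actual pipeline.

```python
# Minimal sketch of an SES-style cycle. Everything here is a hypothetical
# stand-in for the paper's pipeline, shown only to fix the overall structure.
import numpy as np

def reconstruct_scenarios(deployment_logs):
    """Recover navigation scenarios from logged deployment experience
    (stand-in: extract the recorded obstacle layouts)."""
    return [log["scan"] for log in deployment_logs]

def synthesize_environments(scenarios, n_envs=10):
    """Synthesize representative simulation environments from real scenarios
    (stand-in: resample recorded layouts with small perturbations)."""
    rng = np.random.default_rng(0)
    picks = rng.choice(len(scenarios), size=n_envs)
    return [scenarios[i] + rng.normal(0.0, 0.05, size=np.shape(scenarios[i]))
            for i in picks]

def ses_iteration(policy, deployment_logs, train_fn):
    """One SES cycle: reconstruct scenarios, synthesize environments,
    retrain in simulation, and return the improved policy."""
    scenarios = reconstruct_scenarios(deployment_logs)
    sim_envs = synthesize_environments(scenarios)
    return train_fn(policy, sim_envs)  # policy for the next deployment
```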
Benchmarking Reinforcement Learning Techniques for Autonomous Navigation
Deep reinforcement learning (RL) has brought many successes to autonomous robot navigation. However, important limitations still prevent real-world use of RL-based navigation systems: for example, most learning approaches lack safety guarantees, and learned navigation systems may not generalize well to unseen environments. Despite a variety of recent learning techniques that tackle these challenges in general, the lack of an open-source benchmark and of reproducible learning methods specifically for autonomous navigation makes it difficult for roboticists to choose which learning methods to use for their mobile robots, and for learning researchers to identify the current shortcomings of general learning methods for autonomous navigation. In this paper, we identify four major desiderata for applying deep RL approaches to autonomous navigation: (D1) reasoning under uncertainty, (D2) safety, (D3) learning from limited trial-and-error data, and (D4) generalization to diverse and novel environments. We then explore four major classes of learning techniques, each aimed at achieving one or more of the four desiderata: memory-based neural network architectures (D1), safe RL (D2), model-based RL (D2, D3), and domain randomization (D4). By deploying these learning techniques in a new open-source large-scale navigation benchmark and in real-world environments, we perform a comprehensive study aimed at establishing to what extent these techniques can achieve the desiderata for RL-based navigation systems.
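As a concrete illustration of one of the four technique classes, here is a minimal sketch of domain randomization (D4). It assumes a generic environment exposing reset()/step() and a hypothetical set_params hook on the simulator; it is not the benchmark's actual code.

```python
# Illustrative domain-randomization wrapper: environment parameters are
# resampled at every reset, so the policy trains on a distribution of
# environments rather than a single one (targets desideratum D4).
import random

class DomainRandomizationWrapper:
    def __init__(self, env, param_ranges):
        self.env = env
        self.param_ranges = param_ranges  # e.g. {"friction": (0.5, 1.5)}

    def reset(self):
        sampled = {name: random.uniform(lo, hi)
                   for name, (lo, hi) in self.param_ranges.items()}
        self.env.set_params(sampled)  # assumed hook on the simulator
        return self.env.reset()

    def step(self, action):
        return self.env.step(action)
```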
Machine Learning Methods for Local Motion Planning: A Study of End-to-End vs. Parameter Learning
This conference paper was featured in the October 2021 Good Systems Network Digest (Office of the VP for Research).
Deep Generative Models on 3D Representations: A Survey
Generative models, an important family of statistical models, aim to learn the observed data distribution and generate new instances from it. Along with the rise of neural networks, deep generative models, such as variational autoencoders (VAEs) and generative adversarial networks (GANs), have made tremendous progress in 2D image synthesis. Recently, researchers have shifted their attention from the 2D space to the 3D space, since 3D data better aligns with our physical world and hence holds great practical potential. However, unlike a 2D image, which naturally has an efficient representation (i.e., the pixel grid), representing 3D data poses far greater challenges. Concretely, an ideal 3D representation should be expressive enough to model shapes and appearances in detail, and efficient enough to model high-resolution data quickly and with low memory cost. However, existing 3D representations, such as point clouds, meshes, and the recent neural fields, usually fail to meet these requirements simultaneously. In this survey, we thoroughly review the development of 3D generation, including 3D shape generation and 3D-aware image synthesis, from the perspectives of both algorithms and, more importantly, representations. We hope that our discussion helps the community track the evolution of this field and sparks innovative ideas to advance this challenging task.
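To make the representation trade-off concrete, below is a minimal sketch of a neural field: a coordinate MLP that maps a 3D point to density and color and can be queried at arbitrary resolution. It is purely illustrative; real 3D generative models add positional encodings, latent conditioning, and volume rendering.

```python
# A toy neural field: unlike a pixel/voxel grid, the representation is a
# continuous function of 3D coordinates, queryable at any resolution.
import torch
import torch.nn as nn

class NeuralField(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # (density, r, g, b)
        )

    def forward(self, xyz):                # xyz: (N, 3) query points
        out = self.mlp(xyz)
        density = torch.relu(out[:, :1])   # non-negative density
        color = torch.sigmoid(out[:, 1:])  # colors in [0, 1]
        return density, color

# Continuous query at arbitrary sample locations -- no fixed grid:
field = NeuralField()
density, color = field(torch.rand(1024, 3))
```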
Neuromorphic Incremental on-chip Learning with Hebbian Weight Consolidation
As next-generation implantable brain-machine interfaces (BMIs) become pervasive on edge devices, the ability to incrementally learn new tasks in a biologically plausible way is urgently needed for neuromorphic chips. Due to the inherent characteristics of their structure, spiking neural networks are naturally well-suited for BMI chips. Here we propose Hebbian Weight Consolidation (HWC), together with an on-chip learning framework (MLoC). HWC selectively masks modifications to the synapses that serve previous tasks, so that new knowledge from subsequent tasks can be stored while old knowledge is preserved. Leveraging the bio-plasticity of dendritic spines, the intrinsic self-organizing nature of Hebbian Weight Consolidation aligns naturally with the incremental learning paradigm, facilitating robust learning outcomes. By reading out spikes layer by layer and performing back-propagation on an external micro-controller unit, MLoC can efficiently accomplish on-chip learning. Experiments show that our HWC algorithm outperforms the lower bound without incremental learning by up to 23.19%, particularly in the more challenging monkey behavior decoding scenarios. Taking on-chip computation on the Synsense Speck 2e chip into account, our proposed algorithm still exhibits an improvement of 11.06%. This study demonstrates the feasibility of employing incremental learning for high-performance neural signal decoding in next-generation brain-machine interfaces.
Comment: 12 pages, 6 figures
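The core masking idea can be sketched as follows. The importance measure below is a placeholder (the paper derives importance from Hebbian activity statistics), and the function names are hypothetical.

```python
# Illustrative weight-consolidation masking: synapses judged important for
# previous tasks have their updates zeroed, so later tasks cannot
# overwrite them while the remaining synapses absorb new knowledge.
import torch

def consolidation_mask(importance, keep_fraction=0.2):
    """Mask = 0 for the top `keep_fraction` most important synapses,
    1 elsewhere. `importance` is a per-weight score (placeholder here)."""
    k = int(importance.numel() * keep_fraction)
    thresh = importance.flatten().kthvalue(importance.numel() - k).values
    return (importance <= thresh).float()

def masked_update(weight, grad, mask, lr=1e-3):
    """Apply a gradient step only to unmasked (non-consolidated) synapses."""
    weight.data -= lr * grad * mask
```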
Improving 3D-aware Image Synthesis with A Geometry-aware Discriminator
3D-aware image synthesis aims at learning a generative model that can render photo-realistic 2D images while capturing decent underlying 3D shapes. A popular solution is to adopt a generative adversarial network (GAN) and replace the generator with a 3D renderer, commonly using volume rendering with a neural radiance field (NeRF). Despite the advances in synthesis quality, existing methods fail to obtain even moderate 3D shapes. We argue that, considering the two-player game in the GAN formulation, making only the generator 3D-aware is not enough. In other words, displacing the generative mechanism only offers the capability, but not the guarantee, of producing 3D-aware images, because the generator's supervision primarily comes from the discriminator. To address this issue, we propose GeoD, which learns a geometry-aware discriminator to improve 3D-aware GANs. Concretely, besides differentiating real and fake samples in the 2D image space, the discriminator is additionally asked to derive geometry information from its inputs, which is then applied as guidance for the generator. Such a simple yet effective design facilitates learning substantially more accurate 3D shapes. Extensive experiments on various generator architectures and training datasets verify the superiority of GeoD over state-of-the-art alternatives. Moreover, our approach serves as a general framework, such that a more capable discriminator (i.e., one with a third task of novel view synthesis beyond domain classification and geometry extraction) can further assist the generator toward better multi-view consistency.
Comment: Accepted by NeurIPS 2022. Project page: https://vivianszf.github.io/geo
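A minimal sketch of the GeoD idea, under assumed shapes and a hypothetical depth-based geometry head: the discriminator predicts geometry alongside realness, and the generator is penalized when its rendered geometry disagrees with that estimate. This is illustrative, not the paper's implementation.

```python
# Illustrative geometry-aware discriminator: one head for real/fake
# classification, an auxiliary head that estimates geometry (here, a
# coarse depth map) used to guide the 3D-aware generator.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GeometryAwareDiscriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
        )
        self.realness = nn.Linear(64, 1)       # standard GAN head
        self.depth_head = nn.Conv2d(64, 1, 1)  # auxiliary geometry head

    def forward(self, img):
        feat = self.backbone(img)
        logit = self.realness(feat.mean(dim=(2, 3)))
        depth = self.depth_head(feat)          # coarse depth estimate
        return logit, depth

def geometry_guidance_loss(rendered_depth, disc_depth):
    """Generator-side loss: match the renderer's depth to the
    discriminator's geometry estimate (resized to the same scale)."""
    target = F.interpolate(disc_depth, size=rendered_depth.shape[-2:])
    return F.l1_loss(rendered_depth, target)
```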
DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model
We propose DMV3D, a novel 3D generation approach that uses a transformer-based 3D large reconstruction model to denoise multi-view diffusion. Our reconstruction model incorporates a triplane NeRF representation and can denoise noisy multi-view images via NeRF reconstruction and rendering, achieving single-stage 3D generation in 30 seconds on a single A100 GPU. We train DMV3D on large-scale multi-view image datasets of highly diverse objects using only image reconstruction losses, without access to 3D assets. We demonstrate state-of-the-art results on the single-image reconstruction problem, where probabilistic modeling of unseen object parts is required to generate diverse reconstructions with sharp textures. We also show high-quality text-to-3D generation results that outperform previous 3D diffusion models. Our project website is at https://justimyhxu.github.io/projects/dmv3d/.
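A minimal sketch of the denoising step described above, where 3D reconstruction plus rendering plays the role of the diffusion denoiser. ReconstructionModel, render, and the x0-parameterized noise recovery are illustrative stand-ins for the paper's transformer-based triplane-NeRF reconstructor, assuming standard DDPM notation.

```python
# One illustrative denoising step: reconstruct a 3D representation from
# noisy multi-view images, re-render the views as the clean (x0)
# prediction, and recover the implied noise from the DDPM relation
# x_t = sqrt(a_bar) * x0 + sqrt(1 - a_bar) * eps.
import torch

def denoise_step(recon_model, render, noisy_views, cameras,
                 alpha_bar_t: torch.Tensor):
    triplanes = recon_model(noisy_views, cameras)  # images -> triplane NeRF
    x0_pred = render(triplanes, cameras)           # re-render input views
    eps = (noisy_views - alpha_bar_t.sqrt() * x0_pred) \
          / (1 - alpha_bar_t).sqrt()
    return x0_pred, eps
```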
Gaussian Shell Maps for Efficient 3D Human Generation
Efficient generation of 3D digital humans is important in several industries, including virtual reality, social media, and cinematic production. 3D generative adversarial networks (GANs) have demonstrated state-of-the-art (SOTA) quality and diversity for generated assets. Current 3D GAN architectures, however, typically rely on volume representations, which are slow to render, thereby hampering GAN training and requiring multi-view-inconsistent 2D upsamplers. Here, we introduce Gaussian Shell Maps (GSMs) as a framework that connects SOTA generator network architectures with emerging 3D Gaussian rendering primitives through an articulable multi-shell-based scaffold. In this setting, a CNN generates a 3D texture stack with features that are mapped to the shells. The shells represent inflated and deflated versions of a template surface of a digital human in a canonical body pose. Instead of rasterizing the shells directly, we sample 3D Gaussians on the shells, whose attributes are encoded in the texture features. These Gaussians are rendered efficiently and differentiably. The ability to articulate the shells is important during GAN training and, at inference time, for deforming a body into arbitrary user-defined poses. Our efficient rendering scheme bypasses the need for view-inconsistent upsamplers and achieves high-quality, multi-view consistent renderings at native resolution. We demonstrate that GSMs successfully generate 3D humans when trained on single-view datasets, including SHHQ and DeepFashion.
Comment: Project page: https://rameenabdal.github.io/GaussianShellMaps
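A minimal sketch of the shell construction and attribute lookup described above, with hypothetical names and shapes; the paper's actual implementation differs in details such as Gaussian placement and rendering.

```python
# Illustrative Gaussian Shell Maps setup: shells are inflated/deflated
# copies of a template surface, and a generated texture stack supplies
# per-Gaussian attributes (one texture slice per shell).
import torch

def build_shells(template_verts, template_normals,
                 offsets=(-0.02, 0.0, 0.02)):
    """Offset the template surface along vertex normals to form shells."""
    return [template_verts + d * template_normals for d in offsets]

def gaussians_from_texture(shells, texture_stack):
    """Read per-Gaussian attributes from CNN texture features;
    each `tex` is a (V, C) feature slice for one shell (C >= 7 assumed)."""
    gaussians = []
    for verts, tex in zip(shells, texture_stack):
        gaussians.append({
            "mean": verts,                          # Gaussian centers on shell
            "color": torch.sigmoid(tex[:, 0:3]),    # RGB in [0, 1]
            "opacity": torch.sigmoid(tex[:, 3:4]),  # alpha in [0, 1]
            "scale": torch.exp(tex[:, 4:7]),        # positive scales
        })
    return gaussians
```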