Deterministic learning enhanced neural network control of unmanned helicopter
In this article, a neural network-based tracking controller is developed for an unmanned helicopter system with guaranteed global stability in the presence of uncertain system dynamics. Due to the coupling and modeling uncertainties of helicopter systems, neural network approximation techniques are employed to compensate for the unknown dynamics of each subsystem. To extend the semiglobal stability achieved by conventional neural control to global stability, a switching mechanism is also integrated into the control design, so that the resulting neural controller remains valid regardless of the initial conditions or the range of the state variables. In addition, deterministic learning is applied to the neural network learning control, so that the adaptive neural networks can store the learned knowledge, which can be reused to construct a neural network controller with improved control performance. Simulation studies are carried out on a helicopter model to illustrate the effectiveness of the proposed control design.
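The abstract combines neural network approximation of unknown dynamics with a switching mechanism that falls back to a robust term outside the network's valid region. The following is a minimal sketch of that idea, not the paper's controller: the RBF centers, gains, bounds, and the single-state setting are all illustrative assumptions.

```python
import numpy as np

# Minimal sketch (not the paper's controller): an RBF network approximates an
# unknown subsystem dynamic f(x), and the control switches to a robust bounding
# term whenever the state leaves the region where the approximation is valid.

CENTERS = np.linspace(-2.0, 2.0, 11)   # RBF centers covering the valid region
WIDTH = 0.5                            # common RBF width
K_E = 5.0                              # tracking-error feedback gain
GAMMA = 10.0                           # adaptation gain
SIGMA = 0.01                           # leakage (sigma-modification) term
X_VALID = 2.0                          # |x| bound of the NN approximation region

def rbf(x):
    """Gaussian regressor vector S(x)."""
    return np.exp(-((x - CENTERS) ** 2) / (2.0 * WIDTH ** 2))

def control_step(x, x_ref, w_hat, dt, f_bound=10.0):
    """One step of a switched neural/robust tracking law (illustrative)."""
    e = x - x_ref
    if abs(x) <= X_VALID:
        # Inside the approximation region: cancel f(x) with the RBF estimate
        # and adapt the weights; converged weights can later be stored and
        # reused, in the spirit of deterministic learning.
        s = rbf(x)
        u = -K_E * e - w_hat @ s
        w_hat = w_hat + dt * (GAMMA * e * s - SIGMA * GAMMA * w_hat)
    else:
        # Outside the region: fall back to a robust bounding controller so the
        # closed loop remains stable globally.
        u = -K_E * e - f_bound * np.sign(e)
    return u, w_hat
```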
UniWorld: Autonomous Driving Pre-training via World Models
In this paper, we draw inspiration from Alberto Elfes' pioneering work in 1989, in which he introduced the concept of the occupancy grid as a world model for robots. We imbue the robot with a spatial-temporal world model, termed UniWorld, to perceive its surroundings and predict the future behavior of other participants. UniWorld first predicts 4D geometric occupancy as the world model in a foundational pre-training stage and is subsequently fine-tuned on downstream tasks. UniWorld can estimate missing information about the world state and predict plausible future states of the world. Moreover, UniWorld's pre-training process is label-free, enabling massive amounts of image-LiDAR pairs to be used to build a foundation model. The proposed unified pre-training framework demonstrates promising results on key tasks such as motion prediction, multi-camera 3D object detection, and surrounding semantic scene completion. Compared to monocular pre-training methods on the nuScenes dataset, UniWorld shows a significant improvement of about 1.5% in IoU for motion prediction, 2.0% in mAP and 2.0% in NDS for multi-camera 3D object detection, as well as a 3% increase in mIoU for surrounding semantic scene completion. By adopting our unified pre-training method, a 25% reduction in 3D training annotation costs can be achieved, offering significant practical value for the deployment of real-world autonomous driving. Code is publicly available at https://github.com/chaytonmin/UniWorld.
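The abstract describes label-free pre-training in which 4D occupancy predicted from images is supervised by occupancy voxelized from the paired LiDAR sweeps. The sketch below illustrates one such pre-training step; `image_encoder`, `occupancy_head`, and the tensor shapes are hypothetical stand-ins, not the actual UniWorld modules (those are in the paper and repository).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative sketch only: one label-free pre-training step that supervises a
# 4D (space + time) occupancy prediction with targets voxelized from raw LiDAR.

def pretrain_step(image_encoder: nn.Module,
                  occupancy_head: nn.Module,
                  optimizer: torch.optim.Optimizer,
                  multi_view_images: torch.Tensor,   # (B, T, N_cam, 3, H, W)
                  lidar_occupancy: torch.Tensor):    # (B, T, X, Y, Z) in {0, 1}
    """Predict occupancy from images; voxelized LiDAR provides free labels."""
    feats = image_encoder(multi_view_images)         # fused spatio-temporal features
    logits = occupancy_head(feats)                   # (B, T, X, Y, Z) occupancy logits
    # Binary occupancy loss against the voxelized LiDAR "world state".
    loss = F.binary_cross_entropy_with_logits(logits, lidar_occupancy.float())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```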
Volume Transfer: A New Design Concept for Fabric-Based Pneumatic Exosuits
The fabric-based pneumatic exosuit is now a hot research topic because it is lighter and softer than traditional exoskeletons. Existing research has focused more on the mechanical properties of the exosuit (e.g., torque and speed) and less on its wearability (e.g., appearance and comfort). This work presents a new design concept for fabric-based pneumatic exosuits, Volume Transfer, which means transferring the volume of the pneumatic actuators that would protrude beyond the garment's profile to the inside. This allows for a concealed appearance and a larger stress area while maintaining adequate torque. To verify this concept, we develop a fabric-based pneumatic exosuit for knee extension assistance. Its profile is only 26 mm and its stress area wraps around almost half of the leg. We use a mathematical model and simulation to determine the parameters of the exosuit, avoiding multiple iterations of the prototype. Experimental results show that the exosuit can generate a torque of 7.6 N·m at a pressure of 90 kPa and produce a significant reduction in the electromyography activity of the knee extensor muscles. We believe that Volume Transfer could be used widely in future fabric-based pneumatic exosuit designs to achieve a significant improvement in wearability.
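The abstract mentions a mathematical model used to size the exosuit, but does not reproduce it. As a rough illustration only, a first-order estimate treats the inflated actuator as pressing on the limb with force of roughly pressure times effective contact area, acting through a moment arm about the knee; the area and moment-arm values below are assumptions chosen purely to show the order of magnitude, not figures from the paper.

```python
# Simplified first-order estimate (NOT the paper's model).

def knee_extension_torque(pressure_pa: float,
                          effective_area_m2: float,
                          moment_arm_m: float) -> float:
    """Torque [N*m] ~= P * A_eff * r for a pressurized fabric actuator."""
    return pressure_pa * effective_area_m2 * moment_arm_m

# Example: 90 kPa with an assumed 28 cm^2 effective area and an assumed 3 cm
# moment arm gives roughly 7.6 N*m, the same order as the reported torque.
torque = knee_extension_torque(90e3, 28e-4, 0.03)
print(f"Estimated knee extension torque: {torque:.1f} N*m")
```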
Occupancy-MAE: Self-supervised Pre-training Large-scale LiDAR Point Clouds with Masked Occupancy Autoencoders
Current perception models in autonomous driving heavily rely on large-scale
labelled 3D data, which is both costly and time-consuming to annotate. This
work proposes a solution to reduce the dependence on labelled 3D training data
by leveraging pre-training on large-scale unlabeled outdoor LiDAR point clouds
using masked autoencoders (MAE). While existing masked point autoencoding
methods mainly focus on small-scale indoor point clouds or pillar-based
large-scale outdoor LiDAR data, our approach introduces a new self-supervised
masked occupancy pre-training method called Occupancy-MAE, specifically
designed for voxel-based large-scale outdoor LiDAR point clouds. Occupancy-MAE
takes advantage of the gradually sparse voxel occupancy structure of outdoor
LiDAR point clouds and incorporates a range-aware random masking strategy and a
pretext task of occupancy prediction. By randomly masking voxels based on their
distance to the LiDAR and predicting the masked occupancy structure of the
entire 3D surrounding scene, Occupancy-MAE encourages the extraction of
high-level semantic information to reconstruct the masked voxels using only a
small number of visible voxels. Extensive experiments demonstrate the
effectiveness of Occupancy-MAE across several downstream tasks. For 3D object
detection, Occupancy-MAE reduces the labelled data required for car detection
on the KITTI dataset by half and improves small object detection by
approximately 2% in AP on the Waymo dataset. For 3D semantic segmentation,
Occupancy-MAE outperforms training from scratch by around 2% in mIoU. For
multi-object tracking, Occupancy-MAE enhances training from scratch by
approximately 1% in terms of AMOTA and AMOTP. Code is publicly available at https://github.com/chaytonmin/Occupancy-MAE.
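The key ingredient named in the abstract is a range-aware random masking strategy: voxels are masked according to their distance from the LiDAR before the occupancy-prediction pretext task. The sketch below illustrates that idea; the masking ratios and range split are assumptions, not the paper's settings.

```python
import numpy as np

# Illustrative sketch of range-aware random masking (ratios are assumptions):
# distant voxels are already sparse, so they are masked less aggressively
# than near-range voxels.

def range_aware_mask(voxel_coords, voxel_size=0.2, near_ratio=0.9,
                     far_ratio=0.5, range_split_m=30.0, rng=None):
    """Return a boolean array: True = voxel is masked (hidden from the encoder).

    voxel_coords: (N, 3) integer indices of occupied voxels.
    """
    rng = rng or np.random.default_rng()
    # Horizontal distance of each occupied voxel to the LiDAR origin (meters).
    dist = np.linalg.norm(voxel_coords[:, :2] * voxel_size, axis=1)
    ratio = np.where(dist < range_split_m, near_ratio, far_ratio)
    return rng.random(len(voxel_coords)) < ratio

# The pretext task then asks the decoder to predict the occupancy of the whole
# 3D scene from the small set of voxels left visible by this mask.
```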
Occ-BEV: Multi-Camera Unified Pre-training via 3D Scene Reconstruction
Multi-camera 3D perception has emerged as a prominent research field in
autonomous driving, offering a viable and cost-effective alternative to
LiDAR-based solutions. However, existing multi-camera algorithms primarily rely
on monocular image pre-training, which overlooks the spatial and temporal
correlations among different camera views. To address this limitation, we
propose a novel multi-camera unified pre-training framework called Occ-BEV,
which involves initially reconstructing the 3D scene as the foundational stage
and subsequently fine-tuning the model on downstream tasks. Specifically, a 3D
decoder is designed to leverage Bird's Eye View (BEV) features from multi-view images to predict 3D geometric occupancy, enabling the model to capture a more comprehensive understanding of the 3D environment. One
significant advantage of Occ-BEV is that it can utilize a vast amount of
unlabeled image-LiDAR pairs for pre-training. The proposed multi-camera unified
pre-training framework demonstrates promising results in key tasks such as
multi-camera 3D object detection and semantic scene completion. When compared
to monocular pre-training methods on the nuScenes dataset, Occ-BEV demonstrates
a significant improvement of 2.0% in mAP and 2.0% in NDS for 3D object
detection, as well as a 0.8% increase in mIoU for semantic scene completion. Code is publicly available at https://github.com/chaytonmin/Occ-BEV.
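The abstract describes a 3D decoder that turns BEV features into a geometric occupancy prediction for the reconstruction pretext task. The following is an illustrative sketch of one way to do that lifting; the layer choices, channel counts, and shapes are assumptions, not the Occ-BEV architecture.

```python
import torch
import torch.nn as nn

# Illustrative sketch (shapes and layers are assumptions): lift BEV features to
# a 3D volume and decode them into per-voxel occupancy logits.

class BEVOccupancyDecoder(nn.Module):
    def __init__(self, bev_channels: int = 256, z_bins: int = 16, hidden: int = 64):
        super().__init__()
        self.z_bins = z_bins
        self.hidden = hidden
        # Expand each BEV cell into a column of z_bins voxels with `hidden` channels.
        self.lift = nn.Conv2d(bev_channels, z_bins * hidden, kernel_size=1)
        # A small 3D convolutional head predicts one occupancy logit per voxel.
        self.head = nn.Sequential(
            nn.Conv3d(hidden, hidden, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(hidden, 1, kernel_size=1),
        )

    def forward(self, bev: torch.Tensor) -> torch.Tensor:
        # bev: (B, C, H, W) -> occupancy logits: (B, Z, H, W)
        b, _, h, w = bev.shape
        vol = self.lift(bev).view(b, self.hidden, self.z_bins, h, w)
        return self.head(vol).squeeze(1)
```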