Doubly Robust Estimation under Covariate-Induced Dependent Left Truncation
In prevalent cohort studies with follow-up, the time-to-event outcome is
subject to left truncation leading to selection bias. For estimation of the
distribution of time-to-event, conventional methods adjusting for left
truncation tend to rely on the (quasi-)independence assumption that the
truncation time and the event time are "independent" on the observed region.
This assumption is violated when there is dependence between the truncation
time and the event time possibly induced by measured covariates. Inverse
probability of truncation weighting leveraging covariate information can be
used in this case, but it is sensitive to misspecification of the truncation
model. In this work, we apply semiparametric theory to derive the efficient
influence curve of an expected (arbitrarily transformed) survival time in the
presence of covariate-induced dependent left truncation. We then use it to
construct estimators that are shown to enjoy double-robustness properties. Our
work represents the first attempt to construct doubly robust estimators in the
presence of left truncation, which does not fall under the established
framework of coarsened data where doubly robust approaches are developed. We
provide technical conditions for the asymptotic properties that appear to not
have been carefully examined in the literature for time-to-event data, and
study the estimators via extensive simulation. We apply the estimators to two
real data sets with different right-censoring patterns.
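The inverse-probability-of-truncation weighting that the abstract contrasts with can be illustrated with a minimal sketch (hypothetical and simplified, not the authors' doubly robust estimator): each observed subject is reweighted by the inverse of its estimated probability of escaping truncation, undoing the selection bias of the prevalent cohort.

```python
import numpy as np

def ipw_truncation_mean(t, prob_untruncated):
    """Inverse-probability-of-truncation-weighted mean of event times.

    t                : observed event times (left-truncated sample)
    prob_untruncated : estimated P(Q <= t_i | x_i) for each subject, i.e.
                       the probability of being observed given covariates
                       (here assumed supplied by a fitted truncation model)
    """
    w = 1.0 / prob_untruncated          # weights undo the selection bias
    return np.sum(w * t) / np.sum(w)    # normalized weighted mean
```

With equal selection probabilities the estimator reduces to the plain sample mean; unequal probabilities up-weight subjects that were unlikely to be observed. The doubly robust construction in the paper additionally protects against misspecification of this truncation model.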
3D Random Occlusion and Multi-Layer Projection for Deep Multi-Camera Pedestrian Localization
Although deep-learning based methods for monocular pedestrian detection have
made great progress, they are still vulnerable to heavy occlusions. Using
multi-view information fusion is a potential solution but has limited
applications, due to the lack of annotated training samples in existing
multi-view datasets, which increases the risk of overfitting. To address this
problem, a data augmentation method is proposed that randomly generates 3D
cylinder occlusions on the ground plane, of the average size of a pedestrian,
and projects them to multiple views to reduce overfitting during training.
Moreover, the feature map of each view is
projected to multiple parallel planes at different heights, by using
homographies, which allows the CNNs to fully utilize the features across the
height of each pedestrian to infer the locations of pedestrians on the ground
plane. The proposed 3DROM method greatly improves performance in comparison
with state-of-the-art deep-learning based methods for multi-view pedestrian
detection. Code is available at https://github.com/xjtlu-cvlab/3DROM.
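Both the occlusion augmentation and the multi-layer projection rest on mapping ground-plane points into each camera view with a homography. A minimal sketch (hypothetical helper names; the actual 3DROM pipeline projects full cylinders and feature maps at several heights):

```python
import numpy as np

def project_ground_point(H, xy):
    """Map a ground-plane point (x, y) into image coordinates with a 3x3
    homography H: lift to homogeneous coordinates, then perspective-divide."""
    p = H @ np.array([xy[0], xy[1], 1.0])
    return p[:2] / p[2]

def sample_occluder_footprint(rng, x_range, y_range, radius=0.3):
    """Sample the ground-plane footprint of one synthetic cylinder occluder.
    The radius is an illustrative average-pedestrian value in metres."""
    return rng.uniform(*x_range), rng.uniform(*y_range), radius
```

In the augmentation, a footprint sampled this way is projected through every camera's homography, so the same synthetic occluder appears consistently in all views.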
Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning
Offline reinforcement learning (RL) aims to find a near-optimal policy using
pre-collected datasets. In real-world scenarios, data collection could be
costly and risky; therefore, offline RL becomes particularly challenging when
the in-domain data is limited. Given recent advances in Large Language Models
(LLMs) and their few-shot learning prowess, this paper introduces
Language Models for Motion Control (LaMo), a
general framework based on Decision Transformers to effectively use pre-trained
Language Models (LMs) for offline RL. Our framework highlights four crucial
components: (1) Initializing Decision Transformers with sequentially
pre-trained LMs, (2) employing the LoRA fine-tuning method, in contrast to
full-weight fine-tuning, to combine the pre-trained knowledge from LMs and
in-domain knowledge effectively, (3) using the non-linear MLP transformation
instead of linear projections, to generate embeddings, and (4) integrating an
auxiliary language prediction loss during fine-tuning to stabilize the LMs and
retain their original language abilities. Empirical results indicate that the
framework achieves state-of-the-art performance in sparse-reward tasks
and closes the gap between value-based offline RL methods and decision
transformers in dense-reward tasks. In particular, our method demonstrates
superior performance in scenarios with limited data samples. Comment: 24 pages, 16 tables.
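Component (2), LoRA-style fine-tuning, can be sketched in a few lines of numpy (an illustrative toy, not the paper's implementation, which wraps the projections inside a pre-trained Transformer): the frozen weight W gains a trainable low-rank update B @ A, and because B starts at zero the adapted layer initially reproduces the pre-trained output exactly.

```python
import numpy as np

class LoRALinear:
    """Minimal sketch of a LoRA-adapted linear layer: the frozen
    pre-trained weight W is augmented with a low-rank trainable
    update B @ A, scaled by alpha / rank."""
    def __init__(self, W, rank, alpha=1.0, rng=None):
        rng = rng or np.random.default_rng(0)
        d_out, d_in = W.shape
        self.W = W                                   # frozen pre-trained weight
        self.A = rng.normal(0, 0.01, (rank, d_in))   # trainable down-projection
        self.B = np.zeros((d_out, rank))             # trainable up-projection, init 0
        self.scale = alpha / rank

    def __call__(self, x):
        return self.W @ x + self.scale * (self.B @ (self.A @ x))
```

Only A and B are updated during fine-tuning, which is why LoRA preserves the pre-trained knowledge while absorbing the limited in-domain data.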
H-InDex: Visual Reinforcement Learning with Hand-Informed Representations for Dexterous Manipulation
Human hands possess remarkable dexterity and have long served as a source of
inspiration for robotic manipulation. In this work, we propose a human
hand-informed visual representation learning framework (H-InDex) to solve
difficult dexterous manipulation tasks
with reinforcement learning. Our framework consists of three stages: (i)
pre-training representations with 3D human hand pose estimation, (ii) offline
adapting representations with self-supervised keypoint detection, and (iii)
reinforcement learning with exponential moving average BatchNorm. The last two
stages modify only a small fraction of the pre-trained representation's
parameters, ensuring the knowledge from pre-training is maintained to the full
extent. We empirically study 12 challenging dexterous manipulation tasks and
find that H-InDex largely surpasses strong baseline methods and the recent
visual foundation models for motor control. Code is available at
https://yanjieze.com/H-InDex. Comment: NeurIPS 2023.
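Stage (iii)'s exponential moving average BatchNorm can be sketched as follows (a minimal illustration, not the authors' exact implementation): the layer's running statistics are updated as an EMA of per-batch statistics, then used to normalize incoming features.

```python
import numpy as np

class EMABatchNorm:
    """Sketch of a BatchNorm whose running statistics are updated as an
    exponential moving average of batch statistics with momentum m."""
    def __init__(self, dim, momentum=0.99, eps=1e-5):
        self.mean = np.zeros(dim)   # running mean
        self.var = np.ones(dim)     # running variance
        self.m = momentum
        self.eps = eps

    def update(self, batch):        # batch: (n, dim)
        mu, var = batch.mean(0), batch.var(0)
        self.mean = self.m * self.mean + (1 - self.m) * mu
        self.var = self.m * self.var + (1 - self.m) * var

    def normalize(self, x):
        return (x - self.mean) / np.sqrt(self.var + self.eps)
```

Because the statistics drift slowly, the representation adapts to the RL visitation distribution without abruptly discarding what was learned in pre-training.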
Oscillation-specific nodal alterations in early- to middle-stage Parkinson's disease.
Background: Different oscillations of brain networks may carry different dimensions of brain integration. We aimed to investigate oscillation-specific nodal alterations in patients with Parkinson's disease (PD) from the early to the middle stage using graph theory-based analysis.
Methods: Eighty-eight PD patients, comprising 39 in the early stage (EPD) and 49 in the middle stage (MPD), and 36 controls were recruited. Graph theory-based network analyses were performed in three oscillation frequency bands (slow-5: 0.01-0.027 Hz; slow-4: 0.027-0.073 Hz; slow-3: 0.073-0.198 Hz), and nodal metrics (e.g. nodal degree centrality, betweenness centrality and nodal efficiency) were calculated.
Results: (1) Oscillation frequency had a divergent effect on nodal metrics, especially nodal degree centrality and nodal efficiency: the anteroventral neocortex and subcortex had high nodal metrics in the low-frequency bands, while the posterolateral neocortex had high values in the relatively high-frequency band, indicating that the network was perturbed in PD. (2) EPD patients showed relatively preserved nodal properties, whereas MPD patients showed widespread abnormalities, a pattern consistently detected in all three frequency bands. (3) Involvement of the basal ganglia was specifically observed in the slow-5 band in MPD patients. (4) Logistic regression and receiver operating characteristic curve analyses demonstrated that some of these oscillation-specific nodal alterations could discriminate PD patients from controls, and MPD from EPD patients, at the individual level. (5) Occipital disruption in the high-frequency band (slow-3) had a significant influence on motor impairment dominated by akinesia and rigidity.
Conclusions: Coupling various oscillations could provide potentially useful information about the large-scale network, and progressive oscillation-specific nodal alterations were observed in PD patients from the early to the middle stage.
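Two of the nodal metrics named in the Methods can be computed directly from a network's adjacency and shortest-path matrices; a minimal sketch (illustrative only, assuming both matrices are precomputed from the functional connectivity data):

```python
import numpy as np

def nodal_degree_centrality(A):
    """Weighted degree centrality of each node from adjacency matrix A
    (row sums of edge weights)."""
    return A.sum(axis=1)

def nodal_efficiency(D):
    """Nodal efficiency: the mean inverse shortest-path length from each
    node to all others, given a shortest-path distance matrix D."""
    n = D.shape[0]
    with np.errstate(divide="ignore"):
        inv = 1.0 / D                 # diagonal (distance 0) becomes inf
    np.fill_diagonal(inv, 0.0)        # exclude self-distances
    return inv.sum(axis=1) / (n - 1)
```

High nodal efficiency marks nodes that reach the rest of the network over short paths, which is why a frequency-specific drop in these metrics signals the nodal disruption reported here.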
SCULPTOR: Skeleton-Consistent Face Creation Using a Learned Parametric Generator
Recent years have seen growing interest in 3D human face modelling due to its
wide applications in digital humans, character generation and animation.
Existing approaches have overwhelmingly emphasized modelling the exterior
shapes, textures and skin properties of faces, ignoring the inherent
correlation
between inner skeletal structures and appearance. In this paper, we present
SCULPTOR, 3D face creations with Skeleton Consistency Using a Learned
Parametric facial generaTOR, aiming to facilitate easy creation of both
anatomically correct and visually convincing face models via a hybrid
parametric-physical representation. At the core of SCULPTOR is LUCY, the first
large-scale shape-skeleton face dataset in collaboration with plastic surgeons.
Named after the fossils of one of the oldest known human ancestors, our LUCY
dataset contains high-quality Computed Tomography (CT) scans of the complete
human head before and after orthognathic surgeries, critical for evaluating
surgery results. LUCY consists of 144 scans of 72 subjects (31 male and 41
female) where each subject has two CT scans taken pre- and post-orthognathic
operations. Based on our LUCY dataset, we learn a novel skeleton consistent
parametric facial generator, SCULPTOR, which can create the unique and nuanced
facial features that help define a character and at the same time maintain
physiological soundness. Our SCULPTOR jointly models the skull, face geometry
and face appearance under a unified data-driven framework, by separating the
depiction of a 3D face into shape blend shape, pose blend shape and facial
expression blend shape. SCULPTOR preserves both anatomic correctness and visual
realism in facial generation tasks compared with existing methods. Finally, we
showcase the robustness and effectiveness of SCULPTOR in a variety of novel
applications. Comment: 16 pages, 13 figures.
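The separation of a 3D face into shape, pose and expression blend shapes follows the standard linear blend-shape formulation; one such term can be sketched as follows (hypothetical toy basis, not the learned SCULPTOR parameters):

```python
import numpy as np

def blend_shape_offset(template, basis, coeffs):
    """Linear blend-shape model: vertices = template + sum_k coeffs[k] * basis[k].

    template : (V, 3) mean mesh vertices (face or skull)
    basis    : (K, V, 3) blend-shape directions learned from data
    coeffs   : (K,) subject-specific coefficients
    """
    return template + np.tensordot(coeffs, basis, axes=1)
```

In a joint skull-face model, a shared coefficient vector deforms both meshes consistently, which is what lets a generator like SCULPTOR keep the skeleton and the exterior surface anatomically compatible.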