Stress Hyperglycemia: A Problem that Cannot be Ignored
Stress hyperglycemia is a strong neuroendocrine reaction of the hypothalamic-pituitary-adrenal cortex axis to severe infection, trauma, burns, hemorrhage, surgery and other harmful stimuli, resulting in increased secretion of counter-regulatory hormones. These hormones promote glucose production and, together with cytokines and insulin resistance, cause glucose metabolism disorders. In this condition, glucose production exceeds glucose utilization by the tissues, which eventually leads to an increase in plasma glucose levels. In the intensive care unit, stress hyperglycemia is very common and can occur in patients with or without diabetes. The incidence is as high as 96%, and it is an independent factor in the death of critically ill patients. Hyperglycemia not only prolongs hospitalization and mechanical ventilation time and increases the incidence of serious infections in critically ill patients, but can also lead to type 2 diabetes. Therefore, it is very important to understand the pathological mechanism of stress hyperglycemia, the harm of hyperglycemia, and blood glucose management.
Learning to Occlusion-Robustly Estimate 3-D States of Deformable Linear Objects from Single-Frame Point Clouds
Accurately and robustly estimating the state of deformable linear objects
(DLOs), such as ropes and wires, is crucial for DLO manipulation and other
applications. However, it remains a challenging open issue due to the high
dimensionality of the state space, frequent occlusion, and noise. This paper
focuses on learning to robustly estimate the states of DLOs from single-frame
point clouds in the presence of occlusions using a data-driven method. We
propose a novel two-branch network architecture to exploit global and local
information of the input point cloud, respectively, and design a fusion module to
effectively leverage the advantages of both. Simulation and real-world
experimental results demonstrate that our method can generate globally smooth
and locally precise DLO state estimation results even with heavily occluded
point clouds, which can be directly applied to real-world robotic manipulation
of DLOs in 3-D space.
Comment: ICRA2023 submissio
Semiparametric proximal causal inference
Skepticism about the assumption of no unmeasured confounding, also known as
exchangeability, is often warranted in making causal inferences from
observational data, because exchangeability hinges on an investigator's ability
to accurately measure covariates that capture all potential sources of
confounding. In practice, the most one can hope for is that covariate
measurements are at best proxies of the true underlying confounding mechanism
operating in a given observational study. In this paper, we consider the
framework of proximal causal inference introduced by Tchetgen Tchetgen et al.
(2020), which, while explicitly acknowledging covariate measurements as
imperfect proxies of confounding mechanisms, offers an opportunity to learn
about causal effects in settings where exchangeability on the basis of measured
covariates fails. We make a number of contributions to proximal inference
including (i) an alternative set of conditions for nonparametric proximal
identification of the average treatment effect; (ii) general semiparametric
theory for proximal estimation of the average treatment effect including
efficiency bounds for key semiparametric models of interest; (iii) a
characterization of proximal doubly robust and locally efficient estimators of
the average treatment effect. Moreover, we provide analogous identification and
efficiency results for the average treatment effect on the treated. Our
approach is illustrated via simulation studies and a data application
evaluating the effectiveness of right heart catheterization for critically ill
patients in the intensive care unit.
Fine-grained Recognition with Learnable Semantic Data Augmentation
Fine-grained image recognition is a longstanding computer vision challenge
that focuses on differentiating objects belonging to multiple subordinate
categories within the same meta-category. Since images belonging to the same
meta-category usually share similar visual appearances, mining discriminative
visual cues is the key to distinguishing fine-grained categories. Although
commonly used image-level data augmentation techniques have achieved great
success in generic image classification problems, they are rarely applied in
fine-grained scenarios, because their random editing-region behavior is prone
to destroying the discriminative visual cues residing in the subtle regions. In
this paper, we propose diversifying the training data at the feature-level to
alleviate the discriminative region loss problem. Specifically, we produce
diversified augmented samples by translating image features along semantically
meaningful directions. The semantic directions are estimated with a covariance
prediction network, which predicts a sample-wise covariance matrix to adapt to
the large intra-class variation inherent in fine-grained images. Furthermore,
the covariance prediction network is jointly optimized with the classification
network in a meta-learning manner to alleviate the degenerate solution problem.
Experiments on four competitive fine-grained recognition benchmarks
(CUB-200-2011, Stanford Cars, FGVC Aircrafts, NABirds) demonstrate that our
method significantly improves the generalization performance on several popular
classification networks (e.g., ResNets, DenseNets, EfficientNets, RegNets and
ViT). Combined with a recently proposed method, our semantic data augmentation
approach achieves state-of-the-art performance on the CUB-200-2011 dataset. The
source code will be released.
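The feature-level augmentation described in this abstract (translating a feature along a direction drawn from a per-sample covariance) can be illustrated with a minimal sketch. The function name `augment_feature`, the `strength` parameter, and the restriction to a diagonal covariance are assumptions made for brevity; the abstract does not specify them, and in the paper the covariance comes from a learned prediction network rather than being passed in by hand.

```python
import random


def augment_feature(feature, variances, strength=1.0, rng=None):
    """Translate a feature vector along a random direction.

    The direction is sampled from N(0, strength * diag(variances)).
    ASSUMPTION: a diagonal covariance given as a list of per-dimension
    variances; a full covariance matrix would need a matrix square root.
    """
    if len(feature) != len(variances):
        raise ValueError("feature and variances must have the same length")
    rng = rng or random.Random()
    return [
        f + rng.gauss(0.0, (strength * v) ** 0.5)
        for f, v in zip(feature, variances)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    feat = [0.2, -1.3, 0.7]
    # Hypothetical per-sample variances; larger values allow larger shifts.
    var = [0.05, 0.0, 0.5]
    print(augment_feature(feat, var, strength=0.5, rng=rng))
```

Dimensions with zero variance are left unchanged, so the shift stays within directions the covariance marks as variable for that sample.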
Latency-aware Unified Dynamic Networks for Efficient Image Recognition
Dynamic computation has emerged as a promising avenue to enhance the
inference efficiency of deep networks. It allows selective activation of
computational units, leading to a reduction in unnecessary computations for
each input sample. However, the actual efficiency of these dynamic models can
deviate from theoretical predictions. This mismatch arises from: 1) the lack of
a unified approach due to fragmented research; 2) the focus on algorithm design
over critical scheduling strategies, especially in CUDA-enabled GPU contexts;
and 3) challenges in measuring practical latency, given that most libraries
cater to static operations. Addressing these issues, we unveil the
Latency-Aware Unified Dynamic Networks (LAUDNet), a framework that integrates
three primary dynamic paradigms: spatially adaptive computation, dynamic layer
skipping, and dynamic channel skipping. To bridge the theoretical and practical
efficiency gap, LAUDNet merges algorithmic design with scheduling optimization,
guided by a latency predictor that accurately gauges dynamic operator latency.
We've tested LAUDNet across multiple vision tasks, demonstrating its capacity
to notably reduce the latency of models like ResNet-101 by over 50% on
platforms such as V100, RTX3090, and TX2 GPUs. Notably, LAUDNet stands out in
balancing accuracy and efficiency. Code is available at:
https://www.github.com/LeapLabTHU/LAUDNet
Rank-DETR for High Quality Object Detection
Modern detection transformers (DETRs) use a set of object queries to predict
a list of bounding boxes, sort them by their classification confidence scores,
and select the top-ranked predictions as the final detection results for the
given input image. A highly performant object detector requires accurate
ranking for the bounding box predictions. For DETR-based detectors, the
top-ranked bounding boxes suffer from less accurate localization quality due to
the misalignment between classification scores and localization accuracy, thus
impeding the construction of high-quality detectors. In this work, we introduce
a simple and highly performant DETR-based object detector by proposing a series
of rank-oriented designs, collectively called Rank-DETR. Our key contributions
include: (i) a rank-oriented architecture design that can prompt positive
predictions and suppress the negative ones to ensure lower false positive
rates, as well as (ii) a rank-oriented loss function and matching cost design
that prioritizes predictions with more accurate localization during
ranking to boost the AP under high IoU thresholds. We apply our method to
improve the recent SOTA methods (e.g., H-DETR and DINO-DETR) and report strong
COCO object detection results when using different backbones such as
ResNet-, Swin-T, and Swin-L, demonstrating the effectiveness of our
approach. Code is available at https://github.com/LeapLabTHU/Rank-DETR.
Comment: NeurIPS 202
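The selection step this abstract starts from (sort predictions by classification confidence, keep the top-ranked ones) and the score/localization misalignment it motivates can be shown with a small sketch. The helper names `box_iou` and `select_top_k`, the `(x1, y1, x2, y2)` box format, and the toy numbers are illustrative assumptions, not part of Rank-DETR's released code.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def select_top_k(predictions, k):
    """Sort (score, box) pairs by score, descending, and keep the first k."""
    return sorted(predictions, key=lambda p: p[0], reverse=True)[:k]


if __name__ == "__main__":
    ground_truth = (0.0, 0.0, 10.0, 10.0)
    # Toy predictions: the most confident box is NOT the best localized one.
    predictions = [
        (0.60, (0.0, 0.0, 10.0, 10.0)),
        (0.90, (2.0, 2.0, 12.0, 12.0)),
    ]
    for score, box in select_top_k(predictions, k=2):
        print(f"score={score:.2f} IoU={box_iou(box, ground_truth):.3f}")
```

In this toy case the top-ranked box has the lower IoU with the ground truth, which is the kind of ranking error that hurts AP at high IoU thresholds.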
Learning to Weight Samples for Dynamic Early-exiting Networks
Early exiting is an effective paradigm for improving the inference efficiency
of deep networks. By constructing classifiers with varying resource demands
(the exits), such networks allow easy samples to be output at early exits,
removing the need for executing deeper layers. While existing works mainly
focus on the architectural design of multi-exit networks, the training
strategies for such models are largely left unexplored. The current
state-of-the-art models treat all samples the same during training. However,
the early-exiting behavior during testing has been ignored, leading to a gap
between training and testing. In this paper, we propose to bridge this gap by
sample weighting. Intuitively, easy samples, which generally exit early in the
network during inference, should contribute more to training early classifiers.
The training of hard samples (which mostly exit from deeper layers), however, should
be emphasized by the late classifiers. Our work proposes to adopt a weight
prediction network to weight the loss of different training samples at each
exit. This weight prediction network and the backbone model are jointly
optimized under a meta-learning framework with a novel optimization objective.
By bringing the adaptive behavior during inference into the training phase, we
show that the proposed weighting mechanism consistently improves the trade-off
between classification accuracy and inference efficiency. Code is available at
https://github.com/LeapLabTHU/L2W-DEN.
Comment: ECCV 202
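The early-exiting behavior this abstract refers to can be sketched as follows. The exit rule used here (stop at the first exit whose maximum softmax probability reaches a threshold) is a common choice but an assumption; the abstract does not state the criterion, and the names `early_exit_predict` and `threshold` are illustrative.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(exit_logits, threshold=0.9):
    """Return (exit_index, predicted_class) for one sample.

    exit_logits lists the logits of each exit, shallowest first.
    ASSUMPTION: logits are precomputed for illustration; a real multi-exit
    network would only run deeper layers when earlier exits are not
    confident enough.
    """
    if not exit_logits:
        raise ValueError("at least one exit is required")
    last = len(exit_logits) - 1
    for i, logits in enumerate(exit_logits):
        probs = softmax(logits)
        confidence = max(probs)
        # Exit when confident enough, or unconditionally at the final exit.
        if confidence >= threshold or i == last:
            return i, probs.index(confidence)


if __name__ == "__main__":
    easy = [[5.0, 0.0], [6.0, 0.0]]   # confident at the first exit
    hard = [[0.1, 0.0], [0.0, 5.0]]   # needs the deeper exit
    print(early_exit_predict(easy), early_exit_predict(hard))
```

Under this rule, easy samples never reach the late classifiers at test time, which is the train/test gap the proposed sample weighting is meant to close.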
Adaptive Rotated Convolution for Rotated Object Detection
Rotated object detection aims to identify and locate objects in images with
arbitrary orientation. In this scenario, the oriented directions of objects
vary considerably across different images, while multiple orientations of
objects exist within an image. This intrinsic characteristic makes it
challenging for standard backbone networks to extract high-quality features of
these arbitrarily oriented objects. In this paper, we present the Adaptive
Rotated Convolution (ARC) module to handle the aforementioned challenges. In
our ARC module, the convolution kernels rotate adaptively to extract object
features with varying orientations in different images, and an efficient
conditional computation mechanism is introduced to accommodate the large
orientation variations of objects within an image. The two designs work
seamlessly in the rotated object detection problem. Moreover, ARC can conveniently
serve as a plug-and-play module in various vision backbones to boost their
representation ability to detect oriented objects accurately. Experiments on
commonly used benchmarks (DOTA and HRSC2016) demonstrate that equipped with our
proposed ARC module in the backbone network, the performance of multiple
popular oriented object detectors is significantly improved (e.g. +3.03% mAP on
Rotated RetinaNet and +4.16% on CFA). Combined with the highly competitive
method Oriented R-CNN, the proposed approach achieves state-of-the-art
performance on the DOTA dataset with 81.77% mAP.
Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models
Recently, diffusion models have made remarkable progress in text-to-image
(T2I) generation, synthesizing images with high fidelity and diverse contents.
Despite this advancement, latent space smoothness within diffusion models
remains largely unexplored. Smooth latent spaces ensure that a perturbation on
an input latent corresponds to a steady change in the output image. This
property proves beneficial in downstream tasks, including image interpolation,
inversion, and editing. In this work, we expose the non-smoothness of diffusion
latent spaces by observing noticeable visual fluctuations resulting from minor
latent variations. To tackle this issue, we propose Smooth Diffusion, a new
category of diffusion models that can be simultaneously high-performing and
smooth. Specifically, we introduce Step-wise Variation Regularization to
enforce that the ratio between the variation of an arbitrary input latent and
that of the output image is constant at any diffusion training step. In
addition, we devise an interpolation standard deviation (ISTD) metric to
effectively assess the latent space smoothness of a diffusion model. Extensive
quantitative and qualitative experiments demonstrate that Smooth Diffusion
stands out as a more desirable solution not only in T2I generation but also
across various downstream tasks. Smooth Diffusion is implemented as a
plug-and-play Smooth-LoRA to work with various community models. Code is
available at https://github.com/SHI-Labs/Smooth-Diffusion.
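The idea behind the regularizer described in this abstract (keep the ratio between output variation and input-latent variation constant) can be illustrated with a toy penalty. This is only a sketch under stated assumptions: the L2 norm, the squared-error form, the `target_ratio` parameter, and the stand-in `model` callable are all hypothetical and are not taken from the Smooth Diffusion implementation.

```python
import math


def l2_norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def variation_ratio_penalty(model, latent, delta, target_ratio=1.0):
    """Penalize deviation of (output change / input change) from a constant.

    ASSUMPTION: `model` is any callable mapping a list of floats to a list
    of floats; it stands in for one step of a diffusion model.
    """
    in_change = l2_norm(delta)
    if in_change == 0:
        raise ValueError("delta must be a non-zero perturbation")
    perturbed = [z + d for z, d in zip(latent, delta)]
    out_a, out_b = model(latent), model(perturbed)
    out_change = l2_norm([b - a for a, b in zip(out_a, out_b)])
    return (out_change / in_change - target_ratio) ** 2


if __name__ == "__main__":
    def double(z):
        # Toy "model": scales its input by 2, so the variation ratio is 2.
        return [2.0 * x for x in z]

    print(variation_ratio_penalty(double, [1.0, 2.0], [0.5, 0.0], target_ratio=1.0))
```

A mapping whose output changes in fixed proportion to the latent perturbation gives zero penalty when `target_ratio` matches that proportion, which is the smoothness property the abstract describes.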