Learning Longterm Representations for Person Re-Identification Using Radio Signals
Person Re-Identification (ReID) aims to recognize a person-of-interest across
different places and times. Existing ReID methods rely on images or videos
collected using RGB cameras. They extract appearance features like clothes,
shoes, hair, etc. Such features, however, can change drastically from one day
to the next, making it impossible to identify people reliably over extended
time periods. In this paper, we introduce RF-ReID, a novel approach that harnesses
radio frequency (RF) signals for longterm person ReID. RF signals traverse
clothes and reflect off the human body; thus they can be used to extract more
persistent human-identifying features like body size and shape. We evaluate the
performance of RF-ReID on longitudinal datasets that span days and weeks, where
the person may wear different clothes across days. Our experiments demonstrate
that RF-ReID outperforms state-of-the-art RGB-based ReID approaches for
longterm person ReID. Our results also reveal two interesting features: First,
since RF signals work in the presence of occlusions and poor lighting, RF-ReID
allows for person ReID in such scenarios. Second, unlike photos and videos,
which reveal personal and private information, RF signals are more
privacy-preserving, and hence can help extend person ReID to privacy-sensitive
domains, like healthcare.
Comment: CVPR 2020. The first three authors contributed equally to this paper
Probabilistic Radiomics: Ambiguous Diagnosis with Controllable Shape Analysis
Radiomics analysis has achieved great success in recent years. However,
conventional Radiomics analysis suffers from insufficiently expressive
hand-crafted features. Recently, emerging deep learning techniques, e.g.,
convolutional neural networks (CNNs), have come to dominate research in
Computer-Aided Diagnosis (CADx). Unfortunately, we argue that CNNs, as
black-box predictors, are "diagnosing" voxels (or pixels) rather than lesions;
in other words, the visual saliency of a trained CNN is not necessarily
concentrated on the lesions. On the other hand, classification in clinical
applications suffers from inherent ambiguities: radiologists may produce
diverse diagnoses on challenging cases. To this end, we propose a controllable
and explainable Probabilistic Radiomics framework that combines Radiomics
analysis with probabilistic deep learning. In our framework, 3D CNN features
are extracted from the lesion region only and then encoded into a lesion
representation by a controllable Non-local Shape Analysis Module (NSAM) based
on self-attention. Inspired by variational auto-encoders (VAEs), an Ambiguity
PriorNet is used to approximate the ambiguity distribution over human experts.
The final diagnosis is obtained by combining the ambiguity prior sample with
the lesion representation, and the whole network is end-to-end trainable. We
apply the proposed method to lung nodule diagnosis on the LIDC-IDRI database to
validate its effectiveness.
Comment: MICCAI 2019 (early accept), with supplementary material
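To make the described pipeline concrete, here is a minimal PyTorch sketch of the architecture the abstract outlines: self-attention over lesion-region features (standing in for the NSAM), a VAE-style prior sampling an ambiguity code, and a head that fuses the two into the final diagnosis. All module names, dimensions, and wiring are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only; sizes and modules are assumptions, not the paper's code.
import torch
import torch.nn as nn

class NSAMSketch(nn.Module):
    """Non-local Shape Analysis Module, approximated here by self-attention."""
    def __init__(self, dim=128, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.pool = nn.AdaptiveAvgPool1d(1)

    def forward(self, lesion_feats):                      # (B, N, dim) lesion-region features
        x, _ = self.attn(lesion_feats, lesion_feats, lesion_feats)
        return self.pool(x.transpose(1, 2)).squeeze(-1)   # (B, dim) lesion representation

class AmbiguityPriorNet(nn.Module):
    """VAE-style prior over expert disagreement: sample z ~ N(mu, sigma)."""
    def __init__(self, dim=128, z_dim=16):
        super().__init__()
        self.mu, self.logvar = nn.Linear(dim, z_dim), nn.Linear(dim, z_dim)

    def forward(self, lesion_code):
        mu, logvar = self.mu(lesion_code), self.logvar(lesion_code)
        return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization

class ProbabilisticRadiomicsSketch(nn.Module):
    def __init__(self, dim=128, z_dim=16, n_classes=2):
        super().__init__()
        self.nsam = NSAMSketch(dim)
        self.prior = AmbiguityPriorNet(dim, z_dim)
        self.head = nn.Linear(dim + z_dim, n_classes)

    def forward(self, lesion_feats):
        code = self.nsam(lesion_feats)    # lesion representation
        z = self.prior(code)              # ambiguity prior sample
        return self.head(torch.cat([code, z], dim=-1))  # diagnosis logits

logits = ProbabilisticRadiomicsSketch()(torch.randn(2, 64, 128))
```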
InstructSeq: Unifying Vision Tasks with Instruction-conditioned Multi-modal Sequence Generation
Empowering models to dynamically accomplish tasks specified through natural
language instructions represents a promising path toward more capable and
general artificial intelligence. In this work, we introduce InstructSeq, an
instruction-conditioned multi-modal modeling framework that unifies diverse
vision tasks through flexible natural language control and handling of both
visual and textual data. InstructSeq employs a multimodal transformer
architecture encompassing visual, language, and sequential modeling. We utilize
a visual encoder to extract image features and a text encoder to encode
instructions. An autoregressive transformer fuses the representations and
generates sequential task outputs. By training with LLM-generated natural
language instructions, InstructSeq acquires a strong comprehension of free-form
instructions for specifying visual tasks. This provides an intuitive interface
for directing the model's capabilities using flexible natural-language
instructions. Without any task-specific tuning, InstructSeq achieves compelling
performance on semantic segmentation, referring expression
segmentation/comprehension, and image captioning. The flexible control and
multi-task unification endow the model with more human-like versatility and
generalizability for computer vision. The code will be released soon at
https://github.com/rongyaofang/InstructSeq.
Comment: 10 pages
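The encode-fuse-generate flow described above can be sketched as follows; every architectural choice here (the stand-in encoders, sizes, and greedy decoding) is an assumption for illustration, not the released InstructSeq model.

```python
# Schematic sketch of an instruction-conditioned encode-fuse-generate loop.
import torch
import torch.nn as nn

class InstructSeqSketch(nn.Module):
    def __init__(self, dim=256, vocab=1000, max_len=32):
        super().__init__()
        self.visual_encoder = nn.Sequential(             # stand-in image encoder
            nn.Conv2d(3, dim, kernel_size=16, stride=16),
            nn.Flatten(2))                               # (B, dim, P) patch tokens
        self.text_encoder = nn.Embedding(vocab, dim)     # stand-in instruction encoder
        self.token_emb = nn.Embedding(vocab, dim)
        layer = nn.TransformerDecoderLayer(dim, nhead=4, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=2)
        self.head = nn.Linear(dim, vocab)
        self.max_len = max_len

    @torch.no_grad()
    def generate(self, image, instruction_ids, bos_id=1):
        vis = self.visual_encoder(image).transpose(1, 2)   # (B, P, dim)
        txt = self.text_encoder(instruction_ids)           # (B, T, dim)
        memory = torch.cat([vis, txt], dim=1)              # fused conditioning
        out = torch.full((image.size(0), 1), bos_id, dtype=torch.long)
        for _ in range(self.max_len):                      # greedy autoregressive decoding
            h = self.decoder(self.token_emb(out), memory)
            nxt = self.head(h[:, -1]).argmax(-1, keepdim=True)
            out = torch.cat([out, nxt], dim=1)
        return out                                         # task output token sequence

tokens = InstructSeqSketch().generate(torch.randn(1, 3, 64, 64),
                                      torch.randint(0, 1000, (1, 8)))
```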
A photo-initiated pulsed oxygen-iodine chemical laser utilizing CF3I or Cu3I as iodine atom donors
FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis
In this study, we delve into the generation of high-resolution images from
pre-trained diffusion models, addressing persistent challenges, such as
repetitive patterns and structural distortions, that emerge when models are
applied beyond their trained resolutions. To address this issue, we introduce
FouriScale, an innovative training-free approach grounded in frequency-domain
analysis. We replace the original convolutional layers in pre-trained diffusion
models with variants that incorporate a dilation technique and a low-pass
operation, aiming to achieve structural consistency and scale consistency
across resolutions, respectively. Further enhanced by a padding-then-crop
strategy, our method can flexibly handle text-to-image generation at various
aspect ratios. Using FouriScale as guidance, our method successfully balances
the structural integrity and fidelity of generated images, achieving
arbitrary-size, high-resolution, and high-quality generation. With its
simplicity and compatibility, our method
can provide valuable insights for future explorations into the synthesis of
ultra-high-resolution images. The code will be released at
https://github.com/LeonHLJ/FouriScale
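As a rough illustration of the two ingredients the abstract names, the sketch below applies an FFT-based low-pass filter to a feature map and then runs a pre-trained convolution kernel with a dilation matching the upscaling factor. The cutoff choice and wiring are assumptions, not the paper's exact formulation.

```python
# Illustrative low-pass + dilated-convolution sketch; not the paper's implementation.
import torch
import torch.nn.functional as F

def lowpass(x, keep_frac=0.5):
    """Zero out high frequencies of a (B, C, H, W) feature map via the FFT."""
    spec = torch.fft.fftshift(torch.fft.fft2(x), dim=(-2, -1))
    _, _, H, W = x.shape
    mask = torch.zeros(H, W)
    h, w = int(H * keep_frac / 2), int(W * keep_frac / 2)
    mask[H // 2 - h:H // 2 + h, W // 2 - w:W // 2 + w] = 1.0  # keep central (low) band
    return torch.fft.ifft2(torch.fft.ifftshift(spec * mask, dim=(-2, -1))).real

def fouriscale_conv(x, weight, bias=None, scale=2):
    """Apply a pre-trained conv kernel with dilation matching the upscaling factor,
    after low-pass filtering the input for scale consistency."""
    x = lowpass(x, keep_frac=1.0 / scale)
    pad = (weight.shape[-1] // 2) * scale       # preserve spatial size
    return F.conv2d(x, weight, bias, padding=pad, dilation=scale)

y = fouriscale_conv(torch.randn(1, 8, 64, 64), torch.randn(8, 8, 3, 3))
```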
A Novel Deep-Trench Super-Junction SiC MOSFET with Improved Specific On-Resistance
In this paper, a novel 4H-SiC deep-trench super-junction MOSFET (Metal-Oxide-Semiconductor Field-Effect Transistor) with a split gate is proposed and theoretically verified by Sentaurus TCAD simulations. A deep trench filled with P-poly-Si, combined with the P-SiC region, produces a charge-balance effect. Instead of the full-SiC P region used in a conventional super-junction MOSFET, the new structure shrinks the P region, thus helping to lower the specific on-resistance. As a result, the figure of merit (FoM, BV²/Ron,sp) of the proposed structure is 642% and 39.65% higher than that of the conventional MOSFET (C-MOS) and the conventional super-junction MOSFET (SJ-MOS), respectively
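The figure of merit above is FoM = BV²/Ron,sp, so it rewards a high breakdown voltage (BV) and a low specific on-resistance (Ron,sp). The snippet below shows how such a percentage improvement is computed; the device numbers are placeholders, not values from the paper.

```python
# FoM = BV^2 / R_on,sp. Placeholder numbers only; not values from the paper.
def fom(bv_volts: float, ron_sp_mohm_cm2: float) -> float:
    """Figure of merit: breakdown voltage squared over specific on-resistance."""
    return bv_volts ** 2 / ron_sp_mohm_cm2

# Hypothetical devices: the proposed structure vs. a reference SJ-MOS.
proposed = fom(bv_volts=1500.0, ron_sp_mohm_cm2=1.2)
sj_mos = fom(bv_volts=1400.0, ron_sp_mohm_cm2=1.5)
print(f"FoM gain over SJ-MOS: {100 * (proposed / sj_mos - 1):.1f}%")
```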