13 research outputs found

    Learning Longterm Representations for Person Re-Identification Using Radio Signals

    Full text link
    Person Re-Identification (ReID) aims to recognize a person-of-interest across different places and times. Existing ReID methods rely on images or videos collected using RGB cameras. They extract appearance features like clothes, shoes, and hair. Such features, however, can change drastically from one day to the next, leading to an inability to identify people over extended time periods. In this paper, we introduce RF-ReID, a novel approach that harnesses radio frequency (RF) signals for long-term person ReID. RF signals traverse clothes and reflect off the human body; thus they can be used to extract more persistent human-identifying features like body size and shape. We evaluate the performance of RF-ReID on longitudinal datasets that span days and weeks, where the person may wear different clothes across days. Our experiments demonstrate that RF-ReID outperforms state-of-the-art RGB-based ReID approaches for long-term person ReID. Our results also reveal two interesting features: First, since RF signals work in the presence of occlusions and poor lighting, RF-ReID allows for person ReID in such scenarios. Second, unlike photos and videos, which reveal personal and private information, RF signals are more privacy-preserving, and hence can help extend person ReID to privacy-concerned domains like healthcare. Comment: CVPR 2020. The first three authors contributed equally to this paper.
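
    A minimal, hypothetical sketch of the idea described above: RF reflections (assumed here to be preprocessed into per-frame heatmaps) are encoded into a single identity embedding and matched against a gallery by cosine similarity, which is standard ReID practice. The RFEncoder name, architecture, and tensor shapes are illustrative assumptions, not the authors' implementation.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class RFEncoder(nn.Module):
        """Illustrative encoder mapping a sequence of RF heatmaps to one identity embedding."""
        def __init__(self, embed_dim: int = 128):
            super().__init__()
            self.conv = nn.Sequential(
                nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),
            )
            self.proj = nn.Linear(64, embed_dim)

        def forward(self, heatmaps: torch.Tensor) -> torch.Tensor:
            # heatmaps: (T, H, W) -> per-frame features -> temporal average -> unit-norm embedding
            feats = self.conv(heatmaps.unsqueeze(1)).flatten(1)      # (T, 64)
            return F.normalize(self.proj(feats).mean(dim=0), dim=0)  # (embed_dim,)

    # Match a query track against a small gallery by cosine similarity (dummy data).
    encoder = RFEncoder()
    query = encoder(torch.randn(30, 64, 64))
    gallery = torch.stack([encoder(torch.randn(30, 64, 64)) for _ in range(5)])
    best_match = torch.argmax(gallery @ query)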

    Probabilistic Radiomics: Ambiguous Diagnosis with Controllable Shape Analysis

    Full text link
    Radiomics analysis has achieved great success in recent years. However, conventional Radiomics analysis suffers from insufficiently expressive hand-crafted features. Emerging deep learning techniques, e.g., convolutional neural networks (CNNs), have come to dominate research in Computer-Aided Diagnosis (CADx). Unfortunately, as black-box predictors, we argue that CNNs are "diagnosing" voxels (or pixels) rather than lesions; in other words, visual saliency from a trained CNN is not necessarily concentrated on the lesions. On the other hand, classification in clinical applications suffers from inherent ambiguities: radiologists may produce diverse diagnoses on challenging cases. To this end, we propose a controllable and explainable Probabilistic Radiomics framework by combining Radiomics analysis and probabilistic deep learning. In our framework, 3D CNN features are extracted from the lesion region only, then encoded into a lesion representation by a controllable Non-local Shape Analysis Module (NSAM) based on self-attention. Inspired by variational auto-encoders (VAEs), an Ambiguity PriorNet is used to approximate the ambiguity distribution over human experts. The final diagnosis is obtained by combining the ambiguity prior sample and the lesion representation, and the whole network, named DenseSharp^{+}, is end-to-end trainable. We apply the proposed method to lung nodule diagnosis on the LIDC-IDRI database to validate its effectiveness. Comment: MICCAI 2019 (early accept), with supplementary material.
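
    A hypothetical sketch of the "ambiguity prior plus lesion representation" idea, not the DenseSharp^{+} implementation; the module names, latent dimension, and feature dimension below are assumptions for illustration.

    import torch
    import torch.nn as nn

    class AmbiguityPriorNet(nn.Module):
        """Illustrative VAE-style prior: predicts a Gaussian over latent ambiguity and samples from it."""
        def __init__(self, feat_dim: int = 256, z_dim: int = 16):
            super().__init__()
            self.mu = nn.Linear(feat_dim, z_dim)
            self.logvar = nn.Linear(feat_dim, z_dim)

        def forward(self, lesion_feat: torch.Tensor) -> torch.Tensor:
            mu, logvar = self.mu(lesion_feat), self.logvar(lesion_feat)
            return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick

    class DiagnosisHead(nn.Module):
        """Combines the lesion representation with an ambiguity sample to produce a diagnosis."""
        def __init__(self, feat_dim: int = 256, z_dim: int = 16, n_classes: int = 2):
            super().__init__()
            self.classifier = nn.Linear(feat_dim + z_dim, n_classes)

        def forward(self, lesion_feat, z):
            return self.classifier(torch.cat([lesion_feat, z], dim=-1))

    lesion_feat = torch.randn(1, 256)  # stand-in for the self-attention-encoded lesion representation
    prior, head = AmbiguityPriorNet(), DiagnosisHead()
    logits = head(lesion_feat, prior(lesion_feat))  # different samples yield different plausible diagnoses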

    InstructSeq: Unifying Vision Tasks with Instruction-conditioned Multi-modal Sequence Generation

    Full text link
    Empowering models to dynamically accomplish tasks specified through natural language instructions represents a promising path toward more capable and general artificial intelligence. In this work, we introduce InstructSeq, an instruction-conditioned multi-modal modeling framework that unifies diverse vision tasks through flexible natural language control and handling of both visual and textual data. InstructSeq employs a multi-modal transformer architecture encompassing visual, language, and sequential modeling. We utilize a visual encoder to extract image features and a text encoder to encode instructions. An autoregressive transformer fuses the representations and generates sequential task outputs. By training with LLM-generated natural language instructions, InstructSeq acquires a strong comprehension of free-form instructions for specifying visual tasks. This provides an intuitive interface for directing capabilities using flexible natural instructions. Without any task-specific tuning, InstructSeq achieves compelling performance on semantic segmentation, referring expression segmentation/comprehension, and image captioning. The flexible control and multi-task unification empower the model with more human-like versatility and generalizability for computer vision. The code will be released soon at https://github.com/rongyaofang/InstructSeq. Comment: 10 pages.
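
    A minimal conceptual sketch of instruction-conditioned sequence generation as described above, not the released InstructSeq code; the encoder stand-ins, dimensions, and vocabulary size are illustrative assumptions.

    import torch
    import torch.nn as nn

    class InstructionConditionedModel(nn.Module):
        def __init__(self, vocab_size=1000, d_model=256):
            super().__init__()
            self.visual_proj = nn.Linear(512, d_model)   # stand-in for a visual encoder's features
            self.text_proj = nn.Linear(512, d_model)     # stand-in for a text (instruction) encoder
            self.token_emb = nn.Embedding(vocab_size, d_model)
            decoder_layer = nn.TransformerDecoderLayer(d_model, nhead=4, batch_first=True)
            self.decoder = nn.TransformerDecoder(decoder_layer, num_layers=2)
            self.out = nn.Linear(d_model, vocab_size)

        def forward(self, image_feats, instr_feats, prev_tokens):
            # Fuse image and instruction features into one memory sequence, then decode
            # output tokens autoregressively (tokens may represent masks, boxes, or text).
            memory = torch.cat([self.visual_proj(image_feats), self.text_proj(instr_feats)], dim=1)
            tgt = self.token_emb(prev_tokens)
            return self.out(self.decoder(tgt, memory))

    model = InstructionConditionedModel()
    logits = model(torch.randn(1, 196, 512), torch.randn(1, 16, 512), torch.randint(0, 1000, (1, 8)))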

    FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis

    Full text link
    In this study, we delve into the generation of high-resolution images from pre-trained diffusion models, addressing persistent challenges, such as repetitive patterns and structural distortions, that emerge when models are applied beyond their trained resolutions. To address this issue, we introduce FouriScale, an innovative, training-free approach built from the perspective of frequency-domain analysis. We replace the original convolutional layers in pre-trained diffusion models with a dilation technique combined with a low-pass operation, aiming to achieve structural consistency and scale consistency across resolutions, respectively. Further enhanced by a padding-then-crop strategy, our method can flexibly handle text-to-image generation of various aspect ratios. By using FouriScale as guidance, our method successfully balances the structural integrity and fidelity of generated images, achieving arbitrary-size, high-resolution, and high-quality generation. With its simplicity and compatibility, our method can provide valuable insights for future explorations into the synthesis of ultra-high-resolution images. The code will be released at https://github.com/LeonHLJ/FouriScale.
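
    A simplified sketch of the frequency-domain idea (dilated convolution paired with a low-pass operation), not the official FouriScale implementation; the cutoff choice, padding, and function names are assumptions.

    import torch
    import torch.nn.functional as F

    def low_pass(x: torch.Tensor, keep_ratio: float = 0.5) -> torch.Tensor:
        """Zero out high-frequency components of a feature map (B, C, H, W) in the Fourier domain."""
        freq = torch.fft.fftshift(torch.fft.fft2(x), dim=(-2, -1))
        B, C, H, W = x.shape
        mask = torch.zeros(H, W, device=x.device)
        h, w = int(H * keep_ratio / 2), int(W * keep_ratio / 2)
        mask[H // 2 - h:H // 2 + h, W // 2 - w:W // 2 + w] = 1.0
        return torch.fft.ifft2(torch.fft.ifftshift(freq * mask, dim=(-2, -1))).real

    def fouriscale_like_conv(x, weight, bias, scale: int = 2):
        """Apply a pre-trained 3x3 kernel with dilation matching the upscale factor, on low-passed input."""
        x = low_pass(x, keep_ratio=1.0 / scale)
        return F.conv2d(x, weight, bias, padding=scale, dilation=scale)

    x = torch.randn(1, 8, 64, 64)
    w, b = torch.randn(16, 8, 3, 3), torch.randn(16)
    y = fouriscale_like_conv(x, w, b, scale=2)  # spatial size is preserved for a 3x3 kernel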

    A Novel Deep-Trench Super-Junction SiC MOSFET with Improved Specific On-Resistance

    No full text
    In this paper, a novel 4H-SiC deep-trench super-junction MOSFET (Metal-Oxide-Semiconductor Field-Effect Transistor) with a split gate is proposed and theoretically verified by Sentaurus TCAD simulations. A deep trench filled with P-poly-Si, combined with the P-SiC region, produces a charge-balance effect. Compared with the full-SiC P region of a conventional super-junction MOSFET, the new structure shrinks the SiC P region, thus helping to lower the specific on-resistance. As a result, the figure of merit (FoM, BV^2/R_on,sp) of the proposed structure is 642% and 39.65% higher than that of the C-MOS and the SJ-MOS, respectively.
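
    A quick worked illustration of how such a FoM comparison is computed; the breakdown voltage and specific on-resistance values below are hypothetical placeholders, not the paper's data.

    def figure_of_merit(bv_volts: float, r_on_sp_mohm_cm2: float) -> float:
        """FoM = BV^2 / R_on,sp (here in V^2 per mOhm*cm^2)."""
        return bv_volts ** 2 / r_on_sp_mohm_cm2

    fom_new = figure_of_merit(bv_volts=1500.0, r_on_sp_mohm_cm2=1.0)  # proposed device (example numbers)
    fom_ref = figure_of_merit(bv_volts=1200.0, r_on_sp_mohm_cm2=2.0)  # reference device (example numbers)
    improvement = (fom_new / fom_ref - 1.0) * 100.0
    print(f"FoM improvement: {improvement:.1f}%")  # 212.5% with these placeholder values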