42 research outputs found

    USAGE: A Unified Seed Area Generation Paradigm for Weakly Supervised Semantic Segmentation

    Seed area generation is usually the starting point of weakly supervised semantic segmentation (WSSS). Computing the Class Activation Map (CAM) from a multi-label classification network is the de facto paradigm for seed area generation, but CAMs generated from Convolutional Neural Networks (CNNs) and Transformers are prone to under- and over-activation, respectively, so strategies that refine CAMs for CNNs are usually inappropriate for Transformers, and vice versa. In this paper, we propose a Unified optimization paradigm for Seed Area GEneration (USAGE) for both types of networks, in which the objective function to be optimized consists of two terms: one is a generation loss, which controls the shape of seed areas by a temperature parameter following a deterministic principle for different types of networks; the other is a regularization loss, which ensures the consistency between the seed areas that are generated by self-adaptive network adjustment from different views, to overturn false activation in seed areas. Experimental results show that USAGE consistently improves seed area generation for both CNNs and Transformers by large margins, e.g., outperforming state-of-the-art methods by 4.1% mIoU on PASCAL VOC. Moreover, based on the USAGE-generated seed areas on Transformers, we achieve state-of-the-art WSSS results on both PASCAL VOC and MS COCO.

    BLAT: Bootstrapping Language-Audio Pre-training based on AudioSet Tag-guided Synthetic Data

    Compared with ample visual-text pre-training research, few works explore audio-text pre-training, mostly due to the lack of sufficient parallel audio-text data. Most existing methods incorporate the visual modality as a pivot for audio-text pre-training, which inevitably induces data noise. In this paper, we propose BLAT: Bootstrapping Language-Audio pre-training based on Tag-guided synthetic data. We utilize audio captioning to generate text directly from audio, without the aid of the visual modality, so that potential noise from modality mismatch is eliminated. Furthermore, we propose caption generation under the guidance of AudioSet tags, leading to more accurate captions. With the above two improvements, we curate high-quality, large-scale parallel audio-text data, based on which we perform audio-text pre-training. Evaluation on a series of downstream tasks indicates that BLAT achieves SOTA zero-shot classification performance on most datasets and significant performance improvement when fine-tuned on downstream tasks, suggesting the effectiveness of our synthetic data.

    Architecture-Agnostic Masked Image Modeling -- From ViT back to CNN

    Masked image modeling (MIM), an emerging self-supervised pre-training method, has shown impressive success across numerous downstream vision tasks with Vision Transformers (ViTs). Its underlying idea is simple: a portion of the input image is randomly masked out and then reconstructed via the pre-text task. However, the working principle behind MIM is not well explained, and previous studies insist that MIM primarily works for the Transformer family but is incompatible with CNNs. In this paper, we first study interactions among patches to understand what knowledge is learned and how it is acquired via the MIM task. We observe that MIM essentially teaches the model to learn better middle-order interactions among patches and extract more generalized features. Based on this fact, we propose an Architecture-Agnostic Masked Image Modeling framework (A²MIM), which is compatible with both Transformers and CNNs in a unified way. Extensive experiments on popular benchmarks show that our A²MIM learns better representations without explicit design and endows the backbone model with the stronger capability to transfer to various downstream tasks for both Transformers and CNNs. Comment: Preprint under review (update reversion). The source code will be released in https://github.com/Westlake-AI/openmixu

    A Survey on Label-efficient Deep Image Segmentation: Bridging the Gap between Weak Supervision and Dense Prediction

    The rapid development of deep learning has driven great progress in image segmentation, one of the fundamental tasks of computer vision. However, current segmentation algorithms mostly rely on the availability of pixel-level annotations, which are often expensive, tedious, and laborious to obtain. To alleviate this burden, recent years have witnessed increasing attention to building label-efficient, deep-learning-based image segmentation algorithms. This paper offers a comprehensive review of label-efficient image segmentation methods. To this end, we first develop a taxonomy to organize these methods according to the supervision provided by different types of weak labels (including no supervision, inexact supervision, incomplete supervision and inaccurate supervision) and supplemented by the types of segmentation problems (including semantic segmentation, instance segmentation and panoptic segmentation). Next, we summarize the existing label-efficient image segmentation methods from a unified perspective that discusses an important question: how to bridge the gap between weak supervision and dense prediction -- the current methods are mostly based on heuristic priors, such as cross-pixel similarity, cross-label constraint, cross-view consistency, and cross-image relation. Finally, we share our opinions about the future research directions for label-efficient deep image segmentation. Comment: Accepted to IEEE TPAM

    LncRNA-mediated cartilage homeostasis in osteoarthritis: a narrative review

    Osteoarthritis (OA) is a degenerative disease of cartilage that affects the quality of life and has increased in morbidity and mortality in recent years. Cartilage homeostasis and dysregulation are thought to be important mechanisms involved in the development of OA. Many studies suggest that lncRNAs are involved in cartilage homeostasis in OA and that lncRNAs can be used to diagnose or treat OA. Among the existing therapeutic regimens, lncRNAs are involved in drug- and nondrug-mediated therapeutic mechanisms and are expected to improve the mechanism of adverse effects or drug resistance. Moreover, targeted lncRNA therapy may also prevent or treat OA. The purpose of this review is to summarize the links between lncRNAs and cartilage homeostasis in OA. In addition, we review the potential applications of lncRNAs at multiple levels of adjuvant and targeted therapies. This review highlights that targeting lncRNAs may be a novel therapeutic strategy for improving and modulating cartilage homeostasis in OA patients.

    Latent Abnormal Pathology Affects Long-Term Graft Function in Elder Living Renal Allograft Recipients

    Objective. This study evaluated the long-term effects and clinical significance of latent abnormal pathology on elder living donor kidney graft function after renal transplantation in China. Methods. One hundred and thirty-eight living donor renal transplantations have been carried out at our hospital in recent years. Of these, 72 Time-Zero biopsies were performed and used in this analysis. Clinical data were retrospectively measured at 3, 6, 12, and 24 months after renal transplants. Relationships and effects from biopsy results taken from implanted donor kidney grafts were analyzed. Results. Time-Zero biopsy pathology results from donor kidneys showed that 48.61% of donor kidneys had latent abnormal changes; arterial lesions of donor kidneys had significant effects on the renal function of grafts 2 years after transplantation; correlations between donor age and arterial lesions were significant; and Time-Zero biopsy pathology results could help predict the long-term function of a renal graft. Conclusions. Existing latent pathological changes of an elder living donor kidney before transplantation could affect long-term renal function. Whether a senior donor is used should be very carefully considered.

    Fetal Brain Tissue Annotation and Segmentation Challenge Results

    In-utero fetal MRI is emerging as an important tool in the diagnosis and analysis of the developing human brain. Automatic segmentation of the developing fetal brain is a vital step in the quantitative analysis of prenatal neurodevelopment both in the research and clinical context. However, manual segmentation of cerebral structures is time-consuming and prone to error and inter-observer variability. Therefore, we organized the Fetal Tissue Annotation (FeTA) Challenge in 2021 in order to encourage the development of automatic segmentation algorithms on an international level. The challenge used the FeTA Dataset, an open dataset of fetal brain MRI reconstructions segmented into seven different tissues (external cerebrospinal fluid, grey matter, white matter, ventricles, cerebellum, brainstem, deep grey matter). Twenty international teams participated in this challenge, submitting a total of 21 algorithms for evaluation. In this paper, we provide a detailed analysis of the results from both a technical and clinical perspective. All participants relied on deep learning methods, mainly U-Nets, with some variability present in the network architecture, optimization, and image pre- and post-processing. The majority of teams used existing medical imaging deep learning frameworks. The main differences between the submissions were the fine tuning done during training, and the specific pre- and post-processing steps performed. The challenge results showed that almost all submissions performed similarly. Four of the top five teams used ensemble learning methods. However, one team's algorithm performed significantly better than the other submissions, and consisted of an asymmetrical U-Net network architecture. This paper provides a first-of-its-kind benchmark for future automatic multi-tissue segmentation algorithms for the developing human brain in utero. Comment: Results from FeTA Challenge 2021, held at MICCAI; Manuscript submitte

    A New Flattened Cylinder Specimen for Direct Tensile Test of Rock

    In recent decades, researchers have paid more attention to the indirect tensile test than to the direct tensile test (DTT) of rocks, mainly due to difficulties in the alignment and the stress concentration at the end of an intact cylindrical specimen. In this paper, a new flattened cylinder specimen and a clamp device were designed to obtain the true tensile strength of the rock in DTT. Stress distributions of the specimen with different lengths (l) and cutting thicknesses (t) were analyzed, damage processes of the specimen were monitored by Digital Image Correlation (DIC), and the fractured sections were also scanned. Different mechanical parameters were also obtained by the DTT of the flattened cylinder specimens and the intact cylinder specimens, as well as the Brazilian disc. Research results show that the tensile strength obtained by DTT with the flattened cylinder specimen is smaller than that from the Brazilian disc and slightly greater than that from the intact cylindrical specimen. The flattened cylinder specimen with 0.20 ≤ 2t/D < 0.68 and 0.10 ≤ l/D ≤ 0.20 is recommended to measure the true tensile strength of rock material in DTT. This new specimen shape shows promise for extension to the uniaxial or triaxial direct tension test.

    Characteristic Analysis and Circuit Implementation of a Novel Fractional-Order Memristor-Based Clamping Voltage Drift

    The ideal magnetic flux-controlled memristor was introduced into a four-dimensional chaotic system and combined with fractional calculus theory, and a novel four-dimensional commensurate fractional-order system was proposed and solved using the Adomian decomposition method. The system orders, parameters, and initial values were studied as independent variables in the bifurcation diagram and Lyapunov exponents spectrum, and it was discovered that changing these variables can cause the system to exhibit more complex and rich dynamical behaviors. The system exhibited offset boosting, which was discovered by adding a constant term after the decoupled linear term. Finally, the results of the numerical simulation were verified through the use of analog circuits and FPGA designs, and a control scheme for the system circuit was also suggested.