
    Conv2Former: A Simple Transformer-Style ConvNet for Visual Recognition

    This paper does not attempt to design a state-of-the-art method for visual recognition but instead investigates a more efficient way to use convolutions to encode spatial features. By comparing the design principles of recent convolutional neural networks (ConvNets) and Vision Transformers, we propose to simplify self-attention by leveraging a convolutional modulation operation. We show that this simple approach can better take advantage of the large kernels (>= 7x7) nested in convolutional layers. We build a family of hierarchical ConvNets using the proposed convolutional modulation, termed Conv2Former. Our network is simple and easy to follow. Experiments show that Conv2Former outperforms existing popular ConvNets and Vision Transformers, such as Swin Transformer and ConvNeXt, on ImageNet classification, COCO object detection, and ADE20K semantic segmentation.
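    A minimal sketch can make the convolutional modulation idea concrete: a large-kernel depthwise convolution produces spatial weights that re-scale a projected value map by element-wise multiplication, taking the place of softmax self-attention. The PyTorch module below is an illustrative assumption (the layer names, the 11x11 kernel, and the 1x1 projections are our choices, not the authors' released code).

```python
import torch
import torch.nn as nn

class ConvModulation(nn.Module):
    """Sketch of a convolutional modulation block: a large-kernel depthwise
    convolution yields spatial weights that modulate a projected value map
    via element-wise multiplication instead of softmax attention."""

    def __init__(self, dim: int, kernel_size: int = 11):
        super().__init__()
        self.proj_a = nn.Conv2d(dim, dim, 1)                           # projection for the modulation branch
        self.dwconv = nn.Conv2d(dim, dim, kernel_size,
                                padding=kernel_size // 2, groups=dim)  # large-kernel depthwise conv
        self.proj_v = nn.Conv2d(dim, dim, 1)                           # value projection
        self.proj_out = nn.Conv2d(dim, dim, 1)                         # output projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        a = self.dwconv(self.proj_a(x))   # spatial context gathered by the large kernel
        v = self.proj_v(x)
        return self.proj_out(a * v)       # element-wise modulation of the value map

if __name__ == "__main__":
    x = torch.randn(1, 64, 56, 56)
    print(ConvModulation(64)(x).shape)    # -> torch.Size([1, 64, 56, 56])
```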

    Delving Deeper into Data Scaling in Masked Image Modeling

    Understanding whether self-supervised learning methods can scale with unlimited data is crucial for training large-scale models. In this work, we conduct an empirical study of the scaling capability of masked image modeling (MIM) methods (e.g., MAE) for visual recognition. Unlike most previous works that rely on the widely used ImageNet dataset, which is manually curated and object-centric, we take a step further and investigate this problem in a more practical setting. Specifically, we use the web-collected Coyo-700M dataset: we randomly sample varying numbers of training images from it to construct a series of sub-datasets containing 0.5M, 1M, 5M, 10M, and 100M images for pre-training. Our goal is to investigate how downstream performance changes when scaling data and model sizes. The study reveals that: 1) MIM can be viewed as an effective way to improve model capacity when the scale of the training data is relatively small; 2) strong reconstruction targets endow models with increased capability on downstream tasks; 3) MIM pre-training is data-agnostic in most scenarios, meaning that the strategy for sampling pre-training data is non-critical. We hope these observations provide valuable insights for future research on MIM.
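    For readers unfamiliar with MIM, the pre-training step being scaled here follows the MAE recipe: a large fraction of image patches is masked at random and the model is trained to reconstruct them. The snippet below is a generic sketch of that random-masking step, not the paper's pipeline; the 75% mask ratio and the (B, N, D) tensor layout are assumptions.

```python
import torch

def random_masking(patches: torch.Tensor, mask_ratio: float = 0.75):
    """Randomly hide a fraction of patch tokens, MAE-style.
    patches: (B, N, D) batch of N patch embeddings of dimension D."""
    B, N, D = patches.shape
    num_keep = int(N * (1 - mask_ratio))
    noise = torch.rand(B, N)                           # one random score per patch
    ids_keep = noise.argsort(dim=1)[:, :num_keep]      # patches with the lowest scores stay visible
    visible = torch.gather(patches, 1,
                           ids_keep.unsqueeze(-1).expand(-1, -1, D))
    mask = torch.ones(B, N)
    mask.scatter_(1, ids_keep, 0.0)                    # 0 = visible, 1 = masked (to be reconstructed)
    return visible, mask, ids_keep
```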

    SegNeXt: Rethinking Convolutional Attention Design for Semantic Segmentation

    We present SegNeXt, a simple convolutional network architecture for semantic segmentation. Recent transformer-based models have dominated the field of semantic segmentation due to the efficiency of self-attention in encoding spatial information. In this paper, we show that convolutional attention is a more efficient and effective way to encode contextual information than the self-attention mechanism in transformers. By re-examining the characteristics of successful segmentation models, we identify several key components that drive their performance. This motivates us to design a novel convolutional attention network that uses cheap convolutional operations. Without bells and whistles, SegNeXt significantly improves on previous state-of-the-art methods across popular benchmarks, including ADE20K, Cityscapes, COCO-Stuff, Pascal VOC, Pascal Context, and iSAID. Notably, SegNeXt outperforms EfficientNet-L2 w/ NAS-FPN and achieves 90.6% mIoU on the Pascal VOC 2012 test leaderboard with only 1/10 of its parameters. On average, SegNeXt achieves about a 2.0% mIoU improvement over state-of-the-art methods on the ADE20K dataset with the same or fewer computations. Code is available at https://github.com/uyzhang/JSeg (Jittor) and https://github.com/Visual-Attention-Network/SegNeXt (PyTorch).
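    As an illustration of what convolutional attention built from cheap convolutional operations can look like, the sketch below combines a local depthwise convolution with depthwise strip convolutions and uses the result to re-weight the input feature map. The kernel sizes and single-branch layout are simplifying assumptions for illustration, not the released SegNeXt module.

```python
import torch
import torch.nn as nn

class ConvAttention(nn.Module):
    """Sketch of convolutional attention: cheap depthwise and strip convolutions
    gather multi-scale context, and the result re-weights the input features."""

    def __init__(self, dim: int):
        super().__init__()
        self.local = nn.Conv2d(dim, dim, 5, padding=2, groups=dim)               # local depthwise conv
        self.strip_h = nn.Conv2d(dim, dim, (1, 11), padding=(0, 5), groups=dim)  # horizontal strip conv
        self.strip_v = nn.Conv2d(dim, dim, (11, 1), padding=(5, 0), groups=dim)  # vertical strip conv
        self.mix = nn.Conv2d(dim, dim, 1)                                         # channel mixing

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn = self.local(x)
        attn = attn + self.strip_v(self.strip_h(attn))   # approximate a large kernel with strips
        attn = self.mix(attn)
        return attn * x                                   # use the context map as attention weights

if __name__ == "__main__":
    x = torch.randn(1, 64, 32, 32)
    print(ConvAttention(64)(x).shape)   # -> torch.Size([1, 64, 32, 32])
```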

    Graphite Nanoeraser

    We present a method for cleaning intermediate-size (5-50 nm) contamination from highly oriented pyrolytic graphite. Electron-beam deposition causes a continuous build-up of carbonaceous material on graphene and graphite surfaces that is difficult to remove by conventional techniques. Direct mechanical wiping with a graphite nanoeraser is observed to drastically reduce the amount of contamination. After the mechanical removal of contamination, the graphite surfaces were able to self-retract after shearing, indicating that van der Waals contact bonding is restored. Since contact bonding indicates a level of cleanliness normally attainable only in a high-quality cleanroom, we discuss potential applications in the preparation of ultraclean surfaces.

    The UDP-glucosyltransferase multigene family in Bombyx mori

    Background: Glucosidation plays a major role in the inactivation and excretion of a great variety of both endogenous and exogenous compounds, and a class of UDP-glycosyltransferases (UGTs) is involved in this process. Insect UGTs play important roles in several processes, including detoxication of substrates such as plant allelochemicals, cuticle formation, pigmentation, and olfaction. Identification and characterization of Bombyx mori UGT genes could provide valuable basic information for this important family and help explain detoxication and other processes in insects.
    Results: Taking advantage of the newly assembled genome sequence, we performed a genome-wide analysis of the candidate UGT family in the silkworm, B. mori. Based on the UGT signature and similarity to UGT homologs from other organisms, we identified 42 putative silkworm UGT genes. Most of them are clustered on the silkworm chromosomes, with two major clusters on chromosomes 7 and 28, respectively. Phylogenetic analysis of the 42 identified UGT protein sequences revealed five major groups. A comparison of the silkworm UGTs with homologs from other sequenced insect genomes indicated that some UGTs are silkworm-specific. The expression patterns of these candidate genes were investigated using known expressed sequence tags (ESTs), microarray data, and RT-PCR. In total, 36 genes were expressed in the tissues examined and showed distinct expression profiles, indicating that these UGT genes may have different functions.
    Conclusion: B. mori possesses the largest insect UGT gene family characterized to date, comprising 42 genes. Phylogenetic analysis, genomic organization, and expression profiles provide an overview of the silkworm UGTs and will facilitate their functional studies in the future.