
    Cross-domain Few-shot Segmentation with Transductive Fine-tuning

    Few-shot segmentation (FSS) expects models trained on base classes to work on novel classes with the help of a few support images. However, when a domain gap exists between the base and novel classes, state-of-the-art FSS methods may fail to segment even simple objects. To improve their performance on unseen domains, we propose to transductively fine-tune the base model on a set of query images under the few-shot setting, where the core idea is to implicitly guide the segmentation of query images using support labels. Although different images are not directly comparable, their class-wise prototypes should be aligned in the feature space. By aligning query and support prototypes with an uncertainty-aware contrastive loss, and using a supervised cross-entropy loss and an unsupervised boundary loss as regularizations, our method can generalize the base model to the target domain without additional labels. We conduct extensive experiments under various cross-domain settings of natural, remote sensing, and medical images. The results show that our method consistently and significantly improves the performance of prototypical FSS models in all cross-domain tasks. (Comment: 12 pages, 8 figures)
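    The prototype-alignment idea above can be sketched in a few lines. The following is a minimal NumPy illustration, not the paper's implementation: the uncertainty weighting is omitted, a plain cosine alignment term stands in for the uncertainty-aware contrastive loss, and all function names are ours.

```python
import numpy as np

def masked_prototype(features, mask):
    """Class prototype via masked average pooling over pixel features.
    features: (H, W, C) feature map; mask: (H, W) binary foreground mask."""
    w = mask[..., None]                       # (H, W, 1)
    return (features * w).sum(axis=(0, 1)) / max(w.sum(), 1e-8)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def prototype_alignment_loss(query_proto, support_proto):
    """Simple alignment term: pull the query prototype toward the support
    prototype of the same class (the paper's uncertainty weighting is
    deliberately left out of this sketch)."""
    return 1.0 - cosine(query_proto, support_proto)

# toy example: identical feature maps give (near-)zero alignment loss
rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 4, 8))
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0
p_q = masked_prototype(feats, mask)
p_s = masked_prototype(feats, mask)
loss = prototype_alignment_loss(p_q, p_s)
```

    In the full method this term is computed transductively over a set of unlabeled query images, alongside the cross-entropy and boundary regularizers described above.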

    Learning Depth From Images

    Estimating depth from images has become a very popular task in computer vision; it aims to recover the 3D scene from 2D images and extract important geometric knowledge of the scene. Its performance has been significantly improved by convolutional neural networks in recent years, which surpass traditional methods by a large margin. However, natural scenes are usually complicated, and it is hard to build correspondences between pixels across frames in regions containing moving objects, illumination changes, occlusions, and reflections. This research explores rich and comprehensive spatial correspondence across images and designs three new network architectures for depth estimation whose inputs can be a single image, a stereo pair, or monocular video. First, we propose a novel semantic stereo network named SSPCV-Net, which includes newly designed pyramid cost volumes describing semantic and spatial correspondence at multiple levels. The semantic features are inferred from a semantic segmentation subnetwork, while the spatial features are constructed by hierarchical spatial pooling. We then design a 3D multi-cost aggregation module to integrate the extracted multilevel correspondence and perform regression for accurate disparity maps. We conduct comprehensive experiments and comparisons with recent stereo matching networks on the Scene Flow, KITTI 2015 and 2012, and Cityscapes benchmark datasets, and the results show that the proposed SSPCV-Net significantly advances the state of the art in stereo matching. Second, we present a novel SC-GAN network with end-to-end adversarial training for depth estimation from monocular videos without estimating the camera pose and pose change over time.
To exploit cross-frame relations, SC-GAN includes a spatial correspondence module that uses Smolyak sparse grids to efficiently match features across adjacent frames, and an attention mechanism to learn the importance of features in different directions. Furthermore, the generator in SC-GAN learns to estimate depth from the input frames, while the discriminator learns to distinguish between the ground-truth and estimated depth maps for the reference frame. Experiments on the KITTI and Cityscapes datasets show that the proposed SC-GAN achieves much more accurate depth maps than many existing state-of-the-art methods on monocular videos. Finally, we propose a new method for single-image depth estimation that utilizes the spatial correspondence from stereo matching. To achieve this goal, we incorporate a pre-trained stereo network as a teacher to provide depth cues for the features and output generated by the student network, a monocular depth estimation network. To further leverage the depth cues, we develop a new depth-aware convolution operation that can adaptively choose subsets of relevant features for convolutions at each location. Specifically, we compute hierarchical depth features as guidance and then estimate the depth map using this depth-aware convolution, which leverages the guidance to adapt the filters. Experimental results on the KITTI online benchmark and the Eigen split show that the proposed method achieves state-of-the-art performance for single-image depth estimation.
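The depth-aware convolution described above can be illustrated with a toy single-channel version. This is a sketch under our own assumptions (a Gaussian depth-affinity weighting of neighbors and weight normalization), not the paper's exact operator:

```python
import numpy as np

def depth_aware_conv(x, depth, kernel, sigma=1.0):
    """Toy depth-aware convolution on a single-channel map: each neighbor's
    kernel weight is modulated by how close its depth guidance is to the
    centre pixel's depth, so features at very different depths contribute
    less.  x, depth: (H, W); kernel: (k, k) with k odd."""
    k = kernel.shape[0]
    r = k // 2
    H, W = x.shape
    xp = np.pad(x, r, mode="edge")
    dp = np.pad(depth, r, mode="edge")
    out = np.zeros_like(x, dtype=float)
    for i in range(H):
        for j in range(W):
            patch = xp[i:i + k, j:j + k]
            dpatch = dp[i:i + k, j:j + k]
            # depth affinity: Gaussian in the depth difference to the centre
            aff = np.exp(-((dpatch - depth[i, j]) ** 2) / (2 * sigma ** 2))
            w = kernel * aff
            out[i, j] = (patch * w).sum() / (w.sum() + 1e-8)
    return out

# sanity check: a constant image stays constant whatever the depth guidance
img = np.ones((5, 5))
dep = np.arange(25, dtype=float).reshape(5, 5)
box = np.ones((3, 3))
y = depth_aware_conv(img, dep, box)
```

A learned version would predict the affinity from hierarchical depth features rather than using a fixed Gaussian, but the adaptive-filter idea is the same.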

    Few-Shot 3D Point Cloud Semantic Segmentation via Stratified Class-Specific Attention Based Transformer Network

    3D point cloud semantic segmentation aims to group all points into different semantic categories, which benefits important applications such as point cloud scene reconstruction and understanding. Existing supervised point cloud semantic segmentation methods usually require large-scale annotated point clouds for training and cannot handle new categories. While a few-shot learning method was recently proposed to address these two problems, it suffers from high computational complexity caused by graph construction and an inability to learn fine-grained relationships among points due to its use of pooling operations. In this paper, we further address these problems by developing a new multi-layer transformer network for few-shot point cloud semantic segmentation. In the proposed network, the query point cloud features are aggregated based on the class-specific support features at different scales. Without using pooling operations, our method makes full use of all point-level features from the support samples. By better leveraging the support features for few-shot learning, the proposed method achieves new state-of-the-art performance, with 15% less inference time, over existing few-shot 3D point cloud segmentation models on the S3DIS and ScanNet datasets.
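    The class-specific aggregation step can be sketched with plain scaled dot-product attention; this simplification drops the paper's stratified multi-scale design and is only meant to show the mechanism of attending from query points to one class's support features:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def class_specific_attention(query, support, support_labels, cls):
    """Aggregate query point features against the support features of a
    single class with scaled dot-product attention.
    query: (Nq, C); support: (Ns, C); support_labels: (Ns,) int labels."""
    keys = support[support_labels == cls]           # class-specific support
    scores = query @ keys.T / np.sqrt(query.shape[1])
    attn = softmax(scores, axis=1)                  # (Nq, Nk), rows sum to 1
    return attn @ keys                              # (Nq, C) aggregated

rng = np.random.default_rng(1)
q = rng.normal(size=(6, 16))                        # 6 query points
s = rng.normal(size=(10, 16))                       # 10 support points
lab = np.array([0, 0, 0, 1, 1, 1, 1, 2, 2, 2])
agg = class_specific_attention(q, s, lab, cls=1)
```

Because every support point participates in the attention, no pooling is needed, which is the property the abstract highlights.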

    ATLANTIS: A Benchmark for Semantic Segmentation of Waterbody Images

    Vision-based semantic segmentation of waterbodies and nearby related objects provides important information for managing water resources and responding to flooding emergencies. However, the lack of large-scale labeled training and testing datasets for water-related categories prevents researchers from studying water-related issues in the computer vision field. To tackle this problem, we present ATLANTIS, a new benchmark for semantic segmentation of waterbodies and related objects. ATLANTIS consists of 5,195 images of waterbodies, together with high-quality pixel-level manual annotations of 56 object classes: 17 classes of man-made objects, 18 classes of natural objects, and 21 general classes. We analyze ATLANTIS in detail and evaluate several state-of-the-art semantic segmentation networks on our benchmark. In addition, a novel deep neural network, AQUANet, is developed for waterbody semantic segmentation by processing the aquatic and non-aquatic regions along two different paths. AQUANet also incorporates low-level feature modulation and cross-path modulation to enhance feature representation. Experimental results show that the proposed AQUANet outperforms other state-of-the-art semantic segmentation networks on ATLANTIS. To our knowledge, ATLANTIS is the largest waterbody image dataset for semantic segmentation, providing a wide range of water and water-related classes, and it will benefit researchers in both computer vision and water resources engineering.

    Transient enhancement of magnetization damping in CoFeB film via pulsed laser excitation

    Laser-induced spin dynamics of in-plane magnetized CoFeB films has been studied using time-resolved magneto-optical Kerr effect measurements. While the effective demagnetization field shows little dependence on the pump laser fluence, the intrinsic damping constant is found to increase from 0.008 to 0.076 as the pump fluence increases from 2 mJ/cm2 to 20 mJ/cm2. This sharp enhancement is shown to be transient and is ascribed to the heating effect induced by the pump laser excitation, as the damping constant is almost unchanged when the pump-probe measurements are performed at a fixed pump fluence of 5 mJ/cm2 after irradiation by high-power pump pulses.
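    For readers unfamiliar with how such damping constants are obtained, time-resolved Kerr traces are typically fitted to a damped precession M(t) ~ exp(-t/tau)·sin(2πft), and the effective Gilbert damping follows from the common small-angle approximation alpha ≈ 1/(2πfτ). A sketch with illustrative numbers (not values reported in this work):

```python
import numpy as np

def damping_from_fit(freq_ghz, tau_ns):
    """Effective Gilbert damping from a damped-precession fit, using the
    small-angle approximation alpha ~ 1 / (2*pi*f*tau).  Corrections for
    precession ellipticity are ignored in this sketch."""
    f = freq_ghz * 1e9     # precession frequency in Hz
    tau = tau_ns * 1e-9    # relaxation time in s
    return 1.0 / (2 * np.pi * f * tau)

# e.g. a 10 GHz precession whose envelope decays with tau = 0.2 ns
alpha = damping_from_fit(10.0, 0.2)
```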

    Interface magnetic and electrical properties of CoFeB /InAs heterostructures

    Amorphous magnetic CoFeB ultrathin films have been synthesized on the narrow-band-gap semiconductor InAs(100) surface, and the nature of the interface magnetic anisotropy and electrical contact has been studied. Angle-dependent hysteresis loops reveal that the films have an in-plane uniaxial magnetic anisotropy (UMA) with the easy axis along the InAs [0-11] crystal direction. The UMA is found to depend on the annealing temperature of the substrates, which indicates the significant role of Fe,Co-As bonding at the interface, related to the surface condition of the InAs(100). I-V measurements show an ohmic contact between the CoFeB films and the InAs substrates that is not affected by the surface condition of the InAs(100).

    Research on Government Compensation for Toll Road Public-Private Partnerships (PPP) Projects

    As toll road PPP projects are continuously implemented, imperfect supporting policies have led to recurring blindness in government subsidy decisions. It is therefore important to put forward a practical approach for valuing subsidies and risk. In this paper, a revenue-subsidy model and a traffic-subsidy model are established. Then, combined with practical cases, Monte Carlo simulation is used to obtain the value and probability of government subsidies under different compensation schemes. On this basis, the influence of initial traffic volume and traffic growth rate on government subsidies and on the net present value for investors is analyzed. The findings can provide theoretical guidance for the government to choose a reasonable subsidy scheme, balance compensation risk, and formulate subsidy policies.
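    The Monte Carlo step can be sketched as follows for a minimum-revenue-guarantee scheme. All parameter values, the normally distributed growth model, and the function name are illustrative assumptions, not the paper's calibration:

```python
import numpy as np

def simulate_revenue_subsidy(n_sims=10_000, years=20, toll=5.0,
                             base_traffic=1.0e6, growth_mean=0.03,
                             growth_sd=0.02, guaranteed_revenue=5.5e6,
                             discount=0.06, seed=42):
    """Monte Carlo sketch of a minimum-revenue-guarantee subsidy: each year
    the government tops toll revenue up to a guaranteed level.  Returns the
    expected present value of the subsidy and the probability that any
    subsidy is paid over the concession period."""
    rng = np.random.default_rng(seed)
    # yearly traffic growth rates, one 20-year path per simulation
    g = rng.normal(growth_mean, growth_sd, size=(n_sims, years))
    traffic = base_traffic * np.cumprod(1.0 + g, axis=1)
    revenue = toll * traffic
    shortfall = np.maximum(guaranteed_revenue - revenue, 0.0)
    disc = (1.0 + discount) ** -np.arange(1, years + 1)
    pv_subsidy = (shortfall * disc).sum(axis=1)
    return pv_subsidy.mean(), (pv_subsidy > 0).mean()

mean_subsidy, prob_subsidy = simulate_revenue_subsidy()
```

Varying `base_traffic` and `growth_mean` across runs reproduces the kind of sensitivity analysis of subsidies and investor NPV described above.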

    Parametric Surface Constrained Upsampler Network for Point Cloud

    Designing a point cloud upsampler, which aims to generate a clean and dense point cloud from a sparse point representation, is a fundamental and challenging problem in computer vision. A line of attempts achieves this goal by establishing a point-to-point mapping function via deep neural networks. However, these approaches are prone to producing outlier points due to the lack of explicit surface-level constraints. To solve this problem, we introduce a novel surface regularizer into the upsampler network by forcing the neural network to learn the underlying parametric surface, represented by bicubic functions and rotation functions, so that the newly generated points are constrained to lie on that surface. These designs are integrated into two different networks for two tasks that take advantage of upsampling layers -- point cloud upsampling and point cloud completion -- for evaluation. State-of-the-art experimental results on both tasks demonstrate the effectiveness of the proposed method. The implementation code will be available at https://github.com/corecai163/PSCU.
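    The surface-level constraint can be illustrated by generating upsampled points directly on a bicubic patch. The coefficients and helper names below are illustrative, and the paper's learned rotation functions are omitted:

```python
import numpy as np

def bicubic_surface(coeffs, u, v):
    """Evaluate z = sum_{i,j<=3} c[i,j] * u^i * v^j on a patch.
    coeffs: (4, 4); u, v: arrays of patch parameters in [0, 1]."""
    U = np.stack([u ** i for i in range(4)], axis=-1)   # (..., 4)
    V = np.stack([v ** j for j in range(4)], axis=-1)   # (..., 4)
    return np.einsum("...i,ij,...j->...", U, coeffs, V)

def upsample_on_patch(coeffs, n=8):
    """Generate a dense n*n grid of points constrained to lie exactly on
    the bicubic patch -- the surface constraint that keeps upsampled
    points from becoming outliers."""
    u, v = np.meshgrid(np.linspace(0, 1, n), np.linspace(0, 1, n))
    z = bicubic_surface(coeffs, u, v)
    return np.stack([u, v, z], axis=-1).reshape(-1, 3)

c = np.zeros((4, 4))
c[1, 1] = 1.0                # z = u * v, a simple saddle-shaped patch
pts = upsample_on_patch(c, n=8)
```

In the actual network the patch coefficients would be predicted per local neighborhood from the sparse input rather than fixed by hand.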