31 research outputs found

    Expanding Language-Image Pretrained Models for General Video Recognition

    Contrastive language-image pretraining has shown great success in learning visual-textual joint representation from web-scale data, demonstrating remarkable "zero-shot" generalization ability for various image tasks. However, how to effectively expand such new language-image pretraining methods to video domains is still an open problem. In this work, we present a simple yet effective approach that adapts the pretrained language-image models to video recognition directly, instead of pretraining a new model from scratch. More concretely, to capture the long-range dependencies of frames along the temporal dimension, we propose a cross-frame attention mechanism that explicitly exchanges information across frames. This module is lightweight and can be plugged into pretrained language-image models seamlessly. Moreover, we propose a video-specific prompting scheme, which leverages video content information for generating discriminative textual prompts. Extensive experiments demonstrate that our approach is effective and can be generalized to different video recognition scenarios. In particular, under fully-supervised settings, our approach achieves a top-1 accuracy of 87.1% on Kinetics-400, while using 12 times fewer FLOPs compared with Swin-L and ViViT-H. In zero-shot experiments, our approach surpasses the current state-of-the-art methods by +7.6% and +14.9% in terms of top-1 accuracy under two popular protocols. In few-shot scenarios, our approach outperforms previous best methods by +32.1% and +23.1% when the labeled data is extremely limited. Code and models are available at https://aka.ms/X-CLIP. Comment: Accepted by ECCV2022, Ora
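As a rough illustration of what "exchanging information across frames" can mean, the sketch below applies plain single-head dot-product attention over one feature vector per frame. It is not the architecture from the abstract: the function name `cross_frame_attention`, the identity query/key/value projections, and the one-vector-per-frame input are all simplifying assumptions, and a real model would use learned projections.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def cross_frame_attention(frame_feats):
    """Hypothetical sketch: frame_feats is a list of T per-frame vectors of length D.

    Each output vector is a softmax-weighted mix of all T frame vectors,
    using scaled dot products as attention scores (identity projections).
    """
    d = len(frame_feats[0])
    scale = 1.0 / math.sqrt(d)
    out = []
    for q in frame_feats:
        # Score the current frame against every frame, including itself.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in frame_feats]
        weights = softmax(scores)
        # Mix the frame vectors with the attention weights.
        mixed = [sum(w * v[j] for w, v in zip(weights, frame_feats)) for j in range(d)]
        out.append(mixed)
    return out
```

Because the weights for each frame sum to one, every output stays inside the convex hull of the input frame vectors; identical frames are therefore returned unchanged.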

    Monitoring the Severity of Pantana phyllostachysae Chao Infestation in Moso Bamboo Forests Based on UAV Multi-Spectral Remote Sensing Feature Selection

    In recent years, the rapid development of unmanned aerial vehicle (UAV) remote sensing technology has provided a new means to efficiently monitor forest resources and effectively prevent and control pests and diseases. This study aims to develop a detection model to study the damage caused to Moso bamboo forests by Pantana phyllostachysae Chao (PPC), a major leaf-eating pest, at 5 cm resolution. Damage-sensitive features were extracted from multispectral images acquired by UAVs and used to train detection models based on support vector machines (SVM), random forests (RF), and extreme gradient boosting tree (XGBoost) machine learning algorithms. The overall detection accuracy (OA) and Kappa coefficient were 81.95% and 0.733 for SVM, 85.71% and 0.805 for RF, and 86.47% and 0.811 for XGBoost. Meanwhile, the detection accuracies of SVM, RF, and XGBoost were, respectively, 78.26%, 76.19%, and 80.95% for healthy, 75.00%, 83.87%, and 79.17% for mild damage, 83.33%, 86.49%, and 85.00% for moderate damage, and 82.5%, 90.91%, and 93.75% for severe damage Moso bamboo. Overall, XGBoost exhibited the best detection performance, followed by RF and SVM. Thus, the study findings provide a technical reference for the regional monitoring and control of PPC in Moso bamboo
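The overall accuracy and Kappa coefficient quoted above are both derived from a confusion matrix. A minimal sketch of the standard definitions follows; the function name and the 2x2 example matrix are hypothetical and are not data from the study.

```python
def overall_accuracy_and_kappa(confusion):
    """confusion[i][j] = number of samples of true class i predicted as class j."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal (overall accuracy).
    observed = sum(confusion[i][i] for i in range(n)) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    # Agreement expected by chance from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical two-class example, not taken from the abstract.
oa, kappa = overall_accuracy_and_kappa([[40, 10], [5, 45]])
```

Kappa discounts the agreement that the class marginals alone would produce, which is why it is reported alongside overall accuracy when class sizes differ.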

    Investigation of the Wind-Direction Effect on Buoyancy-Driven Fire Smoke Dispersion in an Urban Street Canyon

    When a fire occurs in a street canyon, smoke recirculation is the factor most harmful to people inside the canyon, and the wind condition largely determines whether the smoke is recirculated. This paper focuses on the wind direction’s effect on buoyancy-driven fire smoke dispersion in a street canyon, an effect that has not been reported before. In this study, an ideal street canyon model with a height–width ratio of 1 was established, and both the wind velocity and wind direction were varied to find the critical point at which smoke recirculation occurs. The results show that, as the wind direction angle (the angle of wind towards the direction of the street width) increases, smoke recirculation can be divided into three regimes, i.e., the “fully re-circulation stage”, the “semi re-circulation stage”, and the “non-recirculation stage”. The critical recirculation velocity increased with the wind direction angle, and new models regarding the critical wind velocity and the Froude number were proposed for different wind direction conditions
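For orientation, the classical Froude number compares inertial to gravitational effects as Fr = U / sqrt(g L). The sketch below computes that textbook form only; the models proposed in the paper are not given in the abstract, and the function name, the choice of characteristic length, and the example values are assumptions.

```python
import math

def froude_number(velocity, length, gravity=9.81):
    """Classical Froude number Fr = U / sqrt(g * L).

    velocity: wind speed U in m/s
    length:   characteristic length L in m (assumed here to be the canyon height)
    gravity:  gravitational acceleration in m/s^2
    """
    return velocity / math.sqrt(gravity * length)

# Hypothetical example: 3 m/s wind over a 20 m canyon.
fr = froude_number(3.0, 20.0)
```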

    Multi-level perception fusion dehazing network


    Pontin Acts as a Potential Biomarker for Poor Clinical Outcome and Promotes Tumor Invasion in Hilar Cholangiocarcinoma

    Hilar cholangiocarcinoma (HC) is a devastating malignancy that carries a poor overall prognosis. As a member of the AAA+ superfamily, Pontin becomes highly expressed in several malignant tumors, which contributes to tumor progression and influences tumor prognosis. In our research, Pontin expression in tumor specimens resected from 86 HC patients was detected by immunohistochemistry. Interestingly, high expression of Pontin was significantly associated with lymph node metastasis (p=0.011) and tumor node metastasis (TNM) stage (p=0.005). The Kaplan–Meier overall survival rate and multivariate analyses were performed to evaluate the prognosis of patients with HC. Patients with high Pontin expression had significantly poorer overall survival outcomes. Multivariate analyses found that Pontin was an independent prognostic factor (p=0.001). Moreover, bioinformatics analysis confirmed the increase in Pontin mRNA expression levels in cholangiocarcinoma tissues. In addition, in vitro experiments showed that Pontin expression was inhibited at the mRNA as well as protein levels after transfection with Pontin siRNA in human cholangiocarcinoma cell lines. Moreover, significant suppression of cell invasion was observed after the downregulation of Pontin. Taken together, the present study suggested that Pontin could act as a potential prognostic predictor, which might be a new valuable molecular candidate for the prevention and treatment of HC
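The Kaplan–Meier overall survival rate mentioned above is a product-limit estimate: at each time with an observed death, the running survival probability is multiplied by the fraction of at-risk patients who survive that time. A minimal sketch, with a hypothetical function name and made-up follow-up data (not the study's 86 patients):

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one observed death.

    times:  follow-up duration for each patient
    events: 1 if death was observed at that time, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after time t.
        at_risk -= removed
    return curve

# Hypothetical follow-up times (months) and event indicators.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

Censored patients lower the number at risk for later times without producing a drop in the curve themselves.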

    Mapping of bamboo forest bright and shadow areas using optical and SAR satellite data in Google Earth Engine

    Bamboo groves predominantly thrive in tropical or subtropical regions. Assessing the efficacy of remote sensing data of various types in extracting bamboo forest information from bright and shadow areas is a critical issue for achieving precise identification of bamboo forests in complex terrain. In this study, 34 features were obtained from Sentinel-1 SAR and Sentinel-2 optical images using the Google Earth Engine platform. The normalized shaded vegetation index (NSVI) was then employed to segment the bright and shadow woodlands. Different features from diverse data sources were evaluated for extracting bamboo forest information in the bright and shadow areas, and the random forest (RF) classification algorithm was then used to extract bamboo forest. The results showed that (1) the red-edge and short-wave infrared bands of Sentinel-2 optical images and their corresponding vegetation indices are significant in bamboo forest information extraction. (2) The dissimilarity and homogeneity of Sentinel-2 texture features in the bright area and dissimilarity in the shadow area, together with the Sentinel-1 backscatter features (VV and VH in the bright area and VV-VH in the shadow area), show some variability between bamboo and non-bamboo forests and can be used as effective features for bamboo forest extraction. (3) The combination of spectral, texture and backscatter features yields the highest overall classification accuracy and Kappa coefficient, at 87.96% and 0.7435, respectively. This study has the potential for remote sensing refinement of bamboo forest identification in complex terrain areas by utilizing subregion classification methods combined with optical and radar image features
