65 research outputs found

    Learning Open-vocabulary Semantic Segmentation Models From Natural Language Supervision

    In this paper, we consider the problem of open-vocabulary semantic segmentation (OVS), which aims to segment objects of arbitrary classes instead of pre-defined, closed-set categories. The main contributions are as follows. First, we propose a transformer-based model for OVS, termed OVSegmentor, which exploits only web-crawled image-text pairs for pre-training, without using any mask annotations. OVSegmentor assembles the image pixels into a set of learnable group tokens via a slot-attention-based binding module and aligns the group tokens with the corresponding caption embedding. Second, we propose two proxy tasks for training, namely masked entity completion and cross-image mask consistency. The former aims to infer all masked entities in the caption given the group tokens, which enables the model to learn fine-grained alignment between visual groups and text entities. The latter enforces consistent mask predictions between images that contain shared entities, which encourages the model to learn visual invariance. Third, we construct the CC4M dataset for pre-training by filtering CC12M for frequently appearing entities, which significantly improves training efficiency. Fourth, we perform zero-shot transfer on three benchmark datasets: PASCAL VOC 2012, PASCAL Context, and COCO Object. Our model achieves superior segmentation results over the state-of-the-art method while using only 3% of the data (4M vs. 134M) for pre-training. Code and pre-trained models will be released for future research.

    Exploring the impact of the digital economy on green total factor productivity in China: A spatial econometric perspective

    The digital economy is considered a driving force of green economic development. However, only a few studies have examined the relationship between the digital economy and green total factor productivity (GTFP). Using the principal component method and the super-efficiency slacks-based measure model, the digital economy level and GTFP were measured for China’s provinces based on panel data from 2013 to 2019. A spatial econometric model was then used to analyze the effects of the digital economy level on GTFP. Results showed that the overall level of GTFP maintained a steady growth trend, with an average yearly growth of 4.19%. Significant regional differences reflecting the development characteristics of the eastern, central, and western regions were also observed. Most provinces showed either high or low values of both GTFP and digital economic development, revealing spatial heterogeneity across provinces and cities. The spatial Durbin model showed that the digital economy had a significant direct effect (0.1498) and spatial spillover effect (0.3438) on GTFP, the latter being greater than the former, and this conclusion was supported by the robustness test. Technological innovation positively regulates the contribution of a region’s digital economy to its GTFP and negatively regulates the spatial spillover of the digital economy to GTFP in neighboring regions.

    VideoLLM: Modeling Video Sequence with Large Language Models

    With the exponential growth of video data, there is an urgent need for automated technology to analyze and comprehend video content. However, existing video understanding models are often task-specific and lack a comprehensive capability for handling diverse tasks. The success of large language models (LLMs) like GPT has demonstrated their impressive abilities in sequence causal reasoning. Building upon this insight, we propose a novel framework called VideoLLM that leverages the sequence reasoning capabilities of pre-trained LLMs from natural language processing (NLP) for video sequence understanding. VideoLLM incorporates a carefully designed Modality Encoder and Semantic Translator, which convert inputs from various modalities into a unified token sequence. This token sequence is then fed into a decoder-only LLM. Subsequently, with the aid of a simple task head, VideoLLM yields an effective unified framework for different kinds of video understanding tasks. To evaluate the efficacy of VideoLLM, we conduct extensive experiments using multiple LLMs and fine-tuning methods, covering eight tasks sourced from four different datasets. The experimental results demonstrate that the understanding and reasoning capabilities of LLMs can be effectively transferred to video understanding tasks. We release the code at https://github.com/cg1177/VideoLLM. Comment: Technical Report

    Genetic characterization of Measles Viruses in China, 2004

    Wild-type measles viruses were genetically characterized by nucleotide sequencing of the C-terminal region of the N protein gene and phylogenetic analysis of 59 isolates collected from 16 provinces of China in 2004. The results showed that all of the isolates belonged to genotype H1. Of these, 51 isolates belonged to cluster 1 and 8 isolates to cluster 2, and viruses from both clusters were distributed throughout China without a distinct geographic pattern. The nucleotide sequence and predicted amino acid homologies of the 59 H1 strains were 96.5%–100% and 95.7%–100%, respectively. The report showed that the transmission pattern of genotype H1 viruses in China in 2004 was consistent with ongoing endemic transmission of multiple lineages of a single, endemic genotype. Multiple transmission pathways led to multiple lineages within the endemic genotype.

    Molecular epidemiology of measles viruses in China, 1995–2003

    This report describes the genetic characterization of 297 wild-type measles viruses that were isolated in 24 provinces of China between 1995 and 2003. Phylogenetic analysis of the N gene sequences showed that all of the isolates belonged to genotype H1 except 3 isolates, which were genotype A. The nucleotide sequence and predicted amino acid homologies of the 294 genotype H1 strains were 94.7%–100% and 93.3%–100%, respectively. The genotype H1 isolates were divided into 2 clusters, which differed by approximately 2.9% at the nucleotide level. Viruses from both clusters were distributed throughout China with no apparent geographic restriction, and multiple co-circulating lineages were present in many provinces. Even though other measles genotypes have been detected in countries that border China, this report shows that genotype H1 is widely distributed throughout the country and that China has a single, endemic genotype. These important baseline data will help to monitor the progress of measles control in China.

    The use of cessation assistance among smokers from China: Findings from the ITC China Survey

    Background: Stop-smoking medications significantly increase the likelihood of smoking cessation. However, there are no population-based studies of stop-smoking medication use in China, the largest tobacco market in the world. This study examined stop-smoking medication use and its association with quitting behavior among a population-based sample of Chinese smokers. Methods: Face-to-face interviews were conducted with 4,627 smokers from six cities in the ITC China cohort survey. Longitudinal analyses were conducted using Wave 1 (April to August 2006) and Wave 2 (November 2007 to January 2008). Results: Approximately 26% of smokers had attempted to quit between Waves 1 and 2, and 6% were abstinent at 18-month follow-up. Only 5.8% of those attempting to quit reported NRT use, and NRT was associated with lower odds of abstinence at Wave 2 (OR = 0.11; 95% CI = 0.03-0.46). Visiting a doctor/health professional was associated with greater attempts to quit smoking (OR = 1.60 and 2.78; 95% CI = 1.22-2.10 and 2.21-3.49, respectively) and with being abstinent (OR = 1.77 and 1.85; 95% CI = 1.18-2.66 and 1.13-3.04, respectively) at 18-month follow-up, relative to smokers who did not visit a doctor/health professional. Conclusions: The use of formal help for smoking cessation is low in China. There is an urgent need to explore the use and effectiveness of stop-smoking medications in China and in other non-Western markets.

    Differentially expressed genes in a flock of Chinese local-breed chickens infected with a subgroup J avian leukosis virus using suppression subtractive hybridization

    Avian leukosis virus subgroup J (ALV-J) is a new type of virus that mainly induces myeloid leukosis (ML) in chickens. To further elucidate the pathogenesis of ALV-J infection and tumor development, expression profiles from the bone marrow tissue of 15 infected and 18 non-infected birds from a local-breed poultry farm, under naturally infected conditions, were analyzed by suppression subtractive hybridization (SSH). The birds were diagnosed as ML+ (or ML-) by specific ALV-J detection methods, involving serological tests for antigens and antibodies, and RT-PCR to detect viral RNA. A total of 59 partial gene sequences were revealed by differential screening of 496 forward and 384 reverse subtracted cDNA clones. Of these, 22 identified genes, 8 up-regulated and 14 down-regulated, were related to immune functions; they included MHC B-G antigen, translationally controlled tumor protein (TPT1/TPTC), transferrin and ferritin, hemoglobin, and carbonic anhydrase. Four of the down-regulated genes were selected, in view of their predicted roles in infection and immunity, for further analysis by real-time qRT-PCR, using RNA collected from the same birds as those used for SSH. The four genes were expressed at significantly lower levels (p < 0.001) in ALV-J-infected birds than in non-infected ones.

    FDVTS's Solution for 2nd COV19D Competition on COVID-19 Detection and Severity Analysis

    This paper presents our solution for the 2nd COVID-19 Competition, held within the framework of the AIMIA Workshop at the European Conference on Computer Vision (ECCV 2022). In our approach, we employ an effective 3D Contrastive Mixup Classification network for COVID-19 diagnosis on chest CT images, which is composed of contrastive representation learning and mixup classification. For the COVID-19 detection challenge, our approach reaches a macro F1 score of 0.9245 on 484 validation CT scans, which significantly outperforms the baseline method by 16.5%. In the COVID-19 severity detection challenge, our approach achieves a macro F1 score of 0.7186 on 61 validation samples, which also surpasses the baseline by 8.86%.
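The macro F1 score reported in the abstract above is the unweighted mean of the per-class F1 scores. The following is a minimal sketch of that metric in plain Python; the function name `macro_f1` and the toy labels are illustrative assumptions, and this is not the competition's official scoring code.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-class F1 scores.

    Hypothetical helper for illustration; labels may be any hashable,
    sortable values (e.g. 0 = non-COVID, 1 = COVID).
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # class p was predicted wrongly
            fn[t] += 1          # class t was missed
    scores = []
    for c in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy example with two classes and one misclassified scan
    print(macro_f1([0, 0, 1, 1], [0, 1, 1, 1]))
```

Because every class contributes equally, macro F1 penalizes poor performance on a minority class more than plain accuracy does, which matters for small validation sets such as the 61 severity samples mentioned above.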

    Enantio- and diastereoselective synthesis of chromeno[4,3-b]pyrrole derivatives bearing tetrasubstituted chirality centers through carbene catalyzed cascade reactions

    An NHC-catalyzed cascade cycloaddition reaction is developed for quick access to structurally sophisticated tetrahydrochromeno[4,3-b]pyrrole derivatives. A sterically congested tetrasubstituted chirality carbon center is formed during the cyclization process. All the α-, β-, and carbonyl carbons of the enal substrates are functionalized in chemo- and stereoselective fashion. The multicyclic chromeno[4,3-b]pyrrole products are generally afforded in good yields with excellent enantio- and diastereoselectivities. Heavily substituted pyrroline derivatives can be afforded from the chiral products through simple protocols. NRF (Natl Research Foundation, S’pore); ASTAR (Agency for Sci., Tech. and Research, S’pore); MOE (Min. of Education, S’pore). Accepted version

    CREAM: Weakly Supervised Object Localization via Class RE-Activation Mapping

    Weakly Supervised Object Localization (WSOL) aims to localize objects with image-level supervision. Existing works mainly rely on Class Activation Mapping (CAM) derived from a classification model. However, CAM-based methods usually focus on the most discriminative parts of an object (i.e., the incomplete localization problem). In this paper, we empirically show that this problem is associated with the mixup of the activation values between less discriminative foreground regions and the background. To address it, we propose Class RE-Activation Mapping (CREAM), a novel clustering-based approach to boost the activation values of the integral object regions. To this end, we introduce class-specific foreground and background context embeddings as cluster centroids. A CAM-guided momentum preservation strategy is developed to learn the context embeddings during training. At the inference stage, the re-activation mapping is formulated as a parameter estimation problem under a Gaussian Mixture Model, which can be solved by deriving an unsupervised Expectation-Maximization-based soft-clustering algorithm. By simply integrating CREAM into various WSOL approaches, our method significantly improves their performance. CREAM achieves state-of-the-art performance on the CUB, ILSVRC and OpenImages benchmark datasets. Code will be available at https://github.com/Jazzcharles/CREAM. Comment: 10 pages, CVPR 202
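The abstract above describes inference as parameter estimation under a Gaussian Mixture Model solved by Expectation-Maximization soft clustering. The sketch below shows only the generic idea on scalar values with two components (think "background" and "foreground"), initialized from two given centroids. It is an illustrative assumption and not the CREAM implementation: the function name `em_two_gaussians`, the 1D setting, and the toy data are all hypothetical.

```python
import math


def em_two_gaussians(values, mu_init, n_iter=50, min_var=1e-6):
    """Fit a two-component 1D Gaussian mixture with EM.

    Returns (means, variances, weights, responsibilities), where
    responsibilities[i][k] is the soft assignment of values[i] to component k.
    """
    mu = list(mu_init)      # initial centroids, e.g. background and foreground
    var = [1.0, 1.0]        # initial variances (assumed)
    pi = [0.5, 0.5]         # initial mixing weights (assumed)
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each value
        resp = []
        for x in values:
            dens = [
                pi[k]
                * math.exp(-((x - mu[k]) ** 2) / (2 * var[k]))
                / math.sqrt(2 * math.pi * var[k])
                for k in range(2)
            ]
            total = sum(dens) or 1e-300   # guard against underflow to zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate means, variances and weights from soft counts
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue                  # leave an empty component unchanged
            mu[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var[k] = max(
                sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, values)) / nk,
                min_var,
            )
            pi[k] = nk / len(values)
    return mu, var, pi, resp


if __name__ == "__main__":
    activations = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]   # toy data
    means, variances, weights, resp = em_two_gaussians(activations, (0.0, 5.0))
    print(means, weights)
```

The responsibilities play the role of a soft foreground/background map: each value receives a probability for each cluster rather than a hard label.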