65 research outputs found
Learning Open-vocabulary Semantic Segmentation Models From Natural Language Supervision
In this paper, we consider the problem of open-vocabulary semantic
segmentation (OVS), which aims to segment objects of arbitrary classes instead
of pre-defined, closed-set categories. The main contributions are as follows:
First, we propose a transformer-based model for OVS, termed OVSegmentor,
which only exploits web-crawled image-text pairs for pre-training without using
any mask annotations. OVSegmentor assembles the image pixels into a set of
learnable group tokens via a slot-attention based binding module, and aligns
the group tokens to the corresponding caption embedding. Second, we propose two
proxy tasks for training, namely masked entity completion and cross-image mask
consistency. The former aims to infer all masked entities in the caption given
the group tokens, which enables the model to learn fine-grained alignment
between visual groups and text entities. The latter enforces consistent mask
predictions between images that contain shared entities, which encourages the
model to learn visual invariance. Third, we construct the CC4M dataset for
pre-training by filtering CC12M with frequently appearing entities, which
significantly improves training efficiency. Fourth, we perform zero-shot
transfer on three benchmark datasets, PASCAL VOC 2012, PASCAL Context, and COCO
Object. Our model achieves superior segmentation results over the
state-of-the-art method by using only 3% of the data (4M vs 134M) for pre-training.
Code and pre-trained models will be released for future research.
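The grouping step described in this abstract (pixels assembled into group tokens by a slot-attention based binding module) can be illustrated with a minimal NumPy sketch. This is not the OVSegmentor implementation: it is a simplified slot-attention-style loop that assumes no learned projections, no GRU or MLP update, and random toy features. The names `group_pixels`, `pixels`, and `tokens0` are hypothetical.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def group_pixels(pixel_feats, group_tokens, n_iters=3, eps=1e-8):
    """Simplified slot-attention-style grouping (illustrative assumption).

    pixel_feats:  (N, D) array of per-pixel (or per-patch) features.
    group_tokens: (K, D) array of initial group tokens.
    Returns the updated tokens (K, D) and the soft assignment (N, K)
    computed in the last iteration.
    """
    d = pixel_feats.shape[1]
    tokens = group_tokens.copy()
    for _ in range(n_iters):
        logits = pixel_feats @ tokens.T / np.sqrt(d)      # (N, K)
        # Softmax over groups: tokens compete for each pixel.
        attn = softmax(logits, axis=1)
        # Normalise per group so each token is a weighted mean of pixels.
        weights = attn / (attn.sum(axis=0, keepdims=True) + eps)
        tokens = weights.T @ pixel_feats                  # (K, D)
    return tokens, attn

rng = np.random.default_rng(0)
pixels = rng.normal(size=(16, 8))    # 16 toy "pixels", 8-dim features
tokens0 = rng.normal(size=(4, 8))    # 4 toy group tokens
tokens, attn = group_pixels(pixels, tokens0)
mask = attn.argmax(axis=1)           # hard group index per pixel
```

In the abstract's setting, the resulting group tokens would then be aligned with caption embeddings; that alignment step is not shown here.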
Exploring the impact of the digital economy on green total factor productivity in China: A spatial econometric perspective
The digital economy is considered a driving force of green economic development. However, only a few studies have examined the relationship between the digital economy and green total factor productivity (GTFP). Using the principal component method and a super-efficiency slacks-based measure (SBM) model, the digital economy level and GTFP were measured for China’s provinces based on panel data from 2013 to 2019. A spatial econometric model was then used to analyze the effects of the digital economy level on GTFP. Results showed that the overall level of GTFP maintained a steady growth trend, with an average yearly growth of 4.19%. Significant regional differences reflecting the development characteristics of eastern, central, and western regions were also observed. Most provinces showed either high or low values of both GTFP and digital economic development, thereby revealing spatial heterogeneity across provinces and cities. The spatial Durbin model showed that the digital economy had a significant direct effect (0.1498) and spatial spillover effect (0.3438) on GTFP, the latter being greater than the former, with this conclusion supported by the robustness test. Technological innovation positively moderates the contribution of a region’s digital economy to its GTFP and negatively moderates the spatial spillover of the digital economy to GTFP in neighboring regions.
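As a small worked illustration of the reported figures: in a spatial Durbin model the total effect is conventionally the sum of the direct and the indirect (spillover) effect. The sketch below assumes the two numbers quoted in the abstract are effect estimates in that sense; the function name `decompose_effects` is hypothetical.

```python
def decompose_effects(direct, indirect):
    """Combine a direct and an indirect (spillover) effect estimate."""
    total = direct + indirect
    return {
        "direct": direct,
        "indirect": indirect,
        "total": total,
        # Fraction of the total effect attributable to spillover.
        "spillover_share": indirect / total,
    }

# Values as quoted in the abstract above.
effects = decompose_effects(direct=0.1498, indirect=0.3438)
print(effects)
```

Under that assumption, the spillover component accounts for more than half of the total effect, which matches the abstract's statement that the spillover effect exceeds the direct effect.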
VideoLLM: Modeling Video Sequence with Large Language Models
With the exponential growth of video data, there is an urgent need for
automated technology to analyze and comprehend video content. However, existing
video understanding models are often task-specific and lack a comprehensive
capability of handling diverse tasks. The success of large language models
(LLMs) like GPT has demonstrated their impressive abilities in sequence causal
reasoning. Building upon this insight, we propose a novel framework called
VideoLLM that leverages the sequence reasoning capabilities of pre-trained LLMs
from natural language processing (NLP) for video sequence understanding.
VideoLLM incorporates a carefully designed Modality Encoder and Semantic
Translator, which convert inputs from various modalities into a unified token
sequence. This token sequence is then fed into a decoder-only LLM.
Subsequently, with the aid of a simple task head, our VideoLLM yields an
effective unified framework for different kinds of video understanding tasks.
To evaluate the efficacy of VideoLLM, we conduct extensive experiments using
multiple LLMs and fine-tuning methods. We evaluate our VideoLLM on eight tasks
sourced from four different datasets. The experimental results demonstrate that
the understanding and reasoning capabilities of LLMs can be effectively
transferred to video understanding tasks. We release the code at
https://github.com/cg1177/VideoLLM. Comment: Technical Report
Genetic characterization of Measles Viruses in China, 2004
Wild-type measles viruses were genetically characterized by nucleotide sequencing of the C-terminal region of the N protein gene and phylogenetic analysis of 59 isolates from 16 provinces of China in 2004. The results showed that all of the isolates belonged to genotype H1: 51 isolates belonged to cluster 1 and 8 isolates to cluster 2, and viruses from both clusters were distributed throughout China without a distinct geographic pattern. The nucleotide sequence and predicted amino acid homologies of the 59 H1 strains were 96.5%–100% and 95.7%–100%, respectively. The report showed that the transmission pattern of genotype H1 viruses in China in 2004 was consistent with ongoing endemic transmission of multiple lineages of a single, endemic genotype. Multiple transmission pathways led to multiple lineages within the endemic genotype.
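Homology ranges such as those quoted above are typically derived from pairwise comparisons of aligned sequences. The following minimal sketch computes pairwise percent identity and its range over a set of sequences; it assumes the sequences are already aligned to equal length, treats gap characters like any other symbol, and uses made-up toy sequences rather than data from the study. The function names are hypothetical.

```python
from itertools import combinations

def percent_identity(seq_a, seq_b):
    """Percent of positions at which two pre-aligned sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)

def identity_range(seqs):
    """Minimum and maximum pairwise percent identity over all pairs."""
    values = [percent_identity(a, b) for a, b in combinations(seqs, 2)]
    return min(values), max(values)

# Toy aligned sequences (illustrative only, not measles N gene data).
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
low, high = identity_range(toy)
print(f"{low:.1f}%-{high:.1f}%")
```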
Molecular epidemiology of measles viruses in China, 1995–2003
This report describes the genetic characterization of 297 wild-type measles viruses that were isolated in 24 provinces of China between 1995 and 2003. Phylogenetic analysis of the N gene sequences showed that all of the isolates belonged to genotype H1 except 3 isolates, which were genotype A. The nucleotide sequence and predicted amino acid homologies of the 294 genotype H1 strains were 94.7%–100% and 93.3%–100%, respectively. The genotype H1 isolates were divided into 2 clusters, which differed by approximately 2.9% at the nucleotide level. Viruses from both clusters were distributed throughout China with no apparent geographic restriction, and multiple co-circulating lineages were present in many provinces. Even though other measles genotypes have been detected in countries that border China, this report shows that genotype H1 is widely distributed throughout the country and that China has a single, endemic genotype. These important baseline data will help to monitor the progress of measles control in China.
The use of cessation assistance among smokers from China: Findings from the ITC China Survey
Background: Stop-smoking medications significantly increase the likelihood of smoking cessation. However, there are no population-based studies of stop-smoking medication use in China, the largest tobacco market in the world. This study examined stop-smoking medication use and its association with quitting behavior among a population-based sample of Chinese smokers. Methods: Face-to-face interviews were conducted with 4,627 smokers from six cities in the ITC China cohort survey. Longitudinal analyses were conducted using Wave 1 (April to August, 2006) and Wave 2 (November 2007 to January 2008). Results: Approximately 26% of smokers had attempted to quit between Waves 1 and 2, and 6% were abstinent at 18-month follow-up. Only 5.8% of those attempting to quit reported use of nicotine replacement therapy (NRT), and NRT was associated with lower odds of abstinence at Wave 2 (OR = 0.11; 95%CI = 0.03-0.46). Visiting a doctor/health professional was associated with greater attempts to quit smoking (OR = 1.60 and 2.78; 95%CI = 1.22-2.10 and 2.21-3.49 respectively) and being abstinent (OR = 1.77 and 1.85; 95%CI = 1.18-2.66 and 1.13-3.04 respectively) at 18-month follow-up relative to the smokers who did not visit a doctor/health professional. Conclusions: The use of formal help for smoking cessation is low in China. There is an urgent need to explore the use and effectiveness of stop-smoking medications in China and in other non-Western markets.
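For readers unfamiliar with the "OR; 95%CI" notation used in the results, the sketch below shows how an unadjusted odds ratio and a Wald 95% confidence interval are computed from a 2x2 table. It is only an illustration: the study's estimates presumably come from longitudinal regression models rather than a raw table, and the counts used here are hypothetical, not survey data.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Hypothetical counts, chosen only to exercise the function.
or_, lo, hi = odds_ratio_ci(a=10, b=20, c=30, d=40)
print(f"OR = {or_:.2f}; 95%CI = {lo:.2f}-{hi:.2f}")
```

An interval that lies entirely below 1, as for NRT in the abstract, indicates lower odds of the outcome in the exposed group; an interval entirely above 1 indicates higher odds.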
Differentially expressed genes in a flock of Chinese local-breed chickens infected with a subgroup J avian leukosis virus using suppression subtractive hybridization
Avian leukosis virus subgroup J (ALV-J) is a new type of virus that mainly induces myeloid leukosis (ML) in chickens. To further elucidate the pathogenesis of ALV-J infection and tumor development, expression profiles from the bone marrow tissue of 15 infected and 18 non-infected birds from a local-breed poultry farm under naturally infected conditions were analyzed by suppression subtractive hybridization (SSH). The birds were diagnosed as ML+ (or ML-) by specific ALV-J detection methods, involving serological tests for antigens and antibodies, and RT-PCR to detect viral RNA. A total of 59 partial gene sequences were revealed by differential screening of 496 forward and 384 reverse subtracted cDNA clones. Of these, 22 identified genes, including 8 up-regulated and 14 down-regulated, were related to immune functions; they included MHC B-G antigen, translationally-controlled tumor protein (TPT1/TPTC), transferrin and ferritin, hemoglobin, and carbonic anhydrase. In view of their predicted roles in infection and immunity, four of the down-regulated genes were selected for further analysis by real-time qRT-PCR, using RNA collected from the same birds as those used for SSH. The four genes were expressed at significantly lower levels (p < 0.001) in ALV-J infected birds than in non-infected ones.
FDVTS's Solution for 2nd COV19D Competition on COVID-19 Detection and Severity Analysis
This paper presents our solution for the 2nd COVID-19 Competition, occurring
in the framework of the AIMIA Workshop in the European Conference on Computer
Vision (ECCV 2022). In our approach, we employ an effective 3D Contrastive
Mixup Classification network for COVID-19 diagnosis on chest CT images, which
is composed of contrastive representation learning and mixup classification.
For the COVID-19 detection challenge, our approach reaches 0.9245 macro F1
score on 484 validation CT scans, which significantly outperforms the baseline
method by 16.5%. In the COVID-19 severity detection challenge, our approach
achieves 0.7186 macro F1 score on 61 validation samples, which also surpasses
the baseline by 8.86%.
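Both challenges above are scored with the macro F1 metric, which averages per-class F1 scores so that each class counts equally regardless of its size. The sketch below is a plain-Python illustration of that metric on toy labels; it is not the competition's evaluation script, and the function name `macro_f1` is hypothetical.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 if the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Toy example: class 0 has F1 = 2/3, class 1 has F1 = 0.8.
score = macro_f1([0, 0, 1, 1], [0, 1, 1, 1])
print(round(score, 4))
```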
Enantio- and diastereoselective synthesis of chromeno[4,3-b]pyrrole derivatives bearing tetrasubstituted chirality centers through carbene catalyzed cascade reactions
An NHC-catalyzed cascade cycloaddition reaction is developed for quick access to structurally sophisticated tetrahydrochromeno[4,3-b]pyrrole derivatives. A sterically congested tetrasubstituted chirality carbon center is formed during the cyclization process. All the α-, β-, and carbonyl carbons of the enal substrates are functionalized in chemo- and stereoselective fashion. The multicyclic chromeno[4,3-b]pyrrole products are generally afforded in good yields with excellent enantio- and diastereoselectivities. Heavily substituted pyrroline derivatives can be afforded from the chiral products through simple protocols. NRF (Natl Research Foundation, S’pore); ASTAR (Agency for Sci., Tech. and Research, S’pore); MOE (Min. of Education, S’pore). Accepted version
CREAM: Weakly Supervised Object Localization via Class RE-Activation Mapping
Weakly Supervised Object Localization (WSOL) aims to localize objects with
image-level supervision. Existing works mainly rely on Class Activation Mapping
(CAM) derived from a classification model. However, CAM-based methods usually
focus on the most discriminative parts of an object (i.e., incomplete
localization problem). In this paper, we empirically prove that this problem is
associated with the mixup of the activation values between less discriminative
foreground regions and the background. To address it, we propose Class
RE-Activation Mapping (CREAM), a novel clustering-based approach to boost the
activation values of the integral object regions. To this end, we introduce
class-specific foreground and background context embeddings as cluster
centroids. A CAM-guided momentum preservation strategy is developed to learn
the context embeddings during training. At the inference stage, the
re-activation mapping is formulated as a parameter estimation problem under
Gaussian Mixture Model, which can be solved by deriving an unsupervised
Expectation-Maximization based soft-clustering algorithm. By simply integrating
CREAM into various WSOL approaches, our method significantly improves their
performance. CREAM achieves state-of-the-art performance on the CUB, ILSVRC and
OpenImages benchmark datasets. Code will be available at
https://github.com/Jazzcharles/CREAM. Comment: 10 pages, CVPR 202
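The inference step described in this abstract, soft clustering under a Gaussian Mixture Model solved by Expectation-Maximization, can be illustrated with a toy example. The sketch below is not the CREAM algorithm: it assumes scalar values and exactly two components (standing in for "background" and "foreground"), whereas the paper works with class-specific context embeddings as centroids. All names and the synthetic data are hypothetical.

```python
import numpy as np

def em_two_component(x, mu_init, n_iters=20, eps=1e-6):
    """EM for a 1-D, two-component Gaussian mixture (toy illustration).

    x:       (N,) array of scalar values.
    mu_init: initial means for the two components.
    Returns the soft assignments (N, 2) from the last E-step and the
    means after the last M-step.
    """
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu_init, dtype=float).copy()
    var = np.full(2, x.var() + eps)
    pi = np.full(2, 0.5)
    for _ in range(n_iters):
        # E-step: posterior responsibility of each component per point.
        log_p = -0.5 * ((x[:, None] - mu) ** 2 / var + np.log(2 * np.pi * var))
        log_p = log_p + np.log(pi)
        log_p -= log_p.max(axis=1, keepdims=True)   # numerical stability
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances.
        nk = resp.sum(axis=0) + eps
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk + eps
        pi = nk / nk.sum()
    return resp, mu

rng = np.random.default_rng(0)
# Synthetic "activation values": a low cluster and a high cluster.
values = np.concatenate([rng.normal(0.2, 0.05, 50), rng.normal(0.8, 0.05, 50)])
resp, mu = em_two_component(values, mu_init=[0.0, 1.0])
foreground_prob = resp[:, 1]   # soft membership in the high-valued component
```

The soft memberships play the role of a re-activation map in this toy setting: values near the high-valued component receive a membership close to 1 instead of their raw activation.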