26 research outputs found
Learning Open-vocabulary Semantic Segmentation Models From Natural Language Supervision
In this paper, we consider the problem of open-vocabulary semantic
segmentation (OVS), which aims to segment objects of arbitrary classes instead
of pre-defined, closed-set categories. The main contributions are as follows:
First, we propose a transformer-based model for OVS, termed OVSegmentor,
which only exploits web-crawled image-text pairs for pre-training without using
any mask annotations. OVSegmentor assembles the image pixels into a set of
learnable group tokens via a slot-attention based binding module, and aligns
the group tokens to the corresponding caption embedding. Second, we propose two
proxy tasks for training, namely masked entity completion and cross-image mask
consistency. The former aims to infer all masked entities in the caption given
the group tokens, which enables the model to learn fine-grained alignment
between visual groups and text entities. The latter enforces consistent mask
predictions between images that contain shared entities, which encourages the
model to learn visual invariance. Third, we construct the CC4M dataset for
pre-training by filtering CC12M with frequently appearing entities, which
significantly improves training efficiency. Fourth, we perform zero-shot
transfer on three benchmark datasets, PASCAL VOC 2012, PASCAL Context, and COCO
Object. Our model achieves superior segmentation results over the
state-of-the-art method while using only 3% of the data (4M vs 134M) for
pre-training. Code and pre-trained models will be released for future research.
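The binding step described in the abstract above (image pixels assembled into group tokens through a slot-attention-style module) can be illustrated with a minimal NumPy sketch. This is not the OVSegmentor implementation: the function name bind_pixels_to_groups, the absence of learned projections, and the fixed number of iterations are all assumptions made for illustration only.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def bind_pixels_to_groups(pixels, groups, iters=3, eps=1e-8):
    """Hypothetical slot-attention-style binding (no learned weights).

    pixels: (N, D) pixel features; groups: (K, D) initial group tokens.
    Returns the updated group tokens (K, D) and the soft assignment (N, K)
    that was used for the final update.
    """
    d = pixels.shape[1]
    for _ in range(iters):
        logits = pixels @ groups.T / np.sqrt(d)        # (N, K) similarities
        attn = softmax(logits, axis=1)                 # groups compete for each pixel
        # Normalize per group so each token becomes a weighted mean of pixels.
        weights = attn / (attn.sum(axis=0, keepdims=True) + eps)
        groups = weights.T @ pixels                    # (K, D) updated tokens
    return groups, attn

rng = np.random.default_rng(0)
pixels = rng.normal(size=(16, 8))       # toy features for 16 pixels
groups = rng.normal(size=(4, 8))        # 4 toy group tokens
new_groups, assign = bind_pixels_to_groups(pixels, groups)
mask = assign.argmax(axis=1)            # hard group index per pixel
```

The softmax is taken over the group axis, so each pixel distributes its attention across groups; taking the argmax of that assignment gives a segmentation-like mask without any mask annotations.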
Query-dominant User Interest Network for Large-Scale Search Ranking
Historical behaviors have shown great effect and potential in various
prediction tasks, including recommendation and information retrieval. The
overall historical behaviors are diverse but noisy, while search behaviors are
always sparse. Most existing approaches in personalized search ranking rely on
the sparse search behaviors to learn representations with a bottleneck, and thus
do not sufficiently exploit the crucial long-term interest. In fact, user
long-term interest is diverse but noisy for instant search, and how to exploit
it well remains an open problem.
To tackle this problem, in this work, we propose a novel model named
Query-dominant user Interest Network (QIN), comprising two cascaded units that
filter the raw user behaviors and reweight the behavior subsequences.
Specifically, we propose a Relevance Search Unit (RSU), which first searches for
a subsequence relevant to the query and then searches for the sub-subsequences
relevant to the target item. These items are then fed into an attention unit
called the Fused Attention Unit (FAU). The FAU calculates attention scores from
the ID field and the attribute field separately, and then adaptively fuses the
item embedding and content embedding based on the user's engagement over the
past period. Extensive experiments and ablation studies on real-world datasets
demonstrate the superiority of our model over state-of-the-art methods. QIN has
now been successfully deployed on Kuaishou search, an online video search
platform, where it obtained a 7.6% improvement in CTR.
Comment: 10 pages
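The two-stage filtering and fused attention described in the abstract above can be sketched roughly as follows. This is a toy illustration, not the QIN model: dot-product relevance, the function names, and the scalar alpha (standing in for the engagement-based adaptive fusion) are all assumptions.

```python
import numpy as np

def top_k_relevant(candidates, anchor, k):
    # Indices of the k rows most similar to `anchor` (dot product),
    # returned in their original sequence order.
    scores = candidates @ anchor
    return np.sort(np.argsort(-scores)[:k])

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def cascade_interest(id_emb, attr_emb, query, target_id, target_attr,
                     k1, k2, alpha):
    """Hypothetical cascade: query filter, target filter, fused attention."""
    # Stage 1: keep the k1 behaviors most relevant to the query.
    s1 = top_k_relevant(attr_emb, query, k1)
    # Stage 2: within those, keep the k2 most relevant to the target item.
    s2 = s1[top_k_relevant(id_emb[s1], target_id, k2)]
    # Attention scores computed separately on the ID and attribute fields.
    w_id = softmax(id_emb[s2] @ target_id)
    w_attr = softmax(attr_emb[s2] @ target_attr)
    # alpha is a placeholder for the adaptive, engagement-based fusion weight.
    fused = alpha * (w_id @ id_emb[s2]) + (1 - alpha) * (w_attr @ attr_emb[s2])
    return s2, fused

rng = np.random.default_rng(1)
T, D = 20, 6                                  # 20 past behaviors, 6-dim embeddings
id_emb = rng.normal(size=(T, D))
attr_emb = rng.normal(size=(T, D))
query, target_id, target_attr = rng.normal(size=(3, D))
kept, interest = cascade_interest(id_emb, attr_emb, query, target_id,
                                  target_attr, k1=8, k2=3, alpha=0.5)
```

The cascade shrinks the behavior sequence before any attention is computed, which is why the second stage only indexes into the survivors of the first.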
DRAC: Diabetic Retinopathy Analysis Challenge with Ultra-Wide Optical Coherence Tomography Angiography Images
Computer-assisted automatic analysis of diabetic retinopathy (DR) is of great
importance in reducing the risks of vision loss and even blindness. Ultra-wide
optical coherence tomography angiography (UW-OCTA) is a non-invasive and safe
imaging modality for DR diagnosis systems, but there is a lack of publicly
available benchmarks for model development and evaluation. To promote further
research and scientific benchmarking for diabetic retinopathy analysis using
UW-OCTA images, we organized a challenge named "DRAC - Diabetic Retinopathy
Analysis Challenge" in conjunction with the 25th International Conference on
Medical Image Computing and Computer Assisted Intervention (MICCAI 2022). The
challenge consists of three tasks: segmentation of DR lesions, image quality
assessment and DR grading. The scientific community responded positively to the
challenge, with 11, 12, and 13 teams from geographically diverse institutes
submitting different solutions in these three tasks, respectively. This paper
presents a summary and analysis of the top-performing solutions and results for
each task of the challenge. The obtained results from top algorithms indicate
the importance of data augmentation, model architecture and ensemble of
networks in improving the performance of deep learning models. These findings
have the potential to enable new developments in diabetic retinopathy analysis.
The challenge remains open for post-challenge registrations and submissions for
benchmarking future methodology developments.
FDVTS's Solution for 2nd COV19D Competition on COVID-19 Detection and Severity Analysis
This paper presents our solution for the 2nd COVID-19 Competition, occurring
in the framework of the AIMIA Workshop in the European Conference on Computer
Vision (ECCV 2022). In our approach, we employ an effective 3D Contrastive
Mixup Classification network for COVID-19 diagnosis on chest CT images, which
is composed of contrastive representation learning and mixup classification.
For the COVID-19 detection challenge, our approach reaches 0.9245 macro F1
score on 484 validation CT scans, which significantly outperforms the baseline
method by 16.5%. In the COVID-19 severity detection challenge, our approach
achieves 0.7186 macro F1 score on 61 validation samples, which also surpasses
the baseline by 8.86%.
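The mixup component named in the abstract above can be illustrated with the standard mixup recipe: two samples and their labels are blended with a weight drawn from a Beta distribution. This is a generic sketch, not the authors' 3D network; the function name mixup_batch, the value alpha=0.4, and the toy array shapes are assumptions.

```python
import numpy as np

def mixup_batch(x, y_onehot, alpha=0.4, rng=None):
    """Standard mixup on a batch: convex combination of shuffled pairs.

    x: (B, ...) inputs; y_onehot: (B, C) one-hot labels.
    alpha is the Beta distribution parameter (an assumed value here).
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)             # mixing weight in [0, 1]
    perm = rng.permutation(len(x))           # partner sample for each item
    x_mix = lam * x + (1 - lam) * x[perm]
    y_mix = lam * y_onehot + (1 - lam) * y_onehot[perm]
    return x_mix, y_mix, lam

rng = np.random.default_rng(2)
scans = rng.normal(size=(4, 8, 16, 16))      # toy stand-in for 4 CT volumes
labels = np.eye(2)[[0, 1, 1, 0]]             # one-hot: non-COVID / COVID
x_mix, y_mix, lam = mixup_batch(scans, labels, rng=rng)
```

Because the labels are mixed with the same weight as the inputs, each mixed label row still sums to one and can be used directly with a soft-label cross-entropy loss.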
The USGT Method for Suspender Tensioning of Self-Anchored Suspension Bridges
Unlike earth-anchored suspension bridges, self-anchored suspension bridges (SASBs) involve a special construction stage, namely, suspender tensioning, in which the tensioning force and sequence are crucial and complicated. Against this background, an example bridge A, a SASB with a steel-concrete composite beam, is introduced in detail. Using MIDAS finite element software, a suspender tensioning scheme is formulated based on a combination of the unstrained state method and the graded tension method (the USGT method), in which a suspender is tensioned according to its unstrained length. By analyzing the bending moment change of the beam and the deflection of the main cable throughout the entire construction process, a "high-to-low" suspender tensioning sequence is proposed that also involves symmetrical tensioning from the main towers to the midspan or the anchor positions. In the optimized construction process, the deviation and stress of the main towers are controlled well, thereby ensuring the safety of the main beam and main towers during construction.
Vertical Features of Volatile Organic Compounds and Their Potential Photochemical Reactivities in Boundary Layer Revealed by In-Situ Observations and Satellite Retrieval
Based on in-situ vertical observations of volatile organic compounds (VOCs) in the lower troposphere (0–1.0 km) in Nanjing, China, during the summer and autumn, we analyzed the VOCs vertical profiles, diurnal variation, and their impact factors in meteorology and photochemistry. The results showed that almost all the concentrations of VOC species decreased with height, similar to the profiles of primary air pollutants, as expected. However, we found the ratios of inactive species (e.g., acetylene) and secondary VOCs (e.g., ketones and aldehydes) in total VOCs (TVOCs) increased with height. Combined with satellite-retrieved data, we found the average HCHO tropospheric column concentrations were 2.0 times higher in the summer than in the autumn. While the average of tropospheric NO2 column concentrations was 3.0 times lower in the summer than in the autumn, the seasonal differences in the ratio of oxygenated VOCs (OVOCs) to NO2 (e.g., HCHO/NO2) shown in TROPOMI satellite-retrieved data were consistent with in-situ observations (e.g., acetone/NO2). On average, during autumn daytime, the mixing layer (ML), stable boundary layer (SBL), and residual layer (RL) had OH loss rates (LOH) of 6.9, 6.3, and 5.5 s⁻¹, respectively. The LOH of alkenes was the largest in the ML, while the LOH of aromatics was the largest in the SBL and RL. At autumn night, the NO3 loss rates (LNO3) in the SBL and RL were 2.0 × 10⁻² and 1.6 × 10⁻² s⁻¹, respectively, and the LNO3 of aromatics was the largest in the SBL and RL. In the daytime of summer, the LOH of VOCs was ~40% lower than that in autumn in all layers, while there was no significant difference in LNO3 at night between the two seasons. This study provides data support and a theoretical basis for VOC composite pollution control in the Nanjing region.
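An OH loss rate such as the LOH reported in the abstract above is conventionally computed as the sum, over VOC species, of each species' OH rate constant multiplied by its concentration. The sketch below shows that arithmetic only. The species names, concentrations, and rate constants are placeholders chosen for illustration, not measured or published values, and the function name oh_loss_rate is hypothetical.

```python
import math

def oh_loss_rate(concentrations, rate_constants):
    """Sum of k_OH,i * [VOC_i] over species, in s^-1.

    concentrations: molecules cm^-3 per species.
    rate_constants: cm^3 molecule^-1 s^-1 per species.
    """
    return sum(rate_constants[name] * conc
               for name, conc in concentrations.items())

# Placeholder inputs (illustrative only, not real data).
concentrations = {"species_a": 2.0e10, "species_b": 5.0e9}
rate_constants = {"species_a": 1.0e-11, "species_b": 6.0e-11}

total_loh = oh_loss_rate(concentrations, rate_constants)
# Per-species contributions show which compound class dominates the total.
shares = {name: rate_constants[name] * conc / total_loh
          for name, conc in concentrations.items()}
```

Computing the per-species shares is what allows a statement such as "the LOH of alkenes was the largest in the ML": the dominant class is the one with the largest contribution to the sum, which need not be the most abundant one.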
Capability of phytoremediation of glyphosate in environment by Vulpia myuros
Glyphosate is an herbicide extensively used worldwide that can remain in the soil. Phytoremediation to decontaminate polluted water or soil requires a plant that can accumulate the target compound. Vulpia myuros is an annual fescue that can be used for heavy metal phytoremediation. Recently, it has been intercropped with tea plants to inhibit the germination and growth of other weeds in tea gardens. To determine whether it can be used as a glyphosate-decontaminating plant in water or soil, in this study, glyphosate degradation behavior was investigated in Vulpia myuros cultivated in a hydroponic system. The results showed that the concentration of glyphosate in the nutrient solution decreased from 43.09 μg mL⁻¹ to 0.45 μg mL⁻¹ in 30 days and that 99% of the glyphosate molecules were absorbed by V. myuros. The content of glyphosate in the roots reached its maximum (224.33 mg kg⁻¹) on day 1 and then decreased. After 3 days, the content of glyphosate in the leaves reached its highest value (215.64 mg kg⁻¹), while it decreased to 156.26 mg kg⁻¹ in the roots. The dissipation dynamics of glyphosate in the whole hydroponic system fit the first-order kinetic model C = 455.76e^(−0.21t), with a half-life of 5.08 days. Over 30 days, 80% of the glyphosate was degraded. The content of the glyphosate metabolite aminomethylphosphonic acid (AMPA) ranged from 0.103 mg kg⁻¹ on day 1 to 0.098 mg kg⁻¹ on day 30, not changing significantly over time. The ratios Croot/solution, Cleaf/solution and Cleaf/root were used to express the absorption, transfer, and distribution of glyphosate in V. myuros. These results indicated that glyphosate entered the root system through free diffusion, which was influenced by both the log Kow and the concentration of glyphosate in the nutrient solution, and that glyphosate was either easily transferred to the leaves through the transpiration stream, accumulated, or degraded. The degradation of glyphosate in V. myuros indicated that it has potential as a remediating plant for environmental restoration.