
    Measurement of Mercury in Flue Gas Based on an Aluminum Matrix Sorbent

    The measurement of total mercury in flue gas based on an economical aluminum matrix sorbent was developed in this paper. A sorbent trap consisting of three tubes was employed to capture Hg from flue gas. Hg trapped on the sorbent was transferred into solution by acid leaching and then detected by cold vapor atomic absorption spectrometry (CVAAS); Hg adsorbed on the sorbent was recovered completely by the leaching process. A recovery of 87.7% of the Hg in flue gas by tubes 1 and 2 was obtained on laboratory coal-combustion and sampling equipment. To evaluate the ability to recover and accurately quantify Hg0 on the sorbent media, an analytical bias test on tube 3 spiked with Hg0 was also performed, yielding an average recovery of 97.1%. Mercury measurements based on this method were conducted at three coal-fired power plants in China. The mercury in coal is distributed into bottom ash, electrostatic precipitator (ESP) ash, wet flue gas desulfurization (WFGD) reactant, and flue gas, and the relative distribution varies with factors such as coal type and plant operating conditions. The mercury mass balances of the three plants were 91.6%, 77.1%, and 118%, respectively. The reliability of this method was verified against the Ontario Hydro (OH) method both in the laboratory and in the field.
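
    To make the mass-balance figures above concrete, the following minimal Python sketch computes a plant's mercury mass balance as the total Hg recovered in the output streams (bottom ash, ESP ash, WFGD reactant, flue gas) divided by the Hg entering with the coal; the function name, units, and numbers are hypothetical illustrations, not values from the paper.

```python
# Hypothetical sketch of the mercury mass-balance check described above.
# All stream names and numbers are illustrative, not values from the paper.

def mercury_mass_balance(hg_in_coal_mg_per_h, hg_out_mg_per_h):
    """Return the mass balance as a percentage: total Hg recovered in the
    output streams divided by Hg entering with the coal."""
    total_out = sum(hg_out_mg_per_h.values())
    return 100.0 * total_out / hg_in_coal_mg_per_h

# Example: a plant where most mercury ends up in the ESP ash and WFGD reactant.
outputs = {
    "bottom_ash": 0.4,       # mg/h, hypothetical
    "esp_ash": 5.1,
    "wfgd_reactant": 2.3,
    "flue_gas": 1.6,
}
print(f"mass balance: {mercury_mass_balance(10.0, outputs):.1f}%")
```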

    Online Video Instance Segmentation via Robust Context Fusion

    Video instance segmentation (VIS) aims at classifying, segmenting, and tracking object instances in video sequences. Recent transformer-based neural networks have demonstrated their powerful capability of modeling spatio-temporal correlations for the VIS task, but, relying on video- or clip-level input, they suffer from high latency and computational cost. We propose a robust context fusion network that tackles VIS in an online fashion, predicting instance segmentation frame by frame using only a few preceding frames. To obtain precise and temporally consistent predictions for each frame efficiently, the key idea is to fuse effective and compact context from reference frames into the target frame. Considering the different effects of reference and target frames on the target prediction, we first summarize contextual features through importance-aware compression. A transformer encoder is adopted to fuse the compressed context. Then, we leverage an order-preserving instance embedding to convey identity-aware information and associate the identities with predicted instance masks. We demonstrate that our robust fusion network achieves the best performance among existing online VIS methods and even surpasses previously published clip-level methods on the Youtube-VIS 2019 and 2021 benchmarks. In addition, visual objects often have acoustic signatures that are naturally synchronized with them in audio-bearing video recordings. By leveraging the flexibility of our context fusion network on multi-modal data, we further investigate the influence of audio on this video dense-prediction task, which has not been discussed in existing works. We build an Audio-Visual Instance Segmentation dataset and demonstrate that acoustic signals in real-world scenarios can benefit the VIS task.
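
    As a rough illustration of the two ideas named above, importance-aware compression and transformer-based fusion, the following PyTorch sketch scores reference-frame tokens, keeps only the top-k, and fuses them with the target-frame tokens in a transformer encoder; the module name, dimensions, and top-k selection are assumptions for illustration, not the authors' implementation.

```python
# Minimal PyTorch sketch of importance-aware context compression followed by
# transformer-based fusion, as described (at a high level) in the abstract.
# Module names, dimensions, and top-k selection are assumptions for illustration.
import torch
import torch.nn as nn


class ContextFusion(nn.Module):
    def __init__(self, dim=256, keep_tokens=64, num_layers=2, num_heads=8):
        super().__init__()
        self.importance = nn.Linear(dim, 1)          # scores each reference token
        self.keep_tokens = keep_tokens               # compressed context size
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=num_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, target_tokens, reference_tokens):
        # target_tokens:    (B, Nt, dim) features of the current frame
        # reference_tokens: (B, Nr, dim) features pooled from preceding frames
        scores = self.importance(reference_tokens).squeeze(-1)      # (B, Nr)
        k = min(self.keep_tokens, reference_tokens.size(1))
        top_idx = scores.topk(k, dim=1).indices                     # (B, k)
        idx = top_idx.unsqueeze(-1).expand(-1, -1, reference_tokens.size(-1))
        compressed = torch.gather(reference_tokens, 1, idx)         # (B, k, dim)
        fused = self.encoder(torch.cat([target_tokens, compressed], dim=1))
        return fused[:, : target_tokens.size(1)]                    # keep target part

# Usage with random features for one target frame and pooled reference frames.
fusion = ContextFusion()
out = fusion(torch.randn(1, 100, 256), torch.randn(1, 200, 256))
print(out.shape)  # torch.Size([1, 100, 256])
```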

    HR-Pro: Point-supervised Temporal Action Localization via Hierarchical Reliability Propagation

    Point-supervised Temporal Action Localization (PSTAL) is an emerging research direction for label-efficient learning. However, current methods mainly focus on optimizing the network at either the snippet level or the instance level, neglecting the inherent reliability of point annotations at both levels. In this paper, we propose a Hierarchical Reliability Propagation (HR-Pro) framework, which consists of two reliability-aware stages: Snippet-level Discrimination Learning and Instance-level Completeness Learning; both stages explore the efficient propagation of high-confidence cues in point annotations. For snippet-level learning, we introduce an online-updated memory to store reliable snippet prototypes for each class. We then employ a Reliability-aware Attention Block to capture both intra-video and inter-video dependencies of snippets, resulting in more discriminative and robust snippet representations. For instance-level learning, we propose a point-based proposal generation approach as a means of connecting snippets and instances, which produces high-confidence proposals for further optimization at the instance level. Through multi-level reliability-aware learning, we obtain more reliable confidence scores and more accurate temporal boundaries for predicted proposals. Our HR-Pro achieves state-of-the-art performance on multiple challenging benchmarks, including an impressive average mAP of 60.3% on THUMOS14. Notably, HR-Pro largely surpasses all previous point-supervised methods and even outperforms several competitive fully supervised methods. Code will be available at https://github.com/pipixin321/HR-Pro. Comment: 12 pages, 8 figures
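
    The sketch below illustrates one component named above, an online-updated memory of reliable snippet prototypes per class, using an exponential-moving-average update restricted to high-confidence snippets; the class name, update rule, and threshold are assumptions, not the released HR-Pro code.

```python
# Hypothetical sketch of an online-updated per-class prototype memory, one of
# the components named in the abstract. The EMA update rule and threshold are
# assumptions for illustration, not the authors' released implementation.
import torch
import torch.nn.functional as F


class PrototypeMemory:
    def __init__(self, num_classes, dim, momentum=0.99):
        self.prototypes = torch.zeros(num_classes, dim)
        self.initialized = torch.zeros(num_classes, dtype=torch.bool)
        self.momentum = momentum

    @torch.no_grad()
    def update(self, features, labels, confidences, threshold=0.9):
        """Fold high-confidence snippet features into their class prototype."""
        for feat, cls, conf in zip(features, labels, confidences):
            if conf < threshold:
                continue  # keep only reliable (high-confidence) snippets
            if not self.initialized[cls]:
                self.prototypes[cls] = feat
                self.initialized[cls] = True
            else:
                self.prototypes[cls] = (self.momentum * self.prototypes[cls]
                                        + (1 - self.momentum) * feat)

    def similarity(self, features):
        """Cosine similarity of snippet features to every class prototype."""
        f = F.normalize(features, dim=-1)
        p = F.normalize(self.prototypes, dim=-1)
        return f @ p.t()

# Usage with random snippet features, class labels, and confidence scores.
memory = PrototypeMemory(num_classes=20, dim=256)
memory.update(torch.randn(8, 256), torch.randint(0, 20, (8,)), torch.rand(8))
print(memory.similarity(torch.randn(4, 256)).shape)  # torch.Size([4, 20])
```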

    Towards Robust Referring Video Object Segmentation with Cyclic Relational Consensus

    Referring Video Object Segmentation (R-VOS) is a challenging task that aims to segment an object in a video based on a linguistic expression. Most existing R-VOS methods share a critical assumption: the referred object must appear in the video. This assumption, which we refer to as semantic consensus, is often violated in real-world scenarios, where the expression may be queried against a video in which the referred object does not appear. In this work, we highlight the need for a robust R-VOS model that can handle such semantic mismatches. Accordingly, we propose an extended task called Robust R-VOS, which accepts unpaired video-text inputs. We tackle this problem by jointly modeling the primary R-VOS problem and its dual, text reconstruction. A structural text-to-text cycle constraint is introduced to discriminate semantic consensus between video-text pairs and to impose it on positive pairs, thereby achieving multi-modal alignment from both positive and negative pairs. Our structural constraint effectively addresses the challenge posed by linguistic diversity, overcoming the limitations of previous methods that relied on point-wise constraints. A new evaluation dataset, R\textsuperscript{2}-Youtube-VOS, is constructed to measure model robustness. Our model achieves state-of-the-art performance on the R-VOS benchmarks Ref-DAVIS17 and Ref-Youtube-VOS, as well as on our R\textsuperscript{2}-Youtube-VOS dataset. Comment: ICCV 2023, https://github.com/lxa9867/R2VO
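
    A minimal sketch of what a structural text-to-text cycle constraint could look like: instead of matching original and reconstructed text embeddings point-wise, it compares their pairwise (relational) similarity structure and applies the alignment only to positive video-text pairs. This formulation and all names in it are assumptions for illustration, not the authors' exact loss.

```python
# Hypothetical sketch of a structural text-to-text cycle constraint: compare the
# relational structure (pairwise cosine similarities) of original and
# reconstructed text embeddings rather than matching them point-wise.
import torch
import torch.nn.functional as F


def relation_matrix(embeddings):
    """Pairwise cosine similarities within a batch of embeddings: (B, D) -> (B, B)."""
    e = F.normalize(embeddings, dim=-1)
    return e @ e.t()


def cycle_consensus_loss(text_emb, reconstructed_text_emb, is_positive_pair):
    """Positive (matched) video-text pairs should preserve the relational
    structure of the original expressions; negative pairs are excluded from
    the alignment term."""
    rel_orig = relation_matrix(text_emb)
    rel_rec = relation_matrix(reconstructed_text_emb)
    diff = (rel_orig - rel_rec).pow(2).mean(dim=1)       # per-sample structural gap
    pos = is_positive_pair.float()
    return (diff * pos).sum() / pos.sum().clamp(min=1.0)

# Usage with random embeddings and a mask marking which pairs are matched.
text = torch.randn(6, 256)
reconstructed = torch.randn(6, 256)
mask = torch.tensor([1, 1, 0, 1, 0, 1], dtype=torch.bool)
print(cycle_consensus_loss(text, reconstructed, mask))
```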

    Leveraging Multimodal Features and Item-level User Feedback for Bundle Construction

    Automatic bundle construction is a crucial prerequisite step for various bundle-aware online services. Previous approaches are mostly designed to model the bundling strategy of existing bundles. However, it is hard to acquire a large-scale, well-curated bundle dataset, especially for platforms that have not offered bundle services before. Even for platforms with mature bundle services, many items are included in few or even zero bundles, which gives rise to sparsity and cold-start challenges in bundle construction models. To tackle these issues, we leverage multimodal features, item-level user feedback signals, and bundle composition information to achieve a comprehensive formulation of bundle construction. Nevertheless, such a formulation poses two new technical challenges: 1) how to learn effective representations by optimally unifying multiple features, and 2) how to address the modality-missing, noise, and sparsity problems induced by incomplete query bundles. To address these technical challenges, we propose a Contrastive Learning-enhanced Hierarchical Encoder method (CLHE). Specifically, we use self-attention modules to combine the multimodal and multi-item features, and then leverage both item- and bundle-level contrastive learning to enhance the representation learning, thereby countering the modality-missing, noise, and sparsity problems. Extensive experiments on four datasets in two application domains demonstrate that our method outperforms a list of state-of-the-art (SOTA) methods. The code and dataset are available at https://github.com/Xiaohao-Liu/CLHE
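
    To illustrate the hierarchical encoding plus contrastive learning described above, the sketch below fuses per-item modality features with item-level self-attention, fuses item representations with bundle-level self-attention, and applies an InfoNCE-style loss between two augmented views; dimensions, pooling, and loss details are assumptions, not the CLHE implementation.

```python
# Hypothetical sketch of a two-level self-attention encoder in the spirit of the
# abstract: modality features are fused per item, item representations are fused
# per bundle, and an InfoNCE-style contrastive loss compares two augmented views.
import torch
import torch.nn as nn
import torch.nn.functional as F


class HierarchicalEncoder(nn.Module):
    def __init__(self, dim=128, heads=4):
        super().__init__()
        item_layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                                batch_first=True)
        bundle_layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                                  batch_first=True)
        self.item_encoder = nn.TransformerEncoder(item_layer, num_layers=1)
        self.bundle_encoder = nn.TransformerEncoder(bundle_layer, num_layers=1)

    def forward(self, modality_feats):
        # modality_feats: (B, items, modalities, dim), e.g. image/text/feedback features
        b, n, m, d = modality_feats.shape
        item_repr = self.item_encoder(modality_feats.reshape(b * n, m, d)).mean(dim=1)
        bundle_repr = self.bundle_encoder(item_repr.reshape(b, n, d)).mean(dim=1)
        return bundle_repr  # (B, dim)


def info_nce(view_a, view_b, temperature=0.2):
    """Contrastive loss between two views of the same bundles."""
    a, bb = F.normalize(view_a, dim=-1), F.normalize(view_b, dim=-1)
    logits = a @ bb.t() / temperature
    targets = torch.arange(a.size(0))
    return F.cross_entropy(logits, targets)

# Usage: 4 bundles, 10 items each, 3 modality features per item.
enc = HierarchicalEncoder()
feats = torch.randn(4, 10, 3, 128)
loss = info_nce(enc(feats), enc(feats + 0.01 * torch.randn_like(feats)))
print(loss.item())
```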