Boosting Factual Consistency and High Coverage in Unsupervised Abstractive Summarization
Abstractive summarization has gained attention because of the strong performance of large-scale, pretrained language models. However, models may generate a summary that contains information different from the original document. This phenomenon, known as factual inconsistency, is particularly critical for abstractive methods. This study proposes an unsupervised abstractive method for improving factual consistency and coverage by adopting reinforcement learning. The proposed framework includes (1) a novel design to maintain factual consistency with an automatic question-answering process between the generated summary and original document, and (2) a novel method of ranking keywords based on word dependency, where keywords are used to examine the coverage of the key information preserved in the summary. The experimental results show that the proposed method outperforms the reinforcement learning baseline on evaluations of both factual consistency and coverage.
Project RISE: Recognizing Industrial Smoke Emissions
Industrial smoke emissions pose a significant concern to human health. Prior
works have shown that using Computer Vision (CV) techniques to identify smoke
as visual evidence can influence the attitude of regulators and empower
citizens to pursue environmental justice. However, existing datasets are not of
sufficient quality or quantity to train the robust CV models needed to support
air quality advocacy. We introduce RISE, the first large-scale video dataset
for Recognizing Industrial Smoke Emissions. We adopted a citizen science
approach to collaborate with local community members to annotate whether a
video clip has smoke emissions. Our dataset contains 12,567 clips from 19
distinct views from cameras that monitored three industrial facilities. These
daytime clips span 30 days over two years, including all four seasons. We ran
experiments using deep neural networks to establish a strong performance
baseline and reveal smoke recognition challenges. Our survey study discussed
community feedback, and our data analysis displayed opportunities for
integrating citizen scientists and crowd workers into the application of
Artificial Intelligence for social good.
Comment: Technical report
A Density-Guided Temporal Attention Transformer for Indiscernible Object Counting in Underwater Video
Dense object counting or crowd counting has come a long way thanks to
recent developments in the vision community. However, indiscernible object
counting, which aims to count targets that blend into their surroundings,
remains a challenge. Most publicly available object counting datasets are
image-based. Therefore, we propose a large-scale dataset called YoutubeFish-35,
which contains a total of 35 high-definition video sequences with high
frame rates and more than 150,000 annotated center points across a
selected variety of scenes. For benchmarking purposes, we select three
mainstream methods for dense object counting and carefully evaluate them on the
newly collected dataset. We propose TransVidCount, a new strong baseline that
combines density and regression branches along the temporal domain in a unified
framework and can effectively tackle indiscernible object counting with
state-of-the-art performance on the YoutubeFish-35 dataset.
Comment: Accepted by ICASSP 2024 (IEEE International Conference on Acoustics,
Speech, and Signal Processing)