1,894 research outputs found

    Boosting Factual Consistency and High Coverage in Unsupervised Abstractive Summarization

    Abstractive summarization has gained attention because of the strong performance of large-scale pretrained language models. However, these models may generate a summary that contains information different from the original document. This phenomenon, known as factual inconsistency, is particularly critical for abstractive methods. This study proposes an unsupervised abstractive method that uses reinforcement learning to improve factual consistency and coverage. The proposed framework includes (1) a novel design that maintains factual consistency through an automatic question-answering process between the generated summary and the original document, and (2) a novel method of ranking keywords based on word dependency, where the keywords are used to examine how much of the key information is preserved in the summary. The experimental results show that the proposed method outperforms the reinforcement learning baseline on evaluations of both factual consistency and coverage.

    Project RISE: Recognizing Industrial Smoke Emissions

    Industrial smoke emissions pose a significant concern to human health. Prior works have shown that using Computer Vision (CV) techniques to identify smoke as visual evidence can influence the attitude of regulators and empower citizens to pursue environmental justice. However, existing datasets have neither the quality nor the quantity needed to train the robust CV models that air quality advocacy requires. We introduce RISE, the first large-scale video dataset for Recognizing Industrial Smoke Emissions. We adopted a citizen science approach, collaborating with local community members to annotate whether a video clip contains smoke emissions. Our dataset contains 12,567 clips from 19 distinct views from cameras that monitored three industrial facilities. These daytime clips span 30 days over two years, including all four seasons. We ran experiments using deep neural networks to establish a strong performance baseline and reveal smoke recognition challenges. Our survey study discussed community feedback, and our data analysis displayed opportunities for integrating citizen scientists and crowd workers into the application of Artificial Intelligence for social good. Comment: Technical report

    A Density-Guided Temporal Attention Transformer for Indiscernible Object Counting in Underwater Video

    Dense object counting, or crowd counting, has come a long way thanks to recent developments in the vision community. However, indiscernible object counting, which aims to count targets that blend into their surroundings, has remained a challenge. Most of the currently publicly available object counting datasets are image-based. Therefore, we propose a large-scale dataset called YoutubeFish-35, which contains a total of 35 sequences of high-definition, high-frame-rate videos and more than 150,000 annotated center points across a selected variety of scenes. For benchmarking purposes, we select three mainstream methods for dense object counting and carefully evaluate them on the newly collected dataset. We propose TransVidCount, a new strong baseline that combines density and regression branches along the temporal domain in a unified framework and can effectively tackle indiscernible object counting with state-of-the-art performance on the YoutubeFish-35 dataset. Comment: Accepted by ICASSP 2024 (IEEE International Conference on Acoustics, Speech, and Signal Processing)