Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding
Large Vision-Language Models (LVLMs) have advanced considerably, intertwining
visual recognition and language understanding to generate content that is not
only coherent but also contextually attuned. Despite their success, LVLMs still
suffer from the issue of object hallucinations, where models generate plausible
yet incorrect outputs that include objects that do not exist in the images. To
mitigate this issue, we introduce Visual Contrastive Decoding (VCD), a simple
and training-free method that contrasts output distributions derived from
original and distorted visual inputs. The proposed VCD effectively reduces
over-reliance on statistical biases and unimodal priors, two key causes of
object hallucinations. This adjustment keeps the generated content closely
grounded in the visual input, resulting in contextually accurate outputs. Our
experiments show that VCD, without additional training or the use of external
tools, significantly mitigates the object hallucination issue across
different LVLM families. Beyond mitigating object hallucinations, VCD also
excels on general LVLM benchmarks, highlighting its wide-ranging applicability.
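The contrastive step described above can be sketched in a few lines. The (1 + alpha)/alpha weighting below follows the common contrastive-decoding formulation; the alpha value and the toy logits are illustrative assumptions, not values from the paper.

```python
import numpy as np

def vcd_logits(logits_original, logits_distorted, alpha=1.0):
    """Contrast next-token logits from the original image against logits from
    a distorted (e.g., noise-corrupted) copy of that image. Tokens favored by
    language priors alone score similarly under both inputs, so the
    subtraction down-weights them relative to visually grounded tokens.
    The (1 + alpha)/alpha weighting is the common contrastive-decoding form;
    alpha here is an illustrative choice."""
    return (1 + alpha) * logits_original - alpha * logits_distorted

# Toy two-token vocabulary: token 0 plays a "hallucinated object" that the
# language prior favors, so the distorted input scores it almost as highly.
orig = np.array([2.0, 1.5])
dist = np.array([2.4, 0.5])
adjusted = vcd_logits(orig, dist, alpha=1.0)
# Greedy decoding flips from token 0 to token 1 after the adjustment:
# adjusted == [1.6, 2.5].
```

In a real decoder these logits would come from two forward passes of the same LVLM, one on the clean image and one on the distorted image, applied at every generation step.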
Tell2Design: A Dataset for Language-Guided Floor Plan Generation
We consider the task of generating designs directly from natural language
descriptions, taking floor plan generation as the initial research area.
Language-conditional generative models have recently been very successful at
generating high-quality artistic images. However, designs must satisfy
constraints that artistic images do not, particularly spatial and relational
constraints. We make multiple contributions
to initiate research on this task. First, we introduce a novel dataset,
Tell2Design (T2D), which pairs floor plan designs with natural language
instructions. Second, we propose a
Sequence-to-Sequence model that can serve as a strong baseline for future
research. Third, we benchmark this task with several text-conditional image
generation models. We conclude by conducting human evaluations on the generated
samples and providing an analysis of human performance. We hope our
contributions will propel the research on language-guided design generation
forward.
Comment: Paper published in ACL 2023; Area Chair Award; Best Paper Nomination.
Assessing the impact of extreme droughts on dryland vegetation by multi-satellite solar-induced chlorophyll fluorescence
Satellite-estimated solar-induced chlorophyll fluorescence (SIF) has proven to be an effective indicator for dynamic drought monitoring, yet the capability of SIF to assess the variability of dryland vegetation under water and heat stress remains challenging to establish. This study presents an analysis of the responses of dryland vegetation to the worst extreme drought of the past two decades in Australia, using multi-source spaceborne SIF derived from the Global Ozone Monitoring Experiment-2 (GOME-2) and the TROPOspheric Monitoring Instrument (TROPOMI). Vegetation functioning was substantially constrained by this extreme event, especially in the interior of Australia, where hardly any seasonal growth was detected by either satellite-based observations or tower-based flux measurements. At a 16-day interval, both SIF and the enhanced vegetation index (EVI) captured the reduction at the onset of drought over dryland ecosystems in a timely manner. The results demonstrate that satellite-observed SIF has the potential for characterizing and monitoring the spatiotemporal dynamics of drought over water-limited ecosystems, despite its coarse spatial resolution and high retrieval noise compared with EVI. Furthermore, our study highlights that SIF retrieved from TROPOMI, with its substantially enhanced spatiotemporal resolution, shows promise for accurately tracking drought-induced variation in heterogeneous dryland vegetation.
Response of dryland vegetation under extreme wet events with satellite measures of greenness and fluorescence
Extreme wet events in central Australia triggered large vegetation responses that contributed greatly to large global land carbon sink anomalies. There remain significant uncertainties about the extent to which these events over dryland vegetation can be monitored and assessed with satellite data. In this study, we investigated the responses of the major Australian semiarid biomes to two extreme wet events using multi-satellite observations of (1) solar-induced chlorophyll fluorescence (SIF), as a proxy for photosynthetic activity, and (2) the enhanced vegetation index (EVI), as a measure of canopy chlorophyll or greenness. We related these satellite observations to gross primary productivity (GPP) estimated from eddy covariance tower sites, as a performance benchmark.
The C3-dominated Mulga woodland was the most responsive biome to both wet pulses and exhibited the highest sensitivity to soil moisture. The C4-dominated Hummock grassland was more responsive to the 2011 “big wet” event than to the later 2016–2017 wet pulse. EVI responded swiftly to the extreme wet events and showed markedly amplified seasonal amplitude; however, it lagged SIF during the post-wet period, presumably because chlorophyll degrades more slowly than photosynthetic activity declines. Despite a robust linear SIF-GPP relationship (r² ranging from 0.59 to 0.85), the spatially coarse SIF derived from the Global Ozone Monitoring Experiment-2 (GOME-2) suffered high retrieval noise over the xeric biomes, limiting its capacity to fully capture dryland vegetation dynamics in central Australia. Our study highlights that synchronous satellite observations of greenness and fluorescence can potentially offer an improved understanding of dryland vegetation dynamics and can advance our ability to detect ecosystem alterations under a changing climate.
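The SIF-GPP benchmark described above reduces to an ordinary least-squares fit and its coefficient of determination (r²). A minimal sketch, using made-up SIF and GPP values that are illustrative only and not data from the study:

```python
import numpy as np

# Hypothetical 16-day composites of SIF retrievals and matching tower GPP
# (illustrative values only, not data from the study).
sif = np.array([0.2, 0.5, 0.9, 1.4, 1.8])  # mW m^-2 sr^-1 nm^-1
gpp = np.array([1.1, 2.3, 3.8, 6.0, 7.4])  # g C m^-2 d^-1

# Ordinary least-squares fit GPP = a * SIF + b.
a, b = np.polyfit(sif, gpp, 1)

# Coefficient of determination r^2 for the linear SIF-GPP relationship.
predicted = a * sif + b
ss_res = np.sum((gpp - predicted) ** 2)   # residual sum of squares
ss_tot = np.sum((gpp - gpp.mean()) ** 2)  # total sum of squares
r_squared = 1.0 - ss_res / ss_tot
```

In the study, r² values of 0.59 to 0.85 computed this way quantify how well SIF tracks tower-estimated GPP at each site.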