35 research outputs found

    CHAMPAGNE: Learning Real-world Conversation from Large-Scale Web Videos

    Visual information is central to conversation: body gestures and physical behaviour, for example, contribute to meaning that transcends words alone. To date, however, most neural conversational models are limited to just text. We introduce CHAMPAGNE, a generative model of conversations that can account for visual contexts. To train CHAMPAGNE, we collect and release YTD-18M, a large-scale corpus of 18M video-based dialogues. YTD-18M is constructed from web videos: crucial to our data collection pipeline is a pretrained language model that converts error-prone automatic transcripts to a cleaner dialogue format while maintaining meaning. Human evaluation reveals that YTD-18M is more sensible and specific than prior resources (MMDialog, 1M dialogues), while maintaining visual-groundedness. Experiments demonstrate that 1) CHAMPAGNE learns to conduct conversation from YTD-18M; and 2) when fine-tuned, it achieves state-of-the-art results on four vision-language tasks focused on real-world conversations. We release data, models, and code. Comment: ICCV 2023, Project page: https://seungjuhan.me/champagn

    Reading Books is Great, But Not if You Are Driving! Visually Grounded Reasoning about Defeasible Commonsense Norms

    Commonsense norms are defeasible by context: reading books is usually great, but not when driving a car. While contexts can be explicitly described in language, in embodied scenarios, contexts are often provided visually. This type of visually grounded reasoning about defeasible commonsense norms is generally easy for humans, but (as we show) poses a challenge for machines, as it necessitates both visual understanding and reasoning about commonsense norms. We construct a new multimodal benchmark for studying visual-grounded commonsense norms: NORMLENS. NORMLENS consists of 10K human judgments accompanied by free-form explanations covering 2K multimodal situations, and serves as a probe to address two questions: (1) to what extent can models align with average human judgment? and (2) how well can models explain their predicted judgments? We find that state-of-the-art model judgments and explanations are not well-aligned with human annotation. Additionally, we present a new approach to better align models with humans by distilling social commonsense knowledge from large language models. The data and code are released at https://seungjuhan.me/normlens. Comment: Published as a conference paper at EMNLP 2023 (long

    Internal segregation and side chain ordering in hairy-rod polypeptide monolayers at the gas/water interface: An x-ray scattering study

    We report studies of the structure and packing of Langmuir monolayers (LMs) of polypeptide poly(γ-4-(n-hexadecyloxy)benzyl α,L-glutamate) (C16–O–PBLG) on the surface of water. The molecule is a “hairy rod” and consists of side attachments of hexadecyloxy chains (–O–C16) to the rigid rod-like core made up of α-helical poly(γ-benzyl L-glutamate) (PBLG). Measurements include surface pressure (Π) versus area/monomer (A) isotherms, x-ray specular reflectivity (XR), and grazing incidence diffraction (GID). In contrast to the LM of bare PBLG on water, which undergoes a monolayer/bilayer transition with increasing Π, monolayers of C16–O–PBLG remain stable up to the highest densities. On the basis of XR and GID results, the structure of the C16–O–PBLG monolayer is characterized by the following main features. First, hydrophobicity causes the –O–C16 chains to segregate towards the film/gas interface and away from water and the PBLG cores, which sit parallel to and near the water/film interface. Since the attachment position of some of the side chains is at the core/water interface, the segregation forces these chains into the space between neighboring core rods. Compression associated with increasing Π thickens the film but the internally segregated structure is maintained for all Π (i.e., >∼30 dyne/cm). Second, the C16–O–PBLG rods form domains in which the rods are aligned parallel to each other and to the interface. The correlation length for the interhelix positional order of the rods is short and typically comparable to or less than the length of the rods. With increasing Π the spacing d between nearest-neighbor rods decreases linearly with A at high Π, indicating a direct correspondence between the macroscopic compressibility and the microscopic interhelix compressibility. Third, as Π increases past ∼5 dyne/cm, the local packing of tethered –O–C16 chains displays the same herringbone (HB) order that is common for high-density bulk and monolayer phases of alkyl chains. Various features of the observed GID peaks also imply that the HB order of –O–C16 chains is oriented with respect to the helical axes of aligned PBLG cores. We propose that the HB order is established initially by one-dimensionally confined chains between aligned rods at low Π and grows laterally with compression

    A Comparative Study of Co-residence for the Elderly and their Adult Children between Urban and Rural Area: Empirical Evidences from Korea and the US

    Population aging is expected to have a major impact on many aspects of social and economic life in the twenty-first century. The present study investigates whether co-residence of the elderly with their adult children differs between Korea and the US. The mainstream theory of living arrangements for the elderly argues that changes in those arrangements have resulted primarily from an increase in the resources of the aged, which has enabled increasing numbers of the elderly to afford independent living. The opposing view argues that the decline of the multi-generational family occurred mainly because of increasing opportunities for the young and declining parental control over their children. Using census data from the 1990s and the 2000s for both countries, the present study finds that both theories can be applied to the living arrangements of the elderly in Korea and the US. The study concludes by suggesting some policy implications and directions for future studies