8 research outputs found

    Evidence of Human-Like Visual-Linguistic Integration in Multimodal Large Language Models During Predictive Language Processing

    The advanced language processing abilities of large language models (LLMs) have stimulated debate over their capacity to replicate human-like cognitive processes. One differentiating factor between language processing in LLMs and humans is that, for humans, language input is often grounded in several perceptual modalities, whereas most LLMs process solely text-based information. Multimodal grounding allows humans to integrate, e.g., visual context with linguistic information and thereby place constraints on the space of upcoming words, reducing cognitive load and improving comprehension. Recent multimodal LLMs (mLLMs) combine a visual-linguistic embedding space with a transformer-type attention mechanism for next-word prediction. Here we ask whether predictive language processing based on multimodal input in mLLMs aligns with humans. Two hundred participants watched short audio-visual clips and estimated the predictability of an upcoming verb or noun. The same clips were processed by the mLLM CLIP, with predictability scores based on comparing image and text feature vectors. Eye-tracking was used to estimate what visual features participants attended to, and CLIP's visual attention weights were recorded. We find that alignment of predictability scores was driven by the multimodality of CLIP (no alignment for a unimodal state-of-the-art LLM) and by the attention mechanism (no alignment when attention weights were perturbed or when the same input was fed to a multimodal model without attention). We further find a significant spatial overlap between CLIP's visual attention weights and human eye-tracking data. Results suggest that comparable processes of integrating multimodal information, guided by attention to relevant visual features, support predictive language processing in mLLMs and humans. Comment: 13 pages, 4 figures, submitted to journal
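
    The abstract says predictability scores were based on comparing image and text feature vectors. A minimal sketch of one common way to make such a comparison, cosine similarity, is given here. The function names, the toy three-dimensional vectors, and the candidate words are all hypothetical, and the abstract does not confirm that the study used this exact measure.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def predictability_scores(image_vec, candidate_text_vecs):
    """Score each candidate word by its similarity to the image features.

    Assumption: a higher similarity is read as a more predictable word.
    """
    return {
        word: cosine_similarity(image_vec, vec)
        for word, vec in candidate_text_vecs.items()
    }


# Toy stand-ins for feature vectors; a real model would produce
# high-dimensional embeddings from an image encoder and a text encoder.
image_features = [1.0, 0.0, 1.0]
candidates = {
    "kick": [1.0, 0.0, 1.0],  # hypothetical word aligned with the image
    "eat": [0.0, 1.0, 0.0],   # hypothetical word orthogonal to the image
}
scores = predictability_scores(image_features, candidates)
```

    In this toy setup the candidate whose vector points in the same direction as the image features receives the highest score, and an orthogonal candidate scores near zero.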

    Language development beyond the here‐and‐now: Iconicity and displacement in child‐directed communication

    Most language use is displaced, referring to past, future, or hypothetical events, posing the challenge of how children learn what words refer to when the referent is not physically available. One possibility is that iconic cues that imagistically evoke properties of absent referents support learning when referents are displaced. In an audio‐visual corpus of caregiver–child dyads, English‐speaking caregivers interacted with their children (N = 71, 24–58 months) in contexts in which the objects talked about were either familiar or unfamiliar to the child, and either physically present or displaced. The analysis of the range of vocal, manual, and looking behaviors caregivers produced suggests that caregivers used iconic cues especially in displaced contexts and for unfamiliar objects, using other cues when objects were present.

    Intensification of Antiretroviral Therapy with a CCR5 Antagonist in Patients with Chronic HIV-1 Infection: Effect on T Cells Latently Infected

    Objective: The primary objective was to assess the effect of maraviroc (MVC) intensification on latently infected CD4+ T cells in chronically HIV-1-infected patients receiving antiretroviral therapy. Methods: We performed an open-label pilot phase II clinical trial involving chronically HIV-1-infected patients receiving stable antiretroviral therapy whose regimen was intensified with 48 weeks of MVC therapy. We analyzed the latent reservoir, the residual viremia and episomal 2LTR DNA to examine the relationship between these measures and the HIV-1 latent reservoir, immune activation, lymphocyte subsets (including effector and central memory T cells), and markers associated with bacterial translocation. Results: Overall, a nonsignificant reduction in the size of the latent reservoir was found (p = 0.068). A mean reduction of 1.82 IUPM was observed after 48 weeks of intensification in the 4 patients with a detectable latent reservoir at baseline. No effect on plasma residual viremia was observed. Unexpectedly, all the patients had detectable 2LTR DNA circles at week 24, while none of them showed those circles at the end of the study. No changes were detected in CD4+ or CD8+ counts, although a significant decrease was found in the proportion of HLA-DR+/CD38+ CD4+ and CD8+ T cells. LPS and sCD14 levels increased. Conclusions: Intensification with MVC was associated with a trend toward a decrease in the size of the latent HIV-1 reservoir in memory T cells. No impact on residual viremia was detected. Additional studies with larger samples are needed to confirm the results.

    Language development beyond the here-and-now: iconicity and displacement in child-directed communication

    Most language use is displaced, referring to past, future or hypothetical events. Displacement poses an important challenge for language learning. How can children learn what words refer to when the referent is not physically available? We suggest that caregivers provide children with iconic vocal and gestural cues that imagistically evoke properties of absent referents to support displaced learning. We collected an audio-visual corpus of English-speaking caregiver-child interactions (N = 71, 24-58 months, 37 female) and annotated the range of vocal and manual behaviours caregivers produced. We found that caregivers used iconic cues especially in displaced contexts, using other cues when objects were present. Thus, we map caregivers’ non-linguistic behaviours, showing that they provide iconic cues to support displaced language learning and processing

    Multimodal cues in child-directed communication

    Project looking at multimodal cues (onomatopoeia, gesture, points, object manipulation, eye gaze) in caregivers' communication to their English-speaking toddlers