
    Integrating Mechanisms of Visual Guidance in Naturalistic Language Production

    Situated language production requires the integration of visual attention and linguistic processing. Previous work has not conclusively disentangled the role of perceptual scene information and structural sentence information in guiding visual attention. In this paper, we present an eye-tracking study demonstrating that three types of guidance, perceptual, conceptual, and structural, interact to control visual attention. In a cued language production experiment, we manipulate perceptual guidance (scene clutter) and conceptual guidance (cue animacy), and measure structural guidance (syntactic complexity of the utterance). Analysis of the time course of language production, before and during speech, reveals that all three forms of guidance affect the complexity of visual responses, quantified in terms of the entropy of attentional landscapes and the turbulence of scan patterns, especially during speech. We find that perceptual and conceptual guidance mediate the distribution of attention in the scene, whereas structural guidance closely relates to scan-pattern complexity. Furthermore, the eye-voice spans of the cued object and its perceptual competitor are similar, with latency mediated by both perceptual and structural guidance. These results rule out a strict interpretation of structural guidance as the single dominant form of visual guidance in situated language production. Rather, the phase of the task and the associated demands of cross-modal cognitive processing determine the mechanisms that guide attention.
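    As an illustration of the entropy measure mentioned above, the sketch below computes the Shannon entropy of a fixation-density map over a coarse spatial grid in Python. The grid size, screen dimensions, and example fixation coordinates are illustrative assumptions, not values or code from the study.

        import numpy as np

        def attention_entropy(fixations, grid_shape=(8, 8), screen=(1024, 768)):
            """Shannon entropy (in bits) of a fixation-density map.

            fixations: iterable of (x, y) gaze coordinates in pixels.
            grid_shape and screen are illustrative defaults, not values from the paper.
            """
            xs, ys = zip(*fixations)
            # Bin fixations into a coarse grid covering the display
            counts, _, _ = np.histogram2d(
                xs, ys,
                bins=grid_shape,
                range=[[0, screen[0]], [0, screen[1]]],
            )
            p = counts.ravel() / counts.sum()      # normalise to a probability distribution
            p = p[p > 0]                           # drop empty cells (0 * log 0 treated as 0)
            return float(-(p * np.log2(p)).sum())  # higher entropy = more dispersed attention

        # Hypothetical usage: attention spread over a few screen locations
        print(attention_entropy([(100, 200), (105, 210), (800, 600), (400, 300)]))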

    Attention and memory play different roles in syntactic choice during sentence production

    Attentional control of referential information is an important contributor to the structure of discourse (Sanford, 2001; Sanford & Garrod, 1981). We investigated how attention and memory interact during visually situated sentence production. We manipulated speakers’ attention to the agent or the patient of a described event by means of a referential or a dot visual cue (Posner, 1980). We also manipulated whether the cue was implicit or explicit by varying its duration (70 ms versus 700 ms). Participants used passive voice more often when their attention was directed to the patient’s location, regardless of cue duration. This effect was stronger when the cue was explicit rather than implicit, especially for passive-voice sentences. Analysis of sentence onset latencies showed a divergent pattern: latencies were shorter (1) when the agent was cued, (2) when the cue was explicit, and (3) when the (explicit) cue was referential. Findings (1) and (2) indicate facilitated sentence planning when the cue supports a canonical (active-voice) sentence frame and when speakers had more time to plan their sentences; finding (3) suggests that sentence planning was sensitive to whether the cue was informative with regard to the cued referent. We propose that differences between production likelihoods and production latencies indicate distinct contributions of attentional focus and memorial activation to sentence planning: while the former partly predicts syntactic choice, the latter facilitates syntactic assembly (i.e., initiating overt sentence generation).

    Talking about Relations: Factors Influencing the Production of Relational Descriptions

    In a production experiment (Experiment 1) and an acceptability rating experiment (Experiment 2), we assessed two factors, spatial position and salience, that may influence the production of relational descriptions (such as the ball between the man and the drawer). In Experiment 1, speakers were asked to refer unambiguously to a target object (a ball). In Experiment 1a, we addressed the role of spatial position, more specifically whether speakers mention the entity positioned leftmost in the scene as the (first) relatum. The results showed a preference to start with the left entity, but only as a trend, which leaves room for other factors that could influence spatial reference. Thus, in the following studies, we varied salience systematically, by making one of the relatum candidates animate (Experiment 1b), and by adding attention-capture cues, first subliminally by priming one relatum candidate with a flash (Experiment 1c), then explicitly by using salient colors for objects (Experiment 1d). Results indicate that spatial position played a dominant role. Entities on the left were mentioned more often as the (first) relatum than those on the right (Experiments 1a, 1b, 1c, 1d). Animacy affected reference production in one out of three studies (Experiment 1d). When salience was manipulated by priming visual attention or by using salient colors, there were no significant effects (Experiments 1c, 1d). In the acceptability rating study (Experiment 2), participants expressed their preference for specific relata by ranking descriptions on the basis of how well they thought the descriptions fit the scene. Results show that participants most preferred the description with an animate entity as the first-mentioned relatum. The relevance of these results for models of reference production is discussed.

    On visually-grounded reference production: testing the effects of perceptual grouping and 2D/3D presentation mode

    When referring to a target object in a visual scene, speakers are assumed to consider certain distractor objects more relevant than others. The current research predicts that the way in which speakers arrive at a set of relevant distractors depends on how they perceive the distance between the objects in the scene. It reports the results of two language production experiments in which participants referred to target objects in photo-realistic visual scenes. Experiment 1 manipulated three factors expected to affect perceived distractor distance: two manipulations of perceptual grouping (region of space and type similarity), and one of presentation mode (2D vs. 3D). In line with most previous research on visually-grounded reference production, an offline measure of visual attention was taken here: the occurrence of overspecification with color. The results showed effects of region of space and type similarity on overspecification, suggesting that distractors perceived as being in the same group as the target are more often considered relevant than distractors in a different group. Experiment 2 verified this suggestion with a direct measure of visual attention, eye tracking, and added a third manipulation of grouping: color similarity. For region of space in particular, the eye-movement data indeed showed patterns in the expected direction: distractors within the same region as the target were fixated more often, and for longer, than distractors in a different region. Color similarity was found to affect overspecification with color, but not gaze duration or the number of distractor fixations. The expected effects of presentation mode (2D vs. 3D) were also not convincingly borne out by the data. Taken together, these results provide direct evidence for the close link between scene perception and language production, and indicate that perceptual grouping principles can guide speakers in determining the distractor set during reference production.