29,073 research outputs found

    GLIMPSED: Improving natural language processing with gaze data


    Return-sweep saccades during reading in adults and children

    During reading, eye movement patterns differ between children and adults. Children make more fixations, which are longer in duration, and shorter saccades. Return-sweeps are saccadic eye movements that move a reader’s fixation to a new line of text. Return-sweeps move fixation further than intra-line saccades and often undershoot their target, necessitating a corrective saccade to bring fixation closer to the start of the line. There have been few empirical investigations of return-sweep saccades in adults, and even fewer in children. In the present study, we examined the return-sweeps of 47 adults and 48 children who read identical multiline texts. We found that children launch their return-sweeps closer to the end of the line and target a position closer to the left margin. Therefore, children fixate more extreme positions on the screen when reading for comprehension. Furthermore, children required a corrective saccade following a return-sweep more often than adults. Analysis of the duration of the fixation preceding the corrective saccade indicated that children are as efficient as adults at responding to retinal feedback following a saccade. Rather than considering differences in adults’ and children’s return-sweep behaviour to be an artefact of oculomotor control, we believe that these differences reflect adults’ ability to utilise parafoveal processing to encode text at extreme positions.

    Expectation incongruence in music and code reading: The eye-tracking approach

    Humans create and use different kinds of languages in order to store, view, and convey various types of information. Natural languages, such as English, allow people to communicate with each other in everyday and professional contexts. In contrast, symbolic languages, such as Western music notation or programming languages, enable people to make use of technical devices like musical instruments or computers. Research on the eye movements of expert musicians and programmers has revealed certain similarities in how these symbolic languages are read: unlike text reading, experts read music and code with more regressive eye movements. The current dissertation is the first project to explore music and code reading together. It focuses on one aspect of music and code reading that is equally important for both symbolic languages—the skill of working with unexpected information in a notation. In music and programming, this skill is especially required in tasks such as handling surprising melodic patterns and debugging, respectively. This dissertation had three main aims: (1) theoretical exploration of similarities and differences in creating expectations that help with pattern recognition in music and programming; (2) development (in music reading) and creation (in code reading) of research methodologies that can be applied in research on incongruent patterns; and (3) exploration of the cognitive processing of incongruent notation in experienced music readers and one experienced code reader. Surprising elements in familiar patterns hamper their recognition. Article I presents a theoretical exploration of the similarities and differences in building expectations that allow pattern recognition in music and programming. The proposed prediction model, which serves as the solution to Aim 1, includes three components that are common to both music and programming: (1) knowledge of language systems, (2) knowledge of meaning, and (3) knowledge of context. In addition, it contains two components that differ between music and programming: (4) translation of information and (5) temporal and motor requirements. The experiments presented in this dissertation can be considered the first steps toward examining certain components of the proposed prediction model in detail, both when prediction works normally (congruent notation) and when prediction is violated (incongruent notation). In order to study the reading of surprising incongruent patterns in music and code, special experimental settings, which provide the solution to Aim 2, were developed for music reading and created for code reading. The setup selected for the music reading study was based on a prior study (Penttinen et al., 2015), in which incongruences were introduced into the “Mary Had a Little Lamb” melody and all music performances were temporally controlled using a metronome. Experiment 1 developed this setup by inserting incongruent notes into the melody in two different tonalities and by asking participants either to play the notation on a piano or to sing from it. Thus, the music reading experiment focused on the first, second, fourth, and fifth components of the proposed model in music reading. It explored how the meaning of congruent and incongruent musical symbols is processed by experienced music readers. In addition, it explored the translation of music information into two performance modes (singing and playing the piano) that have different motor requirements.
Combining three different eye movement parameters allowed the researcher to describe different aspects of the cognitive processing of incongruent music reading: the temporal aspect, with the help of the eye-time span (ETS) parameter; the cognitive effort aspect, with the help of mean pupil size measured only in first-pass fixations; and the attention aspect, with the help of first-pass fixation duration. Experiment 2, on code reading, was carefully designed on the basis of the music reading study. Consequently, incongruences were introduced into different parts of a familiar notation. Bubble sort, a well-known sorting algorithm, was chosen as the programming analogue of the “Mary Had a Little Lamb” melody. As in the music reading study, all code reading performances were temporally controlled. The code-reading case study provided some insights into the first and second components of the proposed model in programming by investigating how an experienced programmer reads sorting algorithms with and without surprising patterns. It focused particularly on the phenomenon of an experienced reader overlooking the surprising pattern and taking it to be the original one, the so-called proof-readers’ error. In addition, this study explored the issue of the unit of analysis in code reading by comparing two different options: lines and elements. The study introduced saccade velocity as a parameter of cognitive effort for the analysis of incongruent code reading. Research findings from these experimental studies provided the solution to Aim 3 and revealed that, in both music and code reading, incongruent patterns in the notation led to changes in fixation and cognitive effort parameters (pupil size and saccadic velocity). In contrast to code reading, strict temporal requirements for the processing of incongruence exist in music reading. The application of the eye-time span (ETS) parameter, which describes the distance between the performer’s gaze and musical time, allowed the researcher to investigate the temporal aspect of incongruence processing in the music reading experiment. Experienced readers had a longer ETS when they approached the incongruent part of the notation and a shorter ETS while they were struggling with the incongruent part. In addition to incongruent reading, the difference between performance modes of the same music task, associated with the translation of information and motor requirements, was studied in the music reading experiment by comparing singing and playing from music scores. Although the participants played incongruent melodies better than they sang them, the analysis of eye movement parameters suggested that singing might be less cognitively demanding than playing. These findings are discussed within the proposed theoretical model of prediction and associated expertise theories.
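    The eye-time span measure described above lends itself to a compact formulation. Below is a minimal sketch, assuming each fixation is logged with its onset time and the musical (beat) position of the fixated note, and that a metronome-driven mapping from clock time to the beat currently being performed is available; the function name and exact definition are illustrative, not taken from the dissertation.

```python
def eye_time_span(fixations, beat_at):
    """Eye-time span (ETS) per fixation: the distance, in beats, between
    the note the performer is looking at and the note currently being
    performed. Positive values mean the eyes run ahead of the performance.

    fixations: iterable of (onset_time_s, fixated_note_beat) pairs
    beat_at:   function mapping clock time (s) to the beat being performed
    (Illustrative formulation; the dissertation's definition may differ.)
    """
    return [note_beat - beat_at(t) for t, note_beat in fixations]

# Example: a steady 120 bpm performance, i.e. 2 beats per second.
ets = eye_time_span([(1.0, 3.0), (2.0, 4.5)], beat_at=lambda t: 2.0 * t)
# -> [1.0, 0.5]: the gaze was one beat ahead, then half a beat ahead.
```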

    An Investigation of Reading Development Through Sensitivity to Sublexical Units

    The present dissertation provides a novel perspective on the study of reading, focusing on sensitivity to sublexical units across reading development. Work towards this thesis has been conducted at SISSA and Macquarie University. The first study is an eye tracking experiment on natural reading, with 140 developing readers and 33 adult participants, who silently read multiline passages from story books in Italian. A developmental database of eye tracking during natural reading was created, filling a gap in the literature. We replicated well-documented developmental trends of reading behavior (e.g., reading rate and skipping rate increasing with age) and effects of word length and frequency on eye tracking measures. The second study, in collaboration with Dr Jon Carr, is a methodological paper presenting algorithms for accuracy enhancement of eye tracking recordings in multiline reading. Using the above-mentioned dataset and computational simulations, we assessed the performance of several algorithms (including two novel methods that we proposed) on the correction of vertical drift, the progressive displacement of fixation registrations on the vertical axis over time. We provided guidance for eye tracking researchers in the application of these methods, and one of the novel algorithms (based on Dynamic Time Warping) proved particularly promising in realigning fixations, especially in child recordings. This manuscript has recently been accepted for publication in Behavior Research Methods. In the third study, I examined sensitivity to statistical regularities in letter co-occurrence throughout reading development, by analysing the effects of n-gram frequency metrics on eye-tracking measures. To this end, the EyeReadIt eye-tracking corpus (presented in the first study) was used. Our results suggest that n-gram frequency effects (in particular those related to maximum/average frequency metrics) are present even in developing readers, indicating that sensitivity to sublexical orthographic regularities in reading emerges as soon as the developing reading system can pick it up – in the case of this study, as early as third grade. The results bear relevant implications for extant theories of learning to read, which largely overlook the contribution of statistical learning to reading acquisition. The fourth study is a magnetoencephalography experiment conducted at Macquarie University, in collaboration with Dr Lisi Beyersmann, Prof Paul Sowman, and Prof Anne Castles, on 28 adults and 17 children (5th and 6th grade). We investigated selective neural responses to morphemes at different stages of reading development, using Fast Periodic Visual Stimulation (FPVS) combined with an oddball design. Participants were presented with rapid sequences (6 Hz) of pseudoword combinations of stem/nonstem and suffix/nonsuffix components. Interleaved in this stream, oddball stimuli appeared periodically every fifth item (1.2 Hz) and were specifically designed to examine stem or suffix detection (e.g., stem+suffix oddballs, such as softity, were embedded in a sequence of nonstem+suffix base items, such as terpity). We predicted that neural responses at the oddball stimulation frequency (1.2 Hz) would reflect the detection of morphemes in the oddball stimuli. Sensor-level analysis revealed a selective response in a left occipito-temporal region of interest when the oddball stimuli were fully decomposable pseudowords.
This response emerged for adults and children alike, showing that automatic morpheme identification occurs at relatively early stages of reading development, in line with major accounts of morphological decomposition. Critically, these findings also suggest that morpheme identification is modulated by the context in which the morphemes appear.
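    The vertical-drift correction described in the second study is essentially a sequence-alignment problem, and the Dynamic Time Warping treatment can be sketched compactly. Below is a minimal Python sketch, assuming fixations arrive in chronological reading order and text lines are represented by their vertical midpoints; it illustrates the general DTW idea only, not the published Behavior Research Methods algorithm or its implementation.

```python
import numpy as np

def dtw_line_assignment(fix_y, line_y):
    """Assign each fixation to a text line by aligning the chronological
    sequence of fixation y-coordinates to the top-to-bottom sequence of
    line midpoints with dynamic time warping (illustrative sketch)."""
    n, m = len(fix_y), len(line_y)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(fix_y[i - 1] - line_y[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # stay on the same line
                                 cost[i - 1, j - 1],  # advance to the next line
                                 cost[i, j - 1])      # pass over a line
    # Backtrace the optimal alignment to label each fixation with a line.
    i, j, lines = n, m, [0] * n
    while i > 0:
        lines[i - 1] = j - 1
        step = int(np.argmin([cost[i - 1, j], cost[i - 1, j - 1], cost[i, j - 1]]))
        i, j = (i - 1, j) if step == 0 else (i - 1, j - 1) if step == 1 else (i, j - 1)
    return lines  # drift is corrected by snapping each y to line_y[lines[k]]

# Example: drifting fixations over three lines at y = 100, 140, 180.
print(dtw_line_assignment([102, 108, 111, 146, 150, 188], [100, 140, 180]))
```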
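    The FPVS oddball design of the fourth study is also easy to make concrete: base pseudowords are presented at 6 Hz, and every fifth item is an oddball, yielding a 1.2 Hz oddball frequency. The sketch below builds such a stream; the item lists and the absence of counterbalancing are simplifications, not the study's actual stimulus protocol.

```python
import random

def fpvs_stream(base_items, oddball_items, n_items=300, period=5):
    """Build an FPVS oddball sequence: base items at the 6 Hz presentation
    rate, with an oddball at every `period`-th position, so the oddball
    frequency is 6 Hz / 5 = 1.2 Hz. (Sketch; stimulus lists simplified.)"""
    return [
        random.choice(oddball_items) if (i + 1) % period == 0
        else random.choice(base_items)
        for i in range(n_items)
    ]

# e.g. stem+suffix oddballs ("softity") among nonstem+suffix bases ("terpity")
stream = fpvs_stream(["terpity"], ["softity"], n_items=10)
```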

    SEAM: An Integrated Activation-Coupled Model of Sentence Processing and Eye Movements in Reading

    Models of eye-movement control during reading, developed largely within psychology, usually focus on visual, attentional, lexical, and motor processes but neglect post-lexical language processing; by contrast, models of sentence comprehension, developed largely within psycholinguistics, generally focus only on post-lexical language processes. We present a model that combines these two research threads by integrating eye-movement control and sentence processing. Developing such an integrated model is extremely challenging and computationally demanding, but the integration is an important step toward complete mathematical models of natural language comprehension in reading. We combine the SWIFT model of eye-movement control (Seelig et al., 2020, doi:10.1016/j.jmp.2019.102313) with key components of the Lewis and Vasishth sentence processing model (Lewis & Vasishth, 2005, doi:10.1207/s15516709cog0000_25). This integration becomes possible, for the first time, due in part to recent advances in parameter identification for dynamical models, which allow us to investigate profile log-likelihoods for individual model parameters. We present a fully implemented proof-of-concept model demonstrating how such an integration can be achieved; our approach includes Bayesian model inference with Markov Chain Monte Carlo (MCMC) sampling as a key computational tool. The integrated model, SEAM, can successfully reproduce eye movement patterns that arise from similarity-based interference in reading. To our knowledge, this is the first integration of a complete process model of eye-movement control with linguistic dependency-completion processes in sentence comprehension. In future work, this proof-of-concept model will need to be evaluated against a comprehensive set of benchmark data.
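    Profile log-likelihoods, mentioned above as a key to parameter identification, follow a generic recipe: fix one parameter at each value on a grid and maximize the likelihood over the remaining free parameters. Below is a minimal sketch of that recipe, assuming a user-supplied `loglik(theta)` function; SEAM's actual likelihood over fixation sequences is far more involved, so this is only the general idea.

```python
import numpy as np
from scipy.optimize import minimize

def profile_loglik(loglik, theta_hat, index, grid):
    """Profile log-likelihood for parameter `index`: fix it at each grid
    value and maximize `loglik` over the remaining free parameters.
    (Generic recipe; not SEAM's own likelihood or optimizer settings.)"""
    theta_hat = np.asarray(theta_hat, dtype=float)
    free = [k for k in range(theta_hat.size) if k != index]
    profile = []
    for value in grid:
        def neg_ll(free_vals):
            theta = theta_hat.copy()
            theta[index] = value          # pin the profiled parameter
            theta[free] = free_vals       # optimize the rest
            return -loglik(theta)
        res = minimize(neg_ll, theta_hat[free], method="Nelder-Mead")
        profile.append(-res.fun)
    return np.array(profile)

# Toy example: quadratic log-likelihood with its optimum at (1.0, 2.0).
ll = lambda th: -((th[0] - 1.0) ** 2 + (th[1] - 2.0) ** 2)
curve = profile_loglik(ll, theta_hat=[0.0, 0.0], index=0, grid=np.linspace(0, 2, 5))
```

    A parameter is identifiable to the extent that this curve has a sharp, well-located peak; flat profiles signal parameters the data cannot pin down.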

    Linguistic processes do not beat visuo-motor constraints, but they modulate where the eyes move regardless of word boundaries: Evidence against top-down word-based eye-movement control during reading

    Where readers move their eyes while proceeding forward along lines of text has long been assumed to be determined in a top-down, word-based manner. According to this classical view, readers of alphabetic languages would invariably program their saccades towards the center of peripheral target words, selected based on the (expected) needs of ongoing (word-identification) processing, and the variability in within-word landing positions would result exclusively from systematic and random errors. Here we put this predominant hypothesis to a strong test by estimating the respective influences of language-related variables (word frequency and word predictability) and lower-level visuo-motor factors (word length and saccadic launch-site distance to the beginning of words) on both word-skipping likelihood and within-word landing positions. Our eye-movement data were collected while forty participants read 316 pairs of sentences that differed by only one word, the prime; this was either semantically related or unrelated to a following test word of variable frequency and length. We found that low-level visuo-motor variables largely predominated in determining which word would be fixated next and where in a word the eyes would land. In comparison, language-related variables had only tiny influences. Yet, linguistic variables affected both the likelihood of word skipping and within-word initial landing positions, depending on the words’ length and on how far on average the eyes landed from the word boundaries, provided the word could benefit from peripheral preview. These findings provide a strong case against the predominant word-based account of eye-movement guidance during reading, by showing that saccades are primarily driven by low-level visuo-motor processes, regardless of word boundaries, while being subject to subtle, one-off, language-based modulations. Our results also suggest that overall distributions of saccade landing positions, rather than truncated within-word landing-site distributions, should be used to better understand eye-movement guidance during reading.

    An Application of Deep-Learning to Understand Human Perception of Art

    Eye movement patterns are known to differ when viewing stimuli under different tasks, but less is known about how these patterns change as a function of expertise. When a particular visual pattern is viewed, a particular sequence of eye movements is executed; this sequence is defined as a scanpath. In this work we attempted to answer the question, “Do art novices and experts look at paintings differently?” If they do, we should be able to discriminate between the two groups using machine learning applied to their scanpaths. This can be done using algorithms for Multi-Fixation Pattern Analyses (MFPA). MFPA is a family of machine learning algorithms for making inferences about people from their gaze patterns. MFPA and related approaches have been widely used to study viewing behavior during visual tasks, but earlier approaches used only gaze position (x, y) information, together with duration and temporal order, and not the actual visual features in the image. In this work, we extend MFPA algorithms to use visual features in order to answer a question overlooked by most early studies: if a difference is found between experts and novices, how different are their viewing patterns, and do these differences hold for both low- and high-level image features? To address this, we combined MFPA with a deep Convolutional Neural Network (CNN). Instead of converting a trial’s 2-D fixation positions into Fisher Vectors, we extracted image features surrounding the fixations using a deep CNN and turned them into Fisher Vectors for the trial. The Fisher Vector is an image representation obtained by pooling local image features; it is frequently used as a global image descriptor in visual classification. We call this approach MFPA-CNN. While CNNs have previously been used to recognize and classify objects in paintings, this work goes a step further to study human perception of paintings. Ours is the first attempt to use MFPA and CNNs to study the viewing patterns of subjects in the field of art. If our approach succeeds in differentiating novices from experts, with and without instructions, when both low- and high-level CNN image features are used, we can demonstrate that novices and experts view art differently. The outcome of this study could then be used to further investigate which image features the subjects concentrate on. We expect this work to influence further research in image perception and experimental aesthetics.
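    The Fisher-vector pooling step above can be made concrete. The snippet below encodes per-fixation CNN descriptors against a diagonal-covariance Gaussian mixture, keeping only the first-order statistics with the usual power and L2 normalisations; it is a generic, simplified encoding for illustration, not the authors' exact MFPA-CNN pipeline, and the stand-in random features merely mark where real CNN activations would go.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fisher_vector(descriptors, gmm):
    """First-order Fisher-vector encoding of per-fixation CNN descriptors
    against a diagonal-covariance GMM (simplified: second-order terms and
    the authors' exact pipeline are omitted)."""
    q = gmm.predict_proba(descriptors)               # (N, K) soft assignments
    diff = (descriptors[:, None, :] - gmm.means_) / np.sqrt(gmm.covariances_)
    fv = (q[:, :, None] * diff).sum(axis=0) / len(descriptors)   # (K, D)
    fv /= np.sqrt(gmm.weights_)[:, None]             # per-component scaling
    fv = fv.ravel()
    fv = np.sign(fv) * np.sqrt(np.abs(fv))           # power normalisation
    return fv / (np.linalg.norm(fv) + 1e-12)         # L2 normalisation

# Fit the visual vocabulary on descriptors pooled over all trials, then
# encode each trial's fixation descriptors as one fixed-length vector.
rng = np.random.default_rng(0)
all_descriptors = rng.normal(size=(1000, 64))        # stand-in CNN features
gmm = GaussianMixture(n_components=8, covariance_type="diag").fit(all_descriptors)
trial_fv = fisher_vector(rng.normal(size=(40, 64)), gmm)
```

    The fixed-length trial vectors can then be fed to any standard classifier (e.g., a linear SVM) to attempt the expert/novice discrimination the abstract describes.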

    Vector Associative Maps: Unsupervised Real-time Error-based Learning and Control of Movement Trajectories

    This article describes neural network models for adaptive control of arm movement trajectories during visually guided reaching and, more generally, a framework for unsupervised real-time error-based learning. The models clarify how a child, or untrained robot, can learn to reach for objects that it sees. Piaget provided basic insights with his concept of a circular reaction: as an infant makes internally generated movements of its hand, the eyes automatically follow this motion. A transformation is learned between the visual representation of hand position and the motor representation of hand position. Learning of this transformation eventually enables the child to accurately reach for visually detected targets. Grossberg and Kuperstein have shown how the eye movement system can use visual error signals to correct movement parameters via cerebellar learning. Here it is shown how endogenously generated arm movements lead to adaptive tuning of arm control parameters. These movements also activate the target position representations that are used to learn the visuo-motor transformation that controls visually guided reaching. The AVITE model presented here is an adaptive neural circuit based on the Vector Integration to Endpoint (VITE) model for arm and speech trajectory generation of Bullock and Grossberg. In the VITE model, a Target Position Command (TPC) represents the location of the desired target. The Present Position Command (PPC) encodes the present hand-arm configuration. The Difference Vector (DV) population continuously computes the difference between the PPC and the TPC. A speed-controlling GO signal multiplies DV output. The PPC integrates the (DV)·(GO) product and generates an outflow command to the arm. Integration at the PPC continues at a rate dependent on GO signal size until the DV reaches zero, at which time the PPC equals the TPC. The AVITE model explains how self-consistent TPC and PPC coordinates are autonomously generated and learned. Learning of AVITE parameters is regulated by activation of a self-regulating Endogenous Random Generator (ERG) of training vectors. Each vector is integrated at the PPC, giving rise to a movement command. The generation of each vector induces a complementary postural phase during which ERG output stops and learning occurs. Then a new vector is generated and the cycle is repeated. This cyclic, biphasic behavior is controlled by a specialized gated dipole circuit. ERG output autonomously stops in such a way that, across trials, a broad sample of workspace target positions is generated. When the ERG shuts off, a modulator gate opens, copying the PPC into the TPC. Learning of a transformation from TPC to PPC occurs using the DV as an error signal that is zeroed due to learning. This learning scheme is called a Vector Associative Map, or VAM. The VAM model is a general-purpose device for autonomous real-time error-based learning and performance of associative maps. The DV stage serves the dual function of reading out new TPCs during performance and reading in new adaptive weights during learning, without a disruption of real-time operation. VAMs thus provide an on-line unsupervised alternative to the off-line properties of supervised error-correction learning algorithms. VAMs and VAM cascades for learning motor-to-motor and spatial-to-motor maps are described.
VAM models and Adaptive Resonance Theory (ART) models exhibit complementary matching, learning, and performance properties that together provide a foundation for designing a total sensory-cognitive and cognitive-motor autonomous system.
    National Science Foundation (IRI-87-16960, IRI-87-6960); Air Force Office of Scientific Research (90-0175); Defense Advanced Research Projects Agency (90-0083)
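    The core VITE dynamics summarized above reduce to a small integrator loop: DV tracks TPC − PPC, and PPC integrates the gated (DV)·(GO) product until DV reaches zero. A minimal Euler-integration sketch follows; the gain, step size, and constant GO profile are illustrative choices, not values from the article.

```python
import numpy as np

def vite_reach(tpc, ppc0, go=1.0, rate=4.0, dt=0.01, steps=400):
    """Euler simulation of the core VITE loop: the Difference Vector
    DV = TPC - PPC, and the Present Position Command integrates the
    gated product GO * DV, so PPC converges on TPC as DV -> 0.
    (Gain, step size, and the constant GO signal are illustrative.)"""
    tpc = np.asarray(tpc, dtype=float)
    ppc = np.asarray(ppc0, dtype=float).copy()
    path = [ppc.copy()]
    for _ in range(steps):
        dv = tpc - ppc                  # DV continuously computes TPC - PPC
        ppc += dt * rate * go * dv      # PPC integrates the (DV)·(GO) product
        path.append(ppc.copy())
    return np.array(path)

# Reach from the origin toward a 2-D target; the outflow trajectory
# converges on TPC, and a larger GO value speeds the movement without
# changing its endpoint.
trajectory = vite_reach(tpc=[1.0, 0.5], ppc0=[0.0, 0.0])
```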