387 research outputs found

    Max-Planck-Institute for Psycholinguistics: Annual Report 2001

    No full text

    Max Planck Institute for Psycholinguistics: Annual report 1996

    No full text

    Humanoid Robots

    For many years, humans have tried in many ways to recreate the complex mechanisms that make up the human body. The task is extremely complicated, and the results are not yet fully satisfactory. However, with growing technological advances grounded in theoretical and experimental research, we have managed, to some extent, to copy or imitate certain systems of the human body. This research aims not only to create humanoid robots, a great part of them autonomous systems, but also to deepen our knowledge of the systems that make up the human body, with a view to applications in human rehabilitation technology, bringing together studies related not only to robotics but also to biomechanics, biomimetics, and cybernetics, among other areas. This book presents a series of studies inspired by this ideal, carried out by researchers worldwide, that analyze and discuss diverse subjects related to humanoid robots. The contributions explore aspects of robotic hands, learning, language, vision, and locomotion.

    Acquisition of Verbal Aspect in L2 English by advanced learners with L1 Russian and L1 Norwegian: A web-based eye tracking study

    It is well known that similarities between the L1 and the L2 (also L3, etc.) facilitate language acquisition, whereas significant differences between them have non-facilitating effects. These effects are known as Cross-Linguistic Influence (CLI). The main objective of the current study is to investigate the CLI experienced by highly proficient L2 English speakers in the acquisition of grammatical aspect. In order to investigate and compare different L2 processing patterns, I tested L1 speakers of a language with an obligatory contrast between perfective and imperfective aspect (Russian) and of a language without such a distinction (Norwegian). The participants recruited for this experiment were university students with an advanced level of proficiency in English, and the groups were closely matched for proficiency. From the perspective of grammatical aspect, neither of these languages bears complete similarity to English. Moreover, the two languages differ dramatically in how they encode aspectual semantics in their grammar; hence we hoped to find substantial differences in the processing and acquisition of the English system due to CLI. In Russian, with its perfective/imperfective contrast, aspectual information is obligatorily encoded in the verb form. Speakers of Russian link imperfective aspect with ongoing events and perfective aspect with completed events. In Norwegian, on the other hand, there is no grammatical way of encoding aspectual differences, i.e., the same verbal forms are used to refer to either ongoing or completed events. As for English, there are specialized forms that encode progressive meaning (e.g., the Present and Past Progressive), but the jury is still out as to whether the Simple Past forms encode perfectivity or should be treated as aspectually neutral.
The goal of this thesis is thus to investigate the semantic acquisition and processing of the English Past Progressive and Simple Past forms by studying online changes in the gaze patterns of L2 listeners with L1 Russian and L1 Norwegian. The thesis aims to answer the following research questions: RQ 1: Do native speakers of Russian show a strong opposition between the Simple Past and the Past Progressive in L2 English due to the transfer of a similar opposition from their L1 at the processing level? RQ 2: How will Norwegian L1 speakers behave in the online eye-tracking picture-sentence matching task? RQ 3: Is there any difference between online and offline results in the L1 Norwegian or the L1 Russian group? The methodology used to answer these research questions was web-based eye tracking. The experiment was implemented on the JATOS platform using the Webgazer.js software. The participants performed a sentence-picture matching task: they viewed visual displays with two pictures on the screen and listened to pre-recorded audio stimuli while their eye movements were tracked. This setup allowed collecting both processing data and the conscious choices made after each sentence. The task contained audio stimuli of sentences with Past Simple and Past Progressive verb forms, and visual stimuli depicting ongoing and completed events. The results of the experiment show that: 1) Both groups have a strong preference for the ongoing-event picture when they listen to sentences with the verb in the Past Progressive form. The offline responses also reflect this preference. This corresponds to the pattern exhibited by L1 speakers of English. 2) L1 speakers of Russian have a strong preference for the completed-event picture when they listen to sentences with the verb in the Past Simple form. The offline responses also reflect this preference.
This does not correspond to the pattern exhibited by L1 speakers of English, who had no preference for either the completed- or the ongoing-event picture in this condition. 3) L1 speakers of Norwegian have a weaker, but still substantial, preference for the ongoing-event picture when they listen to sentences with the verb in the Past Simple form. The offline responses also reflect this preference. This does not correspond to the pattern exhibited by L1 speakers of English, who had no preference for either picture in this condition. Taken together, the results indicate that while learners from both L1s converge on a target-like interpretation of the Past Progressive form, their interpretation of the Past Simple form deviates from that of native speakers even at advanced levels of proficiency. We argue that this is likely due to CLI, with L1 Russian speakers mapping the semantic opposition between imperfective and perfective aspect onto English, and L1 Norwegian speakers linking the English Simple Past to the Norwegian simple past tense forms.
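The gaze-preference measure at the heart of the study can be sketched in a few lines. This is a minimal illustration only: the data format, field names, and values below are hypothetical, not the study's actual pipeline or results.

```python
# Hypothetical sketch: compute, per L1 group and verb-form condition, the
# proportion of gaze samples directed at the ongoing-event picture.
from collections import defaultdict

def gaze_preference(samples):
    """samples: iterable of dicts with keys 'group' (e.g. 'Russian'),
    'condition' (e.g. 'SimplePast'), and 'target' ('ongoing'/'completed')."""
    counts = defaultdict(lambda: [0, 0])  # key -> [ongoing hits, total]
    for s in samples:
        key = (s["group"], s["condition"])
        counts[key][1] += 1
        if s["target"] == "ongoing":
            counts[key][0] += 1
    return {k: hits / total for k, (hits, total) in counts.items()}

# Invented demo data, not the study's results.
demo = [
    {"group": "Russian", "condition": "SimplePast", "target": "completed"},
    {"group": "Russian", "condition": "SimplePast", "target": "completed"},
    {"group": "Russian", "condition": "SimplePast", "target": "ongoing"},
    {"group": "Norwegian", "condition": "SimplePast", "target": "ongoing"},
    {"group": "Norwegian", "condition": "SimplePast", "target": "ongoing"},
]
prefs = gaze_preference(demo)
print(prefs[("Russian", "SimplePast")])  # prints 0.3333333333333333
```

A real analysis would aggregate over time windows and fit statistical models, but the preference score itself reduces to this kind of proportion.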

    Turn-Taking in Human Communicative Interaction

    The core use of language is in face-to-face conversation, which is characterized by rapid turn-taking. This turn-taking poses a number of central puzzles for the psychology of language. Consider, for example, that in large corpora the gap between turns is on the order of 100 to 300 ms, whereas the latencies involved in language production range from a minimum of about 600 ms (for a single word) to about 1500 ms (for a simple sentence). This implies that participants in conversation are predicting the end of the incoming turn and preparing their response in advance. But how is this done? What aspects of this prediction are performed when? What happens when the prediction is wrong? What stops participants from coming in too early? If the system runs on prediction, why is there consistently a mode of 100 to 300 ms in response time? The timing puzzle raises further puzzles: it seems that comprehension must run in parallel with preparation for production, yet it has been presumed that there are strict cognitive limitations on more than one central process running at a time. How is this bottleneck overcome? Far from being 'easy', as some psychologists have suggested, conversation may be one of the most demanding cognitive tasks in our everyday lives. Further questions naturally arise: how do children learn to master this demanding task, and what is the developmental trajectory in this domain? Research shows that aspects of turn-taking such as its timing are remarkably stable across languages and cultures, but the word order of languages varies enormously. How then does prediction of the incoming turn work when the verb (often the informational nugget in a clause) comes at the end? Conversely, how can production work fast enough in languages that have the verb at the beginning, which requires early planning of the whole clause? What happens when one changes modality, as in sign languages: with the loss of channel constraints, is turn-taking much freer?
And what about face-to-face communication amongst hearing individuals: do gestures, gaze, and other bodily behaviors facilitate turn-taking? One can also ask the phylogenetic question: how did such a system evolve? There seem to be parallels (analogies) in duetting bird species and in a variety of monkey species, but there is little evidence of anything like this among the great apes. All this constitutes a neglected set of problems at the heart of the psychology of language and of the language sciences. This research topic welcomes contributions from right across the board, for example from psycholinguists, developmental psychologists, students of dialogue and conversation analysis, linguists interested in language use, phoneticians, corpus analysts, and comparative ethologists or psychologists. We welcome contributions of all sorts, for example original research papers, opinion pieces, and reviews of work in subfields that may not be fully understood in other subfields.
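The gap statistics described above (inter-turn offsets with a 100-300 ms mode) can be illustrated with a small sketch. The turn timings below are invented for demonstration; real corpus pipelines are of course far more involved.

```python
# Illustrative sketch: given turn (speaker, start_ms, end_ms) records,
# compute floor-transfer offsets (gap > 0, overlap < 0) at speaker
# changes, then find the modal 100 ms bin of the distribution.
from collections import Counter

def transfer_offsets(turns):
    """turns: list of (speaker, start_ms, end_ms), sorted by start time."""
    offsets = []
    for prev, nxt in zip(turns, turns[1:]):
        if prev[0] != nxt[0]:                  # only count speaker changes
            offsets.append(nxt[1] - prev[2])   # next start minus prev end
    return offsets

def modal_bin(offsets, width=100):
    bins = Counter((o // width) * width for o in offsets)
    return bins.most_common(1)[0][0]

# Invented mini-corpus of five turns between speakers A and B.
turns = [("A", 0, 900), ("B", 1100, 2000), ("A", 2150, 3000),
         ("B", 3180, 4000), ("A", 4150, 5000)]
offs = transfer_offsets(turns)
print(offs)             # [200, 150, 180, 150]
print(modal_bin(offs))  # 100  (i.e. the 100-199 ms bin)
```

With corpus-scale data, the same histogram is what yields the reported 100-300 ms mode, against production latencies that start at roughly 600 ms.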

    Computer vision methods for unconstrained gesture recognition in the context of sign language annotation

    This PhD thesis concerns the study of computer vision methods for the automatic recognition of unconstrained gestures in the context of Sign Language annotation. Sign Language (SL) is a visual-gestural language developed by deaf communities. A continuous SL utterance consists of a sequence of signs performed one after another, involving manual features together with facial expressions and upper-body movements that convey information in parallel. Even though standard signs are defined in dictionaries, their realization shows huge variability caused by context dependency. In addition, signs are often linked by movement epenthesis, i.e., meaningless transitional gestures between signs. This extreme variability and the co-articulation effect make automatic SL processing challenging. Numerous annotated SL video corpora are therefore needed in order to study the language and to train machine-learning methods. SL video corpora are generally annotated manually by linguists or by computer scientists experienced in SL, which is error-prone, unreproducible, and extremely time-consuming; moreover, the quality of the annotation depends on the annotator's knowledge of SL. Combining the annotator's expertise with automatic image processing facilitates the task, saving time and increasing robustness. The goal of this research is to study and develop image processing techniques to assist the annotation of SL video corpora: body-part tracking, hand segmentation, temporal segmentation, and gloss recognition. Throughout this thesis we address the problem of gloss annotation of SL video corpora. First, we aim to detect the boundaries marking the beginning and end of each sign. This annotation method requires several low-level processing steps to segment the signs and to extract motion and hand-shape features. We first propose a particle-filter-based method for tracking body parts (hands and face) that is robust to occlusions. Next, a hand segmentation algorithm is developed to extract the hand region even when the hand is in front of the face. Motion features are then used to produce an initial temporal segmentation of the signs, which is subsequently refined using hand-shape features; these make it possible to discard segmentation boundaries detected in the middle of a sign. Once the signs are segmented, visual features are extracted for gloss recognition using phonological (lexical) models of signs. We evaluated our algorithms on international corpora in order to show their advantages and limitations. The evaluation demonstrates the robustness of the proposed methods with respect to high dynamics and numerous occlusions between body parts. The resulting annotation is independent of the annotator and yields a substantial gain in consistency.
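The particle-filter idea behind the body-tracking step can be sketched as follows. This is a generic toy implementation under simplifying assumptions (a 2D point target with Gaussian motion and observation models), not the thesis's actual tracker.

```python
# Toy particle filter: particles are candidate 2D hand positions. Each
# step they are diffused (predict), weighted by an observation
# likelihood (update), and resampled in proportion to their weights.
import math
import random

random.seed(0)  # deterministic demo

def step(particles, measurement, motion_std=5.0, obs_std=8.0):
    # 1) Predict: diffuse each particle with Gaussian motion noise.
    moved = [(x + random.gauss(0, motion_std),
              y + random.gauss(0, motion_std)) for x, y in particles]
    # 2) Weight: Gaussian likelihood of the measurement given each particle.
    weights = []
    for x, y in moved:
        d2 = (x - measurement[0]) ** 2 + (y - measurement[1]) ** 2
        weights.append(math.exp(-d2 / (2 * obs_std ** 2)))
    total = sum(weights)
    weights = [w / total for w in weights]
    # 3) Resample particles in proportion to their weights.
    return random.choices(moved, weights=weights, k=len(moved))

def estimate(particles):
    n = len(particles)
    return (sum(p[0] for p in particles) / n,
            sum(p[1] for p in particles) / n)

# Track a simulated hand drifting diagonally; measurements are noisy.
particles = [(0.0, 0.0)] * 200
for t in range(1, 11):
    true_pos = (5 * t, 5 * t)
    meas = (true_pos[0] + random.gauss(0, 3),
            true_pos[1] + random.gauss(0, 3))
    particles = step(particles, meas)
x, y = estimate(particles)
print(round(x), round(y))  # estimate near the true final position (50, 50)
```

A real tracker of the kind the thesis describes would use image-based likelihoods (skin color, shape) rather than a simulated measurement, which is precisely what lets it survive occlusions where a single-hypothesis tracker fails.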