5,307 research outputs found

    First impressions: A survey on vision-based apparent personality trait analysis

    © 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
    Personality analysis has been widely studied in psychology, neuropsychology, and signal processing, among other fields. In the past few years, it has also become an attractive research area in visual computing. From the computational point of view, speech and text have been by far the most widely considered cues for analyzing personality. However, there has recently been increasing interest from the computer vision community in analyzing personality from visual data. Recent computer vision approaches are able to accurately analyze human faces, body postures and behaviors, and use this information to infer apparent personality traits. Because of the overwhelming research interest in this topic, and of the potential impact that such methods could have on society, we present in this paper an up-to-date review of existing vision-based approaches for apparent personality trait recognition. We describe seminal and cutting-edge works on the subject, discussing and comparing their distinctive features and limitations. Future avenues of research in the field are identified and discussed. Furthermore, aspects of subjectivity in data labeling/evaluation, as well as current datasets and challenges organized to push research in the field, are reviewed.
    Peer reviewed. Postprint (author's final draft).

    Fusion of Multimodal Information in Music Content Analysis

    Music is often processed through its acoustic realization. This is restrictive in the sense that music is clearly a highly multimodal concept, where various types of heterogeneous information can be associated with a given piece of music (a musical score, musicians' gestures, lyrics, user-generated metadata, etc.). This has recently led researchers to approach music through its various facets, giving rise to "multimodal music analysis" studies. This article gives a synthetic overview of methods that have been successfully employed in multimodal signal analysis. In particular, their use in music content processing is discussed in more detail through five case studies that highlight different multimodal integration techniques. The case studies include an example of cross-modal correlation for music video analysis, an audiovisual drum transcription system, a description of the concept of informed source separation, a discussion of multimodal dance-scene analysis, and an example of user-interactive music analysis. In the light of these case studies, some perspectives on multimodality in music processing are finally suggested.

    Audiovisual correspondences in Sergei Eisenstein’s Alexander Nevsky: a case study in viewer attention

    Cognitive film theory is an approach to analyzing film that bridges the traditionally segregated disciplines of film theory, philosophy, psychology and the neurosciences. Considerable work has already been presented from the perspective of film theory that uses existing empirical evidence of psychological phenomena to inform our understanding of film viewers and the form of film itself. But can empirical psychology also provide ways to directly test the insights generated by the theoretical study of film? In this chapter I will present a case study in which eye-tracking is used to validate Russian film director Sergei Eisenstein’s intuitions about viewer attention during a sequence from Alexander Nevsky (1938).

    The listening talker: A review of human and algorithmic context-induced modifications of speech

    Speech output technology is finding widespread application, including in scenarios where intelligibility might be compromised, at least for some listeners, by adverse conditions. Unlike most current algorithms, talkers continually adapt their speech patterns in response to the immediate context of spoken communication, where the type of interlocutor and the environment are the dominant situational factors influencing speech production. Observations of talker behaviour can motivate the design of more robust speech output algorithms. Starting with a listener-oriented categorisation of possible goals for speech modification, this review article summarises the extensive set of behavioural findings related to human speech modification, identifies which factors appear to be beneficial, and goes on to examine previous computational attempts to improve intelligibility in noise. The review concludes by tabulating 46 speech modifications, many of which have yet to be perceptually or algorithmically evaluated. Consequently, the review provides a roadmap for future work on improving the robustness of speech output.

    Music - Media - History

    Music and sound shape the emotional content of audio-visual media and carry different meanings. This volume considers audio-visual material as a primary source for historiography. By analyzing how the same sounds are used in different media contexts at different times, the contributors intend to challenge the linear perspective of (music) history based on canonic authority. The book discusses AV-Documents (analysis in context), methodological questions (implications for research, education, and popularization of knowledge), archives of cultural memory (from the perspective of Cultural Studies), as well as digitalization and its consequences (organization of knowledge).

    Music - Media - History: Re-Thinking Musicology in an Age of Digital Media

    Music and sound shape the emotional content of audio-visual media and carry different meanings. This volume considers audio-visual material as a primary source for historiography. By analyzing how the same sounds are used in different media contexts at different times, the contributors intend to challenge the linear perspective of (music) history based on canonic authority. The book discusses AV-Documents (analysis in context), methodological questions (implications for research, education, and popularization of knowledge), archives of cultural memory (from the perspective of Cultural Studies), as well as digitalization and its consequences (organization of knowledge).

    Change blindness: eradication of gestalt strategies

    Arrays of eight texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task in which there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval, and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation [Landman et al, 2003 Vision Research 43 149–164]. Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this, we changed the spatial position of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This may suggest two things: (i) Gestalt grouping is not used as a strategy in these tasks, and (ii) the argument that objects may be stored in and retrieved from a pre-attentional store during this task gains further weight.
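The reported statistic can be sanity-checked without a statistics package. The following is a minimal sketch, assuming a standard F-test with 1 and 4 degrees of freedom, so that F equals the square of a t statistic with 4 degrees of freedom and the closed-form t CDF for 4 degrees of freedom applies. The function names are illustrative and do not come from the abstract.

```python
import math

def t_cdf_4df(t: float) -> float:
    """Closed-form CDF of Student's t distribution with 4 degrees of freedom."""
    u = 1.0 + t * t / 4.0
    return 0.5 + 0.375 * (t / math.sqrt(u)) * (1.0 - (t * t / u) / 12.0)

def p_value_f_1_4(f_stat: float) -> float:
    """Upper-tail p-value for an F(1, 4) statistic, using F(1, n) = t(n)^2."""
    t = math.sqrt(f_stat)
    # The upper tail of F(1, 4) equals the two-sided tail of t(4).
    return 2.0 * (1.0 - t_cdf_4df(t))

p = p_value_f_1_4(2.565)  # F value as reported in the abstract
print(f"p = {p:.3f}")
```

Under these assumptions the result should land close to the reported p=0.185, well above the conventional 0.05 threshold, which is consistent with the "no significant difference" reading.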

    Neural oscillatory signatures of auditory and audiovisual illusions

    Questions of the relationship between human perception and brain activity can be approached from different perspectives: in the first, the brain is mainly regarded as a recipient and processor of sensory data. The corresponding research objective is to establish mappings of neural activity patterns and external stimuli. Alternatively, the brain can be regarded as a self-organized dynamical system, whose constantly changing state affects how incoming sensory signals are processed and perceived. The research reported in this thesis can chiefly be located in the second framework, and investigates the relationship between oscillatory brain activity and the perception of ambiguous stimuli. Oscillations are here considered as a mechanism for the formation of transient neural assemblies, which allows efficient information transfer. While the relevance of activity in distinct frequency bands for auditory and audiovisual perception is well established, different functional architectures of sensory integration can be derived from the literature. This dissertation therefore aims to further clarify the role of oscillatory activity in the integration of sensory signals towards unified perceptual objects, using illusion paradigms as tools of study. In study 1, we investigate the role of low-frequency power modulations and phase alignment in auditory object formation. We provide evidence that auditory restoration is associated with a power reduction, while the registration of an additional object is reflected by an increase in phase locking. In study 2, we analyze oscillatory power as a predictor of auditory influence on visual perception in the sound-induced flash illusion. We find that increased beta-/gamma-band power over occipitotemporal electrodes shortly before stimulus onset predicts the illusion, suggesting a facilitation of processing in polymodal circuits. In study 3, we address the question of whether visual influence on auditory perception in the ventriloquist illusion is reflected in primary sensory or higher-order areas. We establish an association between reduced theta-band power in mediofrontal areas and the occurrence of the illusion, which indicates a top-down influence on sensory decision-making. These findings broaden our understanding of the functional relevance of neural oscillations by showing that different processing modes, which are reflected in specific spatiotemporal activity patterns, operate in different instances of sensory integration.
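Phase locking across trials, as mentioned for study 1, is commonly quantified as the length of the mean unit phase vector (often called inter-trial phase coherence). The sketch below illustrates that general measure only. It is an assumption that a measure of this kind was used, since the abstract does not specify the computation, and the function name and sample phases are hypothetical.

```python
import cmath
import math

def phase_locking(phases):
    """Return |mean(exp(i*phi))| for phase angles in radians (0 = none, 1 = perfect)."""
    if not phases:
        raise ValueError("need at least one phase value")
    # Average the unit vectors on the complex plane, then take the length.
    mean_vector = sum(cmath.exp(1j * phi) for phi in phases) / len(phases)
    return abs(mean_vector)

aligned = [0.50, 0.52, 0.48, 0.51]               # hypothetical phases, tightly clustered
scattered = [k * math.pi / 2 for k in range(4)]  # hypothetical phases, evenly spread
print(phase_locking(aligned), phase_locking(scattered))
```

Tightly clustered phases give a value near 1, while phases spread evenly around the circle cancel out and give a value near 0.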