
    Multimodal Content Analysis for Effective Advertisements on YouTube

    The rapid advances in e-commerce and Web 2.0 technologies have greatly increased the impact of commercial advertisements on the general public. As a key enabling technology, a multitude of recommender systems exist that analyze user features and browsing patterns to recommend appealing advertisements to users. In this work, we study the attributes that characterize an effective advertisement and recommend a useful set of features to aid the design and production of commercial advertisements. We analyze the temporal patterns in the multimedia content of advertisement videos, including auditory, visual, and textual components, and study their individual roles and synergies in the success of an advertisement. The objective of this work is to measure the effectiveness of an advertisement and to recommend a set of features that advertisement designers can use to make it more successful and approachable to users. Our proposed framework employs the signal-processing technique of cross-modality feature learning, in which data streams from the different components are used to train separate neural network models that are then fused to learn a shared representation. A neural network model trained on this joint feature embedding is subsequently used as a classifier to predict advertisement effectiveness. We validate our approach using subjective ratings from a dedicated user study, the sentiment strength of online viewer comments, and a viewer-opinion metric based on the ratio of Likes to Views each advertisement received on an online platform.
    Comment: 11 pages, 5 figures, ICDM 201
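
    As a minimal sketch of the cross-modality fusion pipeline described above, assuming PyTorch: each modality gets its own encoder, the encodings are fused into a shared embedding, and a classifier predicts effectiveness. The layer sizes, encoder depths, and the binary output head are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class AdEffectivenessModel(nn.Module):
    """Illustrative late-fusion architecture: one encoder per modality,
    a fusion layer learning a shared representation, and a classifier."""
    def __init__(self, audio_dim=64, visual_dim=128, text_dim=32, embed_dim=128):
        super().__init__()
        # Separate networks learn modality-specific representations.
        self.audio_enc = nn.Sequential(nn.Linear(audio_dim, embed_dim), nn.ReLU())
        self.visual_enc = nn.Sequential(nn.Linear(visual_dim, embed_dim), nn.ReLU())
        self.text_enc = nn.Sequential(nn.Linear(text_dim, embed_dim), nn.ReLU())
        # Fusion layer learns a joint embedding across all modalities.
        self.fusion = nn.Sequential(nn.Linear(3 * embed_dim, embed_dim), nn.ReLU())
        # Binary head: effective vs. ineffective advertisement.
        self.classifier = nn.Linear(embed_dim, 2)

    def forward(self, audio, visual, text):
        joint = torch.cat([self.audio_enc(audio),
                           self.visual_enc(visual),
                           self.text_enc(text)], dim=-1)
        return self.classifier(self.fusion(joint))

model = AdEffectivenessModel()
logits = model(torch.randn(8, 64), torch.randn(8, 128), torch.randn(8, 32))
```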

    Cross-modal cue effects in motion processing

    The everyday environment brings to our sensory systems competing inputs from different modalities. The ability to filter these multisensory inputs in order to identify and efficiently utilize useful spatial cues is necessary to detect and process the relevant information. In the present study, we investigate how feature-based attention affects the detection of motion across sensory modalities. We sought to determine how subjects use intramodal, cross-modal auditory, and combined audiovisual motion cues to attend to specific visual motion signals. The results showed that, in most cases, both the visual and the auditory cues enhance feature-based orienting to a transparent visual motion pattern presented among distractor motion patterns. Whereas previous studies have shown cross-modal effects of spatial attention, our results demonstrate a spread of feature-based attention from cross-modal cues that were matched for the detection threshold of the visual target. These effects were robust in comparisons of valid vs. invalid cues, as well as in comparisons between cued and uncued valid trials. The effects of intramodal visual, cross-modal auditory, and bimodal cues also increased as a function of motion-cue salience. Our results suggest that orienting to visual motion patterns among distractors can be facilitated not only by intramodal priors, but also by feature-based cross-modal information from the auditory system.
    First author draft

    Ecological IVIS design: using EID to develop a novel in-vehicle information system

    New in-vehicle information systems (IVIS) are emerging which purport to encourage more environmentally friendly, or ‘green’, driving. Meanwhile, wider concerns about road safety and in-car distractions remain. The ‘Foot-LITE’ project is an effort to balance these issues, aimed at achieving safer and greener driving through real-time driving information presented via an in-vehicle interface that facilitates the desired behaviours while avoiding negative consequences. One way of achieving this is to use ecological interface design (EID) techniques. This article presents part of the formative human-centred design process for developing the in-car display, through a series of rapid prototyping studies comparing EID against conventional interface design principles. We focus primarily on the visual display, although some development of an ecological auditory display is also presented. Feedback from potential users as well as subject matter experts is discussed with respect to implications for future interface design in this field.

    A dual role for prediction error in associative learning

    Confronted with a rich sensory environment, the brain must learn statistical regularities across sensory domains to construct causal models of the world. Here, we used functional magnetic resonance imaging and dynamic causal modeling (DCM) to furnish neurophysiological evidence that statistical associations are learnt, even when task-irrelevant. Subjects performed an audio-visual target-detection task while being exposed to distractor stimuli. Unknown to them, auditory distractors predicted the presence or absence of subsequent visual distractors. We modeled incidental learning of these associations using a Rescorla-Wagner (RW) model. Activity in primary visual cortex and putamen reflected learning-dependent surprise: these areas responded progressively more to unpredicted, and progressively less to predicted visual stimuli. Critically, this prediction-error response was observed even when the absence of a visual stimulus was surprising. We investigated the underlying mechanism by embedding the RW model into a DCM to show that auditory to visual connectivity changed significantly over time as a function of prediction error. Thus, consistent with predictive coding models of perception, associative learning is mediated by prediction-error dependent changes in connectivity. These results posit a dual role for prediction-error in encoding surprise and driving associative plasticity.
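
    The Rescorla-Wagner model mentioned above reduces to a simple error-driven update; the sketch below assumes a single cue-outcome association and an illustrative learning rate, not the parameters fitted in the study.

```python
def rescorla_wagner(outcomes, alpha=0.2, v0=0.0):
    """Track associative strength v and prediction error for a sequence
    of binary outcomes (1 = visual distractor present, 0 = absent)."""
    v, errors = v0, []
    for lam in outcomes:
        delta = lam - v        # prediction error: the surprise signal
        v += alpha * delta     # learning-rate-weighted update
        errors.append(delta)
    return v, errors

# An auditory cue reliably followed by a visual distractor, then one
# omission: the expected-but-absent stimulus yields a negative error,
# i.e. surprise is signaled even when nothing appears.
v, errors = rescorla_wagner([1, 1, 1, 1, 0])
```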

    Action-based effects on music perception

    The classical, disembodied approach to music cognition conceptualizes action and perception as separate, peripheral processes. In contrast, embodied accounts of music cognition emphasize the central role of the close coupling of action and perception. It is a commonly established fact that perception spurs action tendencies. We present a theoretical framework that captures the ways in which the human motor system and its actions can reciprocally influence the perception of music. The cornerstone of this framework is the common coding theory, postulating a representational overlap in the brain between the planning, the execution, and the perception of movement. The integration of action and perception in so-called internal models is explained as a result of associative learning processes. Characteristic of internal models is that they allow intended or perceived sensory states to be transferred into corresponding motor commands (inverse modeling), and vice versa, to predict the sensory outcomes of planned actions (forward modeling). Embodied accounts typically refer to inverse modeling to explain action effects on music perception (Leman, 2007). We extend this account by pinpointing forward modeling as an alternative mechanism by which action can modulate perception. We provide an extensive overview of recent empirical evidence in support of this idea. Additionally, we demonstrate that motor dysfunctions can cause perceptual disabilities, supporting the main idea of the paper that the human motor system plays a functional role in auditory perception. The finding that music perception is shaped by the human motor system and its actions suggests that the musical mind is highly embodied. However, we advocate for a more radical approach to embodied (music) cognition in the sense that it needs to be considered as a dynamical process, in which aspects of action, perception, introspection, and social interaction are of crucial importance.
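
    The inverse/forward model pairing can be made concrete with a toy sketch; the linear "plant" below is a hypothetical stand-in for whatever sensorimotor mapping is actually learned.

```python
def forward_model(state: float, action: float) -> float:
    """Predict the sensory outcome of a planned action (forward modeling)."""
    return state + action  # hypothetical linear plant

def inverse_model(state: float, desired: float) -> float:
    """Recover the motor command that reaches a desired sensory state
    (inverse modeling); here it simply inverts the plant above."""
    return desired - state

state, goal = 0.0, 3.0
action = inverse_model(state, goal)        # intended sensory state -> command
predicted = forward_model(state, action)   # command -> predicted sensory outcome
assert predicted == goal                   # the prediction can bias perception
```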

    Aerospace Medicine and Biology: A continuing bibliography with indexes (supplement 314)

    This bibliography lists 139 reports, articles, and other documents introduced into the NASA scientific and technical information system in August 1988.

    True zero-training brain-computer interfacing: an online study

    Despite several approaches to realize subject-to-subject transfer of pre-trained classifiers, the full performance of a Brain-Computer Interface (BCI) for a novel user can only be reached by presenting the BCI system with data from that user. In typical state-of-the-art BCI systems with a supervised classifier, the labeled data are collected during a calibration recording, in which the user is asked to perform a specific task. Based on the known labels of this recording, the BCI's classifier can learn to decode the individual's brain signals. Unfortunately, this calibration recording consumes valuable time, and it is unproductive with respect to the final BCI application, e.g. text entry. The calibration period should therefore be reduced to a minimum, which is especially important for patients with a limited ability to concentrate. The main contribution of this manuscript is an online study of unsupervised learning in an auditory event-related potential (ERP) paradigm. Our results demonstrate that the calibration recording can be bypassed by using an unsupervised classifier that is initialized randomly and updated during usage. Initially, the unsupervised classifier tends to make decoding mistakes, as it may not yet have seen enough data to build a reliable model. By constantly re-analyzing the previously spelled symbols, these initial misspellings can be rectified post hoc once the classifier has learned to decode the signals. We compare the spelling performance of our unsupervised approach, and of its post-hoc variant, to the standard supervised calibration-based approach for n = 10 healthy users. To assess the learning behavior of our approach, it is trained unsupervised from scratch three times per user. Even with the relatively low SNR of an auditory ERP paradigm, the results show that after a limited number of trials (30 trials), the unsupervised approach performs comparably to a classic supervised model.
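
    Schematically, the unsupervised post-hoc spelling loop described above might look like the following; the simulated ERP features and the self-training update are illustrative stand-ins for the paper's actual unsupervised estimator, which the abstract does not specify.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_stim, n_feat = 30, 6, 16

# Simulated data: per trial, one feature vector (epoch) per candidate
# stimulus; the attended target carries an additive ERP pattern.
erp = rng.normal(0.5, 0.1, n_feat)
targets = rng.integers(0, n_stim, n_trials)
trials = rng.normal(0.0, 1.0, (n_trials, n_stim, n_feat))
for t, tgt in enumerate(targets):
    trials[t, tgt] += erp

w = rng.normal(size=n_feat)  # randomly initialized linear classifier
history, spelled = [], []

for t in range(n_trials):
    history.append(trials[t])
    seen = np.stack(history)                       # (t+1, n_stim, n_feat)
    # Unsupervised self-training update: treat the best-scoring epoch of
    # each trial as the target and point w at the target/non-target mean
    # difference (no labels are ever used).
    picks = np.argmax(seen @ w, axis=1)
    tgt_mean = seen[np.arange(len(seen)), picks].mean(axis=0)
    non_mean = ((seen.sum(axis=1) - seen[np.arange(len(seen)), picks])
                .mean(axis=0) / (n_stim - 1))
    w = tgt_mean - non_mean
    # Post-hoc re-analysis: re-decode all earlier symbols with the
    # improved model, rectifying initially misspelled symbols.
    spelled = list(np.argmax(seen @ w, axis=1))

accuracy = float(np.mean(np.array(spelled) == targets))
```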

    Neural connectivity in syntactic movement processing

    Linguistic theory suggests that non-canonical sentences subvert the dominant agent-verb-theme order in English via displacement of sentence constituents to argument (NP-movement) or non-argument positions (wh-movement). Both processes have been associated with the left inferior frontal gyrus and posterior superior temporal gyrus, but differences in neural activity and connectivity between movement types have not been investigated. In the current study, functional magnetic resonance imaging data were acquired from 21 adult participants during an auditory sentence-picture verification task, using passive and active sentences contrasted to isolate NP-movement, and object- and subject-cleft sentences contrasted to isolate wh-movement. Data from regions common to both movement types were then entered into a dynamic causal modeling analysis to examine effective connectivity for wh- and NP-movement. Results showed greater left inferior frontal gyrus activation for wh- > NP-movement, but no activation for NP- > wh-movement. Both types of movement elicited activity in the opercular part of the left inferior frontal gyrus, the left posterior superior temporal gyrus, and the left medial superior frontal gyrus. The dynamic causal modeling analyses indicated that neither movement type significantly modulated the connection from the left inferior frontal gyrus to the left posterior superior temporal gyrus, or vice versa, suggesting no connectivity differences between wh- and NP-movement. These findings support the idea that the increased complexity of wh-structures, compared to sentences with NP-movement, requires greater engagement of cognitive resources via increased neural activity in the left inferior frontal gyrus, but that both movement types engage similar neural networks.
    This work was supported by the NIH-NIDCD Clinical Research Center Grant P50DC012283 (PI: CT), and by a Graduate Research Grant and a School of Communication Graduate Ignition Grant from Northwestern University (awarded to EE).
    Published version
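
    For context, modulatory effects in dynamic causal modeling enter through a bilinear state equation, dx/dt = (A + u_mod * B) x + C * u_drive; the two-region simulation below is a schematic sketch in which all coupling values are illustrative, not estimates from this study.

```python
import numpy as np

# x: activity in [LIFG, LpSTG]. B encodes how a movement-type input
# modulates the LIFG -> LpSTG connection, the effect tested above
# (rows index target regions, columns index source regions).
A = np.array([[-1.0, 0.2],   # intrinsic (fixed) coupling, illustrative
              [ 0.4, -1.0]])
B = np.array([[0.0, 0.0],
              [0.3, 0.0]])   # modulation of the LIFG -> LpSTG connection
C = np.array([1.0, 0.0])     # driving auditory input enters LIFG

def step(x, u_drive, u_mod, dt=0.01):
    """One Euler step of the bilinear neuronal state equation."""
    dx = (A + u_mod * B) @ x + C * u_drive
    return x + dt * dx

x = np.zeros(2)
for _ in range(500):
    x = step(x, u_drive=1.0, u_mod=1.0)  # movement-type modulation "on"
```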