
    Visual Word Ambiguity


    A Mouth Full of Words: Visually Consistent Acoustic Redubbing

    This paper introduces a method for automatic redubbing of video that exploits the many-to-many mapping of phoneme sequences to lip movements modelled as dynamic visemes [1]. For a given utterance, the corresponding dynamic viseme sequence is sampled to construct a graph of possible phoneme sequences that synchronize with the video. When composed with a pronunciation dictionary and language model, this produces a vast number of word sequences that are in sync with the original video, literally putting plausible words into the mouth of the speaker. We demonstrate that traditional, one-to-many, static visemes lack flexibility for this application, as they produce significantly fewer word sequences. This work explores the natural ambiguity in visual speech and offers insights for automatic speech recognition, underlining the importance of language modelling.
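    To make the pipeline concrete, below is a minimal, self-contained sketch of the core idea under toy assumptions: a hypothetical viseme-to-phoneme mapping (VISEME_TO_PHONEMES) and pronunciation dictionary (PRON_DICT), neither taken from the paper, are expanded into every word that stays in sync with a given dynamic viseme sequence. A real system would additionally compose the resulting graph with a language model to rank whole word sequences.

```python
# Illustrative sketch only, not the authors' implementation: enumerate the
# words licensed by a dynamic viseme sequence. All data here is made up.
from itertools import product

# Hypothetical many-to-many mapping: each dynamic viseme can be realised
# by several phoneme subsequences.
VISEME_TO_PHONEMES = {
    "v1": [("p",), ("b",), ("m",)],
    "v2": [("ae", "t"), ("ae", "d")],
}

# Hypothetical pronunciation dictionary: phoneme sequence -> word.
PRON_DICT = {
    ("p", "ae", "t"): "pat",
    ("b", "ae", "t"): "bat",
    ("m", "ae", "t"): "mat",
    ("b", "ae", "d"): "bad",
    ("m", "ae", "d"): "mad",
}

def candidate_words(viseme_sequence):
    """Expand a viseme sequence into every phoneme sequence it licenses,
    then keep those that correspond to dictionary words."""
    words = []
    for choice in product(*(VISEME_TO_PHONEMES[v] for v in viseme_sequence)):
        phonemes = tuple(p for chunk in choice for p in chunk)
        if phonemes in PRON_DICT:
            words.append(PRON_DICT[phonemes])
    return words

print(candidate_words(["v1", "v2"]))  # ['pat', 'bat', 'bad', 'mat', 'mad']
```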

    Perceptual biases and positive schizotypy: The role of perceptual load

    The study investigated the effects of perceptual load on the bias to report seeing non-existing events—a bias associated with positive symptoms of schizophrenia and positive schizotypal symptoms. Undergraduate students completed psychometric measures of schizotypy and were asked to detect fast-moving words among non-words under different levels of perceptual load. Perceptual load was manipulated through stimulus motion. Overall, the results showed that the higher the perceptual load, the stronger the bias to report seeing words in non-word trials. However, the observed bias was associated with positive schizotypy (Unusual Experiences) only when visual detection was performed under conditions of medium perceptual load. No schizotypy measure was associated with accuracy. The results suggest that, although some amount of perceptual ambiguity seems to be necessary for schizotypal bias generation, an increase in perceptual load can inhibit this process, possibly by preventing perception of task-irrelevant internal events, such as loose word associations.

    Lexical access in the processing of word boundary ambiguity

    Language ambiguity results from, among other things, the vagueness of the syntactic structure of phrases and whole sentences. Numerous types of syntactic ambiguity are associated with the placement of the phrase boundary. A special case of the segmentation problem is the phenomenon of word boundary ambiguity: in spoken natural language, words coalesce, making it possible to interpret them in different ways (e.g., a name vs. an aim). The purpose of the study was to verify whether both meanings of words with boundary ambiguities are activated, or whether it is a case of semantic context priming. The study was carried out using the cross-modal semantic priming paradigm. Sentences containing phrases with word boundary ambiguities were presented auditorily to the participants; immediately afterwards, they performed a visual lexical decision task. Results indicate that both meanings of the ambiguity are automatically activated, independently of the semantic context. In discussing the results, I refer to the autonomous and interactive models of parsing and point to other possible areas of research concerning word boundary ambiguities.

    Feature fusion at the local region using localized maximum-margin learning for scene categorization

    In the field of visual recognition, such as scene categorization, representing an image based on local features (e.g., the bag-of-visual-words (BOVW) model and the bag-of-contextual-visual-words (BOCVW) model) has become one of the most popular and successful approaches. In this paper, we propose a method that uses localized maximum-margin learning to fuse different types of features during BOCVW modeling for eventual scene classification. The proposed method fuses multiple features at the stage when the best contextual visual word is selected to represent a local region (hard assignment) or when the probabilities of the candidate contextual visual words used to represent the unknown region are estimated (soft assignment). The merits of the proposed method are that (1) errors caused by the ambiguity of a single feature when assigning local regions to contextual visual words can be corrected, or the probabilities of the candidate contextual visual words used to represent the region can be estimated more accurately; and (2) it offers a more flexible way of fusing these features by determining the similarity metric locally through localized maximum-margin learning. The proposed method has been evaluated experimentally and the results indicate its effectiveness. © 2011 Elsevier Ltd. All rights reserved.
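    As a rough illustration of where the fusion happens, the sketch below assigns a single local region to a contextual visual word by combining distances from two hypothetical feature types (a SIFT-like descriptor and a color descriptor) with per-codeword weights. The codebooks, the fusion weights, and the softmax-style soft assignment are all assumptions standing in for what the paper learns with localized maximum-margin learning, not the paper's algorithm.

```python
# Illustrative sketch: locally weighted fusion of two feature types when
# assigning a local region to a contextual visual word.
import numpy as np

rng = np.random.default_rng(0)
n_words = 50          # size of the contextual visual vocabulary (assumed)
d_sift, d_color = 128, 64

# Hypothetical codebooks, one per feature type (rows are codewords).
codebook_sift = rng.normal(size=(n_words, d_sift))
codebook_color = rng.normal(size=(n_words, d_color))

# Hypothetical per-codeword fusion weights; in the paper these would come
# from the localized maximum-margin learning step.
fusion_weights = rng.uniform(0.3, 0.7, size=(n_words, 2))
fusion_weights /= fusion_weights.sum(axis=1, keepdims=True)

def assign_region(f_sift, f_color, soft=False):
    """Hard or soft assignment of one local region to the vocabulary,
    using a locally weighted combination of per-feature distances."""
    d1 = np.linalg.norm(codebook_sift - f_sift, axis=1)
    d2 = np.linalg.norm(codebook_color - f_color, axis=1)
    fused = fusion_weights[:, 0] * d1 + fusion_weights[:, 1] * d2
    if not soft:
        return int(np.argmin(fused))   # index of the best visual word
    p = np.exp(-fused)                 # soft assignment over all words
    return p / p.sum()

word_id = assign_region(rng.normal(size=d_sift), rng.normal(size=d_color))
```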

    Revisiting lexical ambiguity effects in visual word recognition

    The aim of this work is to examine how lexically ambiguous words are represented in the mental lexicon of speakers. The existence of words with multiple meanings/senses (e.g., credenza, mora, etc. in Italian) is a pervasive feature of natural language. Speakers of almost all languages routinely encounter ambiguous words, whose correct interpretation depends on the linguistic context in which these forms appear... [edited by author]

    Lexical Ambiguity in Nouns: Frequency Dominance and Declensional Classes

    The existence of differences in lexical processing between ambiguous and unambiguous words is still controversial. Many factors seem to play a role in determining different ambiguity effects in word recognition, such as ambiguity type, experimental paradigm, frequency dominance, etc. The aim of this study is to investigate the role played by frequency dominance and declensional class in recognizing Italian homonymous nouns, namely, forms with multiple unrelated meanings. We report the results of two visual lexical decision experiments in which these factors are manipulated. An ambiguity disadvantage effect is found for words belonging to two different declensional classes (Exp. 2, e.g., conte), while an absence of processing differences is reported for ambiguous words within the same declensional class (Exp. 1, e.g., credenza). Moreover, an interaction between condition and frequency is found: the inhibitory effects are stronger for ambiguous nouns with two frequency-balanced meanings than for ambiguous nouns with a strongly dominant meaning. The results are compatible with the idea that several factors should be taken into account in order to disentangle competing accounts of lexical ambiguity processing. We discuss these results in terms of how variables such as frequency dominance and declensional class affect the activation of lexical representations and play a role in determining different ambiguity effects in lexical access.

    Lip-Listening: Mixing Senses to Understand Lips using Cross Modality Knowledge Distillation for Word-Based Models

    In this work, we propose a technique to transfer speech recognition capabilities from audio speech recognition systems to visual speech recognizers, where our goal is to utilize audio data during lipreading model training. Impressive progress in the domain of speech recognition has been exhibited by audio and audio-visual systems. Nevertheless, there is still much to be explored in visual speech recognition due to the visual ambiguity of some phonemes, and the development of visual speech recognition models is crucial given the instability of audio models. The main contributions of this work are: i) building on recent state-of-the-art word-based lipreading models by integrating sequence-level and frame-level Knowledge Distillation (KD) into their systems; ii) leveraging audio data during the training of visual models, which has not been done in prior word-based work; iii) proposing Gaussian-shaped averaging in frame-level KD as an efficient technique that aids the model in distilling knowledge at the sequence-model encoder. This work proposes a novel and competitive architecture for lipreading, as we demonstrate a noticeable improvement in performance, setting a new benchmark of 88.64% on the LRW dataset.
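    The sketch below illustrates the general idea behind frame-level distillation with a Gaussian-shaped average, under assumptions of my own (frame-aligned (T, D) encoder outputs, an L2 distillation loss, and arbitrary values for sigma and the window radius): each visual-encoder frame is matched against a Gaussian-weighted average of the surrounding audio-encoder frames. It is not the authors' implementation.

```python
# Hedged sketch of Gaussian-shaped averaging for frame-level knowledge
# distillation; shapes, alignment, and hyperparameters are assumptions.
import numpy as np

def gaussian_frame_kd_loss(student, teacher, sigma=1.0, radius=2):
    """student, teacher: (T, D) encoder outputs, assumed frame-aligned.
    Returns the mean L2 distance between each student frame and a
    Gaussian-weighted average of nearby teacher frames."""
    T, _ = student.shape
    offsets = np.arange(-radius, radius + 1)
    weights = np.exp(-(offsets ** 2) / (2 * sigma ** 2))
    weights /= weights.sum()

    loss = 0.0
    for t in range(T):
        # Gaussian-weighted average of teacher frames around position t,
        # clipped at the sequence boundaries.
        idx = np.clip(t + offsets, 0, T - 1)
        target = (weights[:, None] * teacher[idx]).sum(axis=0)
        loss += np.sum((student[t] - target) ** 2)
    return loss / T

rng = np.random.default_rng(0)
visual = rng.normal(size=(29, 512))   # hypothetical visual encoder output
audio = rng.normal(size=(29, 512))    # hypothetical audio encoder output
print(gaussian_frame_kd_loss(visual, audio))
```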