918 research outputs found

    The Impact of Modality Expectancy on Memory Accuracy for Brand Names

    It is proposed that an individual’s expectations regarding the modality by which to-be-remembered brand names will be communicated in the future can impact memory accuracy for those brand names. Specifically, we hypothesize that the likelihood of malapropistic errors (i.e., false recognition of phonetically similar brand names) increases with greater attention to phonemic codes relative to orthographic codes. Attention to these memorial representations is driven by expectations as to whether retrieval will be written or spoken. When visually presented with brand names, those expecting text-based retrieval pay relatively greater attention to the visual forms or orthographies of brand names, as this information is necessary for successful written reproduction. Individuals expecting to orally communicate brand names discount the orthographic information at encoding in favor of internally generated phonetic representations (i.e., the way the brand names sound when spoken), as the spellings of the brand names are immaterial for successful spoken reproduction. The formats by which stimuli are presented (i.e., sequentially vs. simultaneously) are shown to interact in predictable ways with modality expectancies, such that mere exposure effects are maximized only when presentation format is optimal for a specific expected modality—sequentially when spoken recall is expected and simultaneously when written recall is expected. These conditions generate relatively high-quality memorial representations, which result in relative metacognitive ease of retrieval on subsequent recognition tasks. Consequently, downstream variables including purchase likelihood and willingness-to-pay for products featuring those brand names can be impacted via this perceptual fluency

    Towards multi-domain speech understanding with flexible and dynamic vocabulary

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2001. Includes bibliographical references (p. 201-208). In developing telephone-based conversational systems, we foresee future systems capable of supporting multiple domains and flexible vocabulary. Users can pursue several topics of interest within a single telephone call, and the system is able to switch transparently among domains within a single dialog. This system is able to detect the presence of any out-of-vocabulary (OOV) words, and automatically hypothesizes each of their pronunciation, spelling and meaning. These can be confirmed with the user and the new words are subsequently incorporated into the recognizer lexicon for future use. This thesis will describe our work towards realizing such a vision, using a multi-stage architecture. Our work is focused on organizing the application of linguistic constraints in order to accommodate multiple domain topics and dynamic vocabulary at the spoken input. The philosophy is to exclusively apply below word-level linguistic knowledge at the initial stage. Such knowledge is domain-independent and general to all of the English language. Hence, this is broad enough to support any unknown words that may appear at the input, as well as input from several topic domains. At the same time, the initial pass narrows the search space for the next stage, where domain-specific knowledge that resides at the word-level or above is applied. In the second stage, we envision several parallel recognizers, each with higher order language models tailored specifically to its domain. A final decision algorithm selects a final hypothesis from the set of parallel recognizers. Part of our contribution is the development of a novel first stage which attempts to maximize linguistic constraints, using only below word-level information. 
The goals are to prevent sequences of unknown words from being pruned away prematurely while maintaining performance on in-vocabulary items, as well as reducing the search space for later stages. Our solution coordinates the application of various subword level knowledge sources. The recognizer lexicon is implemented with an inventory of linguistically motivated units called morphs, which are syllables augmented with spelling and word position. This first stage is designed to output a phonetic network so that we are not committed to the initial hypotheses. This adds robustness, as later stages can propose words directly from phones. To maximize performance on the first stage, much of our focus has centered on the integration of a set of hierarchical sublexical models into this first pass. To do this, we utilize the ANGIE framework which supports a trainable context-free grammar, and is designed to acquire subword-level and phonological information statistically. Its models can generalize knowledge about word structure, learned from in-vocabulary data, to previously unseen words. We explore methods for collapsing the ANGIE models into a finite-state transducer (FST) representation which enables these complex models to be efficiently integrated into recognition. The ANGIE-FST needs to encapsulate the hierarchical knowledge of ANGIE and replicate ANGIE's ability to support previously unobserved phonetic sequences ... by Grace Chung. Ph.D.

    Developing attribute acquisition strategies in spoken dialogue systems via user simulation

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006. Includes bibliographical references (p. 159-169). A spoken dialogue system (SDS) is an application that supports conversational interaction with a human to perform some task. SDSs are emerging as an intuitive and efficient means for accessing information. A critical barrier to their widespread deployment remains in the form of communication breakdown at strategic points in the dialogue, often when the user tries to supply a named entity from a large or open vocabulary set. For example, a weather system might know several thousand cities, but there is no easy way to inform the user about what those cities are. The system will likely misrecognize any unknown city as some known city. The inability of a system to acquire an unknown value can lead to unpredictable behavior by the system, as well as by the user. This thesis presents a framework for developing attribute acquisition strategies with a simulated user. We specifically focus on the acquisition of unknown city names in a flight domain, through a spell-mode subdialogue. Collecting data from real users is costly in both time and resources. In addition, our goal is to focus on situations that tend to occur sporadically in real dialogues, depending on the domain and the user's experience in that domain. Therefore, we chose to employ user simulation, which would allow us to generate a large number of dialogues, and to configure the input as desired in order to exercise specific strategies. We present a novel method of utterance generation for user simulation, that exploits an existing corpus of real user dialogues, but recombines the utterances using an example-based, template approach. Items of interest not in the corpus, such as foreign or unknown cities, can be included by splicing in synthesized speech. 
This method allows us to produce realistic utterances by retaining the structural variety of real user utterances, while introducing cities that can only be resolved via spelling. We also developed a model of generic dialogue management, allowing a developer to quickly specify interaction properties on a per-attribute basis. This model was used to assess the effectiveness of various combinations of dialogue strategies and simulated user behavior. Current approaches to user simulation typically model simulated utterances at the intention level, assuming perfect recognition and understanding. We employ speech to develop our strategies in the context of errors that occur naturally from recognition and understanding. We use simulation to address two problems: the conflict problem requires the system to choose how to act when a new hypothesis for an attribute conflicts with its current belief, while the compliance problem requires the system to decide whether a user was compliant with a spelling request. Decision models were learned from simulated data, and were tested with real users, showing that the learned model significantly outperformed a heuristic model in choosing the "ideal" response to the conflict problem, with accuracies of 84.1% and 52.1%, respectively. The learned model to predict compliance achieved a respectable 96.3% accuracy. These results suggest that such models learned from simulated data can attain similar, if not better, performance in dialogues with real users. by Edward A. Filisko. Ph.D.

    Effects of Word Type, Orthographic Type, and Word Length on Decoding and Spelling Abilities of Fourth Graders with and without Reading Impairments

    The effects of word type (real, nonsense), orthographic type (phonetic, nonphonetic), and word length (1 to 5 syllables) on the decoding and spelling abilities (accuracy) of fourth-graders with and without reading impairments were investigated. This study was unique because the 23 participants were in one grade level (fourth grade), which controlled for age and reading experience. The participants, who varied in their single word decoding abilities, were separated into two reading groups, an average reading group and a primary reading impairment group, based on their performance on the Woodcock Reading Mastery Test-III (WRMT-III) Word Identification and Word Attack subtests. All 23 participants completed the three experimental tasks: a single word decoding task, a spelling decision task, and a written spelling task. The same stimuli, a total of 100 stimulus words, 50 real words and 50 nonsense words (word type), categorized by two orthographic types (25 phonetic, 25 nonphonetic), and five words for each of five lengths (1-5 syllables) were used in each experimental task. Word length had a significant effect on all three experimental tasks: 1) the single word decoding task, 2) the spelling decision task, and 3) the written spelling accuracy for both reading groups. Results included relationships between decoding accuracy, spelling decision accuracy, and written spelling accuracy for the two reading groups as a function of word type, orthographic type, and word length. The decoding accuracy and spelling accuracy performance for the participants in the present study were characterized by a linear decrease in accuracy with an increase in word length. For the experimental tasks, the strongest correlations were found between the decoding and spelling accuracy for phonetic words regardless of word type (real words, nonsense words). 
Decoding accuracy results included a significant main effect of group, characterized by higher decoding accuracy by the average reading group for both word types compared to the reading impairment group. For decoding accuracy, there was a significant three-way interaction for word type, orthographic type, and word length. Post hoc comparisons included higher decoding accuracy for shorter words (< 3 syllables) regardless of word type and orthographic type. Written spelling accuracy results included two significant three-way interactions for Reading Group x Word Type x Word Length and Word Type x Orthographic Type x Word Length. The average reading group accurately decoded and spelled more of the shorter words (< 3 syllables) than longer words (4 and 5 syllables) compared to the reading impairment group. Word type effects included more real words decoded and spelled accurately compared to nonsense words. Orthographic type effects included more proficient decoding and spelling of shorter real phonetic words (< 3 syllables) than real nonphonetic, nonsense phonetic and nonsense nonphonetic words, compared to words containing 4 and 5 syllables. This study provided more detailed decoding and spelling information than current standardized assessment tools, characterized by reading group differences for word type, orthographic type, and word length. There is a need for an assessment tool that assesses both decoding and spelling accuracy and provides detailed error analysis using the same lexical/word stimuli categorized by word type, orthographic type, and word length for children with suspected reading impairment. Decoding and spelling accuracy measures are vital for the provision of detailed differential diagnoses and subtyping of reading impairments and spelling deficits. These detailed decoding and spelling data will also provide information critical for the provision of client-specific intervention. Ph.D.

    Linguistically-motivated sub-word modeling with applications to speech recognition

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. Includes bibliographical references (p. 173-185). Despite the proliferation of speech-enabled applications and devices, speech-driven human-machine interaction still faces several challenges. One of these issues is the new word or the out-of-vocabulary (OOV) problem, which occurs when the underlying automatic speech recognizer (ASR) encounters a word it does not "know". With ASR being deployed in constantly evolving domains such as restaurant ratings, or music querying, as well as on handheld devices, the new word problem continues to arise. This thesis is concerned with the OOV problem, and in particular with the process of modeling and learning the lexical properties of an OOV word through a linguistically-motivated sub-syllabic model. The linguistic model is designed using a context-free grammar which describes the sub-syllabic structure of English words, and encapsulates phonotactic and phonological constraints. The context-free grammar is supported by a probability model, which captures the statistics of the parses generated by the grammar and encodes spatio-temporal context. The two main outcomes of the grammar design are: (1) sub-word units, which encode pronunciation information, and can be viewed as clusters of phonemes; and (2) a high-quality alignment between graphemic and sub-word units, which results in hybrid entities denoted as spellnemes. The spellneme units are used in the design of a statistical bi-directional letter-to-sound (L2S) model, which plays a significant role in automatically learning the spelling and pronunciation of a new word. The sub-word units and the L2S model are assessed on the task of automatic lexicon generation. In a first set of experiments, knowledge of the spelling of the lexicon is assumed. 
It is shown that the phonemic pronunciations associated with the lexicon can be successfully learned using the L2S model as well as a sub-word recognizer. In a second set of experiments, the assumption of perfect spelling knowledge is relaxed, and an iterative and unsupervised algorithm, denoted as Turbo-style, makes use of spoken instances of both spellings and words to learn the lexical entries in a dictionary. Sub-word speech recognition is also embedded in a parallel fashion as a backoff mechanism for a word recognizer. The resulting hybrid model is evaluated in a lexical access application, whereby a word recognizer first attempts to recognize an isolated word. Upon failure of the word recognizer, the sub-word recognizer is manually triggered. Preliminary results show that such a hybrid set-up outperforms a large-vocabulary recognizer. Finally, the sub-word units are embedded in a flat hybrid OOV model for continuous ASR. The hybrid ASR is deployed as a front-end to a song retrieval application, which is queried via spoken lyrics. Vocabulary compression and open-ended query recognition are achieved by designing a hybrid ASR. The performance of the front-end recognition system is reported in terms of sentence, word, and sub-word error rates. The hybrid ASR is shown to outperform a word-only system over a range of out-of-vocabulary rates (1%-50%). The retrieval performance is thoroughly assessed as a function of ASR N-best size, language model order, and the index size. Moreover, it is shown that the sub-words outperform alternative linguistically-motivated sub-lexical units such as phonemes. Finally, it is observed that a dramatic vocabulary compression - by more than a factor of 10 - is accompanied by a minor loss in song retrieval performance. by Ghinwa F. Choueiter. Ph.D.

    Scientific Research, Writing, and Dissemination (Part 3/4): Scientific Writing

    This is the-third-paper, in-tetrology on the Scientific Research, Writing and Dissemination. Writing is a-universal-type of formal-scientific-communication, and yet, academics/researchers/scientists have a-rather dreadful-reputation, for being un-interesting, monotonous and, even, pathetically ‘dry’ writers. One-reason, behind-that, could-be, that majority of scientists are not, really, trained-writers. Moreover, pressure to-publish, poorly-prepared-manuscripts, and multiple-rejections, by-various-journals, dampen the-spirits of untrained-academic-writers, resulting in their-reduced-productivity. Scientific-style-writing may be ‘thorny’, in the-beginning, for ‘greenhorn’-writers, but clear-communication and concise-writing, can-be-trained. The-main-objective of this-paper is to-offer early-stage-researchers (beginner researchers and scientific writing-apprentices) easy-applicable, yet, theoretically-insightful-introduction, to-the structural-components of-a-scientific-paper and basic-writing-guidelines. The-seasoned-writers will-also find few-interesting revelations and ‘food-for-taught’. This-paper focuses on-scientific-writing (mainly for peer-reviewed publication) and largely presumes no explicit-disciplinary perspective, however, some-emphasis on-engineering-research, is given. The-main-instruments applied in this-study were: a-survey and a document-analysis. The-respondents identified, that almost-every-section of a-scientific-paper, is challenging, for them, although to a-different-extent. Majority (64%) indicated that they have-experienced rejections, in-their-publishing-endeavors, while the-rest said, that all-their-submissions, for-review, were successful. Out of those, experienced rejection, 57% stated, that, they usually re-submit, their-manuscript, to a-different-journal, after improving or correcting it, while 43 % preferred to-do nothing, after the rejection. 
55% also confessed that they: (1) are not very-confident, in their-ability, to-write (for scientific- publication) in-English, and (2) do-not-know exactly what constitutes a-good-research-paper and fine scientific-writing. 36% stated that they are not so-sure about the-proper-structure of a-scientific-paper. The-study also-revealed some-signs of Dunning-Kruger Effect, in-writing, particularly, among-younger faculty. To-address the-findings of the-research, and to-give a-multifaceted-perspective, on-the scientific-writing, the-paper, in-addition, presents a-fusion of guiding-principles, found in-literature, and supplemented by the-author’ input, about structuring and writing a-scientific-paper. In-particular, the-following was elaborated on: Misconceptions about scientific-writing; Expanded ‘Hourglass-Model’, based on the-IMRaD-format; Micro-issues of writing (grammar and punctuations); How to-deal with-rejection of a-manuscript; English as de facto language of scientific-communication; Characteristics of good-scientific-paper and writing-style; and Establishing one’s unique-voice, in-scientific-writing, among-others. The-study is important; in making a-contribution (in-its-small-way) to-the-body of knowledge, on-the-subject-matter, and it-is-potentially-beneficial, to-scientific-writers, at any-stage, of their-research and scientific-writing- career. Keywords: scholarly article, paper structure; journal publications, English, rejection, hyphen

    Familiarity effects in visual word recognition

    An Evaluation Study of the Effects of the PIRK Reading Program on Reading and Learning Development in Learning-Disabled Students

    An emerging view on learning disabilities is that failure in learning to read in the early grades results in continuing failure in school, along with cognitive and social/emotional dysfunctions. Educational leaders have called for reading programs that are maximally effective and minimally time-consuming, and are suited to the needs of our particular students. Leaders in the field of special education stressed a need for the prevention of failure. This evaluation study examined how the PIRK reading program components fit with the current literature on teaching reading and language arts to all children and to LD children. This study investigated the effects of the PIRK reading program on LD students in the early grades and elicited teachers' perceptions of the effects of PIRK on LD students' academic-related classroom behaviors. The subjects were 14 LD students using PIRK and 13 LD controls not using PIRK in resource rooms in Texas. Scores on tests of reading, spelling, writing, and listening were compared. The results of the data analysis indicated that students using the primary PIRK outperformed the controls in word knowledge and the intermediate group using the revised PIRK outperformed the controls in listening comprehension. A resource teacher reported that LD students' test scores in reading rose rapidly when they began using the new primary PIRK. Students were able to achieve 90% to 100% accuracy in decoding, in both the primary and the revised PIRK, and 75% to 85% accuracy in the upper revised PIRK levels. Teachers also reported that PIRK had positive effects on student academic-related classroom behaviors. The current literature on theory, research, and practice supported an approach like PIRK and the PIRK components as effective for teaching phonics, decoding, word knowledge, and beginning reading to children and specifically to LD children. 
Information from this evaluation study indicated a need to reduce the difficulty in the upper levels of the revised PIRK. There is a need for more word meanings in sentences, in stories, and in nursery rhymes, along with comprehension strategies beginning in third grade.