279 research outputs found


    Traditional banking services suffer from inconvenience, latency and low quality of service; they are unsuitable for people with disabilities and require extra manpower to perform simple banking activities. The goal of this project is to build a voice-recognition-based system that focuses on banking activities and uses voice as the medium for carrying them out over a telephony network. The study addressed three fundamental objectives. The first was to develop a two-way interactive banking program that uses voice as the main mechanism for receiving instructions from, and responding to, the user. The second, which supports the first, was to develop a user-friendly, high-security voice banking system that requires the user to log on by supplying the assigned customer identification number and personal identification number before proceeding to further actions. The application therefore needs a strong database structure that maintains the integrity of the stored data and responds to authorized users only. The third objective was to determine the most suitable programming approach for implementation on a telephony network; to that end, the study examines an architecture in which voice is accepted, manipulated and generated using a combination of two technologies, ColdFusion and VoiceXML. The system's functions were validated and are in demand among users, because it offers convenient, easy service using only voice to transmit instructions. This strategy is therefore expected to attract a large number of customers and, in turn, to generate substantial profit for the bank that adopts the system. It is hoped that the system will serve as a platform that later developers can host and that a large number of customers can use simultaneously and efficiently.
Keywords: voice-based, telephony, combination of programming, architecture

    Transcoding multilingual and non-standard web content to voiceXML

    Includes abstract. Includes bibliographical references (leaves 112-119). Transcoding systems redesign and reformat existing web interfaces into other formats, for example audio, sign language or another medium, so that they become available to other audiences. The benefit of such systems is that less work is needed to meet the needs of different audiences. This thesis describes the design and implementation details of a transcoding system called Dinaco. Dinaco is targeted at converting HTML web pages that are created using Extensible Markup Language (XML) technologies to speech interfaces. The differentiating feature of Dinaco is that it uses separated annotations during its transcoding process, while previous transcoding systems use HTML-dependent annotations. These separated annotations enable Dinaco to pre-normalize non-standard words and to generate VoiceXML interfaces that carry the semantics of the content. The semantics help Text-to-Speech (TTS) tools to read multilingual text and to do text normalization. The results from experiments indicate that pre-normalizing non-standard words and appending semantics enable Dinaco to generate VoiceXML interfaces that are more usable than those generated by transcoding systems that use HTML-dependent annotations. The thesis uses the design of Dinaco to demonstrate how separating annotations makes it possible to write descriptions of content that cannot be written using external HTML-dependent annotations, and how separating annotations makes it easy to write, maintain, re-use and share annotations.

    Building and Designing Expressive Speech Synthesis

    We know there is something special about speech. Our voices are not just a means of communicating. They also give a deep impression of who we are and what we might know. They can betray our upbringing, our emotional state, our state of health. They can be used to persuade and convince, to calm and to excite. As speech systems enter the social domain they are required to interact, support and mediate our social relationships with 1) each other, 2) with digital information, and, increasingly, 3) with AI-based algorithms and processes. Socially Interactive Agents (SIAs) are at the forefront of research and innovation in this area. There is an assumption that in the future “spoken language will provide a natural conversational interface between human beings and so-called intelligent systems.” [Moore 2017, p. 283]. A considerable amount of previous research work has tested this assumption with mixed results. However, as pointed out “voice interfaces have become notorious for fostering frustration and failure” [Nass and Brave 2005, p.6]. It is within this context, between our exceptional and intelligent human use of speech to communicate and interact with other humans, and our desire to leverage this means of communication for artificial systems, that the technology, often termed expressive speech synthesis uncomfortably falls. Uncomfortably, because it is often overshadowed by issues in interactivity and the underlying intelligence of the system which is something that emerges from the interaction of many of the components in a SIA. This is especially true of what we might term conversational speech, where decoupling how things are spoken, from when and to whom they are spoken, can seem an impossible task. This is an even greater challenge in evaluation and in characterising full systems which have made use of expressive speech. 
Furthermore when designing an interaction with a SIA, we must not only consider how SIAs should speak but how much, and whether they should even speak at all. These considerations cannot be ignored. Any speech synthesis that is used in the context of an artificial agent will have a perceived accent, a vocal style, an underlying emotion and an intonational model. Dimensions like accent and personality (cross speaker parameters) as well as vocal style, emotion and intonation during an interaction (within-speaker parameters) need to be built in the design of a synthetic voice. Even a default or neutral voice has to consider these same expressive speech synthesis components. Such design parameters have a strong influence on how effectively a system will interact, how it is perceived and its assumed ability to perform a task or function. To ignore these is to blindly accept a set of design decisions that ignores the complex effect speech has on the user’s successful interaction with a system. Thus expressive speech synthesis is a key design component in SIAs. This chapter explores the world of expressive speech synthesis, aiming to act as a starting point for those interested in the design, building and evaluation of such artificial speech. The debates and literature within this topic are vast and are fundamentally multidisciplinary in focus, covering a wide range of disciplines such as linguistics, pragmatics, psychology, speech and language technology, robotics and human-computer interaction (HCI), to name a few. It is not our aim to synthesise these areas but to give a scaffold and a starting point for the reader by exploring the critical dimensions and decisions they may need to consider when choosing to use expressive speech. To do this, the chapter explores the building of expressive synthesis, highlighting key decisions and parameters as well as emphasising future challenges in expressive speech research and development. 
Yet, before these are expanded upon we must first try and define what we actually mean by expressive speech



    Applied and Computational Linguistics

    The current state of applied and computational linguistics is examined, and the linguistic theories of the 20th and early 21st centuries are analysed with a view to distinguishing the various aspects of language for the purpose of their formalized description in electronic linguistic resources. A critical review is offered of such topical problems of applied (computational) linguistics as the compilation of computer lexicons and electronic text corpora, automatic natural language processing, automatic speech synthesis and recognition, machine translation, and the creation of intelligent robots capable of perceiving information in natural language. Intended for students and postgraduate students in the humanities and for academic and teaching staff of higher education institutions of Ukraine.

    Survey of Gaelic Corpus Technology

    No abstract available

    Analysing and Preventing Self-Issued Voice Commands


    Products and Services

    Today’s global economy offers more opportunities, but is also more complex and competitive than ever before. This fact leads to a wide range of research activity in different fields of interest, especially in the so-called high-tech sectors. This book is a result of widespread research and development activity from many researchers worldwide, covering the aspects of development activities in general, as well as various aspects of the practical application of knowledge

    Fast Speech in Unit Selection Speech Synthesis

    Moers-Prinz D. Fast Speech in Unit Selection Speech Synthesis. Bielefeld: Universität Bielefeld; 2020. Speech synthesis is part of the everyday life of many people with severe visual disabilities. For those who are reliant on assistive speech technology the possibility to choose a fast speaking rate is reported to be essential. But also expressive speech synthesis and other spoken language interfaces may require an integration of fast speech. Architectures like formant or diphone synthesis are able to produce synthetic speech at fast speech rates, but the generated speech does not sound very natural. Unit selection synthesis systems, however, are capable of delivering more natural output. Nevertheless, fast speech has not been adequately implemented into such systems to date. Thus, the goal of the work presented here was to determine an optimal strategy for modeling fast speech in unit selection speech synthesis to provide potential users with a more natural sounding alternative for fast speech output.