    Assistive Technologies

    A special place among assistive technologies is occupied by speech recognition and speech synthesis, which can be used by many different users, including persons with visual, language, or mobility disabilities. Software developers have been concerned with speech recognition and text-to-speech for many years, because we are witnessing great changes in informatics, and accessibility is the main condition in the creation of assistive software applications. Keywords: speech recognition, text-to-speech, assistive technologies, accessibility
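    As an illustration of the kind of text-to-speech output such assistive applications expose, here is a minimal sketch using the pyttsx3 Python library; the library choice, function name, and speaking rate are assumptions for illustration and are not taken from the abstract above.

    # Minimal text-to-speech sketch (illustrative only, not the software described above).
    import pyttsx3

    def speak(text, rate=150):
        """Read the given text aloud through the system's default TTS voice."""
        engine = pyttsx3.init()
        engine.setProperty("rate", rate)   # speaking rate in words per minute
        engine.say(text)
        engine.runAndWait()                # block until playback finishes

    if __name__ == "__main__":
        speak("Welcome. This document will now be read aloud.")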

    Boosting Speech-to-Text software potential

    The article focuses on finding ways of boosting the efficiency and accuracy of Speech-to-Text (STT)-powered input. The effort is triggered by the growing popularity of the software among professional translators, which is in line with the general trend of abandoning typing in favor of speech-to-text applications.
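    For context, a minimal speech-to-text dictation sketch is shown below using the SpeechRecognition Python library; the audio file name, language code, and use of the Google Web Speech backend are assumptions for illustration, not details from the article.

    # Illustrative speech-to-text sketch; not the STT software evaluated in the article.
    import speech_recognition as sr

    def transcribe(wav_path, language="en-US"):
        recognizer = sr.Recognizer()
        with sr.AudioFile(wav_path) as source:
            audio = recognizer.record(source)   # read the whole file
        # Uses the Google Web Speech API; raises sr.UnknownValueError if nothing is recognized.
        return recognizer.recognize_google(audio, language=language)

    if __name__ == "__main__":
        print(transcribe("dictation.wav"))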

    End-to-end speech translation system with attention-based mechanisms

    Speech Recognition and Text-to-Text Translation systems have improved significantly in recent decades thanks to advances in both hardware and software. However, Speech Translation is usually done as a concatenation of speech recognition and Machine Translation technologies. In this project, an end-to-end Speech Translation system is built using recent deep learning algorithms.
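    The core of such an end-to-end system is an attention mechanism that lets the decoder focus on relevant encoded speech frames. Below is a schematic dot-product attention step in PyTorch; tensor shapes, names, and the attention variant are illustrative assumptions, not the architecture used in the project.

    # Schematic dot-product attention for an encoder-decoder speech translation model.
    import torch
    import torch.nn.functional as F

    def dot_product_attention(decoder_state, encoder_states):
        """decoder_state: (batch, dim); encoder_states: (batch, time, dim)."""
        scores = torch.bmm(encoder_states, decoder_state.unsqueeze(2))   # (batch, time, 1)
        weights = F.softmax(scores.squeeze(2), dim=1)                    # attention weights over time
        context = torch.bmm(weights.unsqueeze(1), encoder_states)        # (batch, 1, dim)
        return context.squeeze(1), weights

    if __name__ == "__main__":
        enc = torch.randn(2, 50, 256)   # e.g. 50 encoded speech frames per utterance
        dec = torch.randn(2, 256)       # current decoder hidden state
        context, weights = dot_product_attention(dec, enc)
        print(context.shape, weights.shape)   # (2, 256) and (2, 50)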

    Real-Time Statistical Speech Translation

    This research investigates Statistical Machine Translation approaches to translating speech in real time automatically. Such systems can be used in a pipeline with speech recognition and synthesis software in order to produce a real-time voice communication system between foreigners. We obtained three main data sets from spoken proceedings that represent three different types of human speech. The TED, Europarl, and OPUS parallel text corpora were used as the basis for training language models and for developmental tuning and testing of the translation system. We also conducted experiments involving part-of-speech tagging, compound splitting, linear language model interpolation, truecasing, and morphosyntactic analysis. We evaluated the effects of a variety of data preparations on the translation results using the BLEU, NIST, METEOR, and TER metrics and tried to answer which metric is most suitable for the Polish-English (PL-EN) language pair. Comment: machine translation, Polish-English
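    As a small example of the evaluation step, corpus-level BLEU can be computed with the sacrebleu Python library as sketched below; the example sentences are invented and are not drawn from the TED, Europarl, or OPUS data.

    # Hedged sketch of corpus-level BLEU scoring with sacrebleu.
    import sacrebleu

    hypotheses = ["the cat sits on the mat", "he reads a book"]
    references = ["the cat is sitting on the mat", "he is reading a book"]

    bleu = sacrebleu.corpus_bleu(hypotheses, [references])
    print(f"BLEU = {bleu.score:.2f}")   # NIST, METEOR and TER require their own tooling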

    Designs that talk and listen: Integrating functional information using voice-enabled CAD systems

    In this paper, we report a novel approach to voice-enabled design synthesis and management. This approach consists of integrating “text-to-speech” and “speech-to-text” tools within a software prototype developed in MATLAB that connects seamlessly to computer-aided design (CAD) software via an application programming interface. Users can train the software to recognize specific words spoken by them and subsequently query the functions of individual parts or subassemblies. The results indicate the feasibility of interactive voice-enabled product lifecycle management (PLM) tools in the future, where designing on the fly can be increasingly supported.
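    A toy sketch of the query step is given below: a recognized voice query is matched against stored functional information for parts. The part names, functions, and lookup logic are invented for illustration and do not reflect the MATLAB prototype or its CAD interface.

    # Illustrative mapping from a recognized voice query to stored part functions.
    PART_FUNCTIONS = {
        "impeller": "Transfers rotational energy from the shaft to the fluid.",
        "housing": "Encloses the impeller and guides the flow to the outlet.",
    }

    def answer_query(recognized_text):
        """Return the stored function of the first known part mentioned in the query."""
        for part, function in PART_FUNCTIONS.items():
            if part in recognized_text.lower():
                return f"{part}: {function}"
        return "No functional information found for that query."

    if __name__ == "__main__":
        print(answer_query("What is the function of the impeller?"))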

    The effects of word prediction and text-to-speech on the writing process of translating

    The purpose of this study was to determine the effects of the combination of word prediction and text-to-speech software on the writing process of translating. Participants for this study included 10 elementary and middle school students who had a diagnosis of disorder of written expression. A modified multiple case series was used to collect data over a three-week period. The participants were asked to describe in writing what was happening in a picture-based writing prompt. The participants responded in writing using word processor software alone in the pretest condition and using a word processor in conjunction with the word prediction and text-to-speech software WordQ in the posttest condition. The results provide support for the use of the word prediction and text-to-speech features in WordQ with students who have a diagnosis of disorder of written expression. When using the WordQ software, the participants’ written work showed significantly fewer spelling errors, increased syntactic maturity as measured by mean T-unit length, and fewer overall words produced.
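    For readers unfamiliar with word prediction, a minimal frequency-based sketch is shown below; it is purely illustrative and does not describe WordQ's actual prediction algorithm.

    # Minimal word-prediction sketch: rank completions of a typed prefix by corpus frequency.
    from collections import Counter

    def build_model(corpus):
        return Counter(corpus.lower().split())

    def predict(prefix, model, k=3):
        candidates = [(w, c) for w, c in model.items() if w.startswith(prefix.lower())]
        return [w for w, _ in sorted(candidates, key=lambda wc: -wc[1])[:k]]

    if __name__ == "__main__":
        model = build_model("the picture shows a dog and the dog plays in the park")
        print(predict("th", model))   # ['the']
        print(predict("p", model))    # e.g. ['picture', 'plays', 'park']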

    IMPLEMENTASI PERANGKAT LUNAK SERVER TEXT TO SPEECH BAHASA INDONESIA DENGAN UNIT UCAPAN DIPHONE (Software Implementation of an Indonesian Text-to-Speech Server with Diphone Pronunciation Unit)

    Text-to-speech is a Voice Processing System (VPS) application that converts input in the form of letters and written characters (text) into output in the form of a speech signal. Until now, text-to-speech applications have mostly been used as desktop applications on a Personal Computer (PC). However, along with the development of information technology, more and more computer-network-based systems require a text-to-speech application within them. In this final project, software has been built that implements an Indonesian text-to-speech server, with the diphone as its pronunciation unit. Text input is provided by a client application connected to the server application through a computer network. The input is processed by the server application into an output speech file, which is sent back to the client application that submitted it. Keywords: text to speech, Voice Processing System (VPS), diphone, server, client
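    A minimal sketch of the client-server idea is shown below: a client sends text over TCP and the server returns a synthesized WAV file. The address, port, and use of the pyttsx3 synthesizer are assumptions; the actual system uses an Indonesian diphone synthesizer.

    # Hedged sketch of a TCP text-to-speech server; pyttsx3 stands in for the diphone engine.
    import socket
    import pyttsx3

    HOST, PORT = "127.0.0.1", 5050   # assumed address and port, not from the paper

    def synthesize(text, wav_path="reply.wav"):
        engine = pyttsx3.init()
        engine.save_to_file(text, wav_path)   # write synthesized speech to a WAV file
        engine.runAndWait()
        return wav_path

    def serve_once():
        with socket.create_server((HOST, PORT)) as server:
            conn, _ = server.accept()
            with conn:
                text = conn.recv(4096).decode("utf-8")
                with open(synthesize(text), "rb") as f:
                    conn.sendall(f.read())   # return the audio file to the client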

    Challenges in creating speech recognition for endangered language CALL: A Chickasaw case study

    Speech recognition technology is increasingly becoming an important component of Computer Assisted Language Learning (CALL) software, as well as of a language’s digital vitality. CALL software that integrates speech recognition allows learners to practice oral skills without live instruction and receive feedback on pronunciation. This speech recognition technology may be particularly beneficial for endangered or under-resourced languages. Chickasaw is an indigenous language of North America now spoken mainly in the state of Oklahoma. It is estimated that there are fewer than 75 native speakers of the language remaining, though recent years have seen a surge of interest in Chickasaw culture and language revitalization. In 2007, the Chickasaw Nation launched a robust and multifaceted revitalization program, and in 2015 they commissioned CALL software that integrates speech recognition. However, creating a quality automatic speech recognition (ASR) system necessitates a number of resources that are not always readily available for endangered languages like Chickasaw. Modern speech recognition technology is based on large-scale statistical modeling of target language text and hand transcribed audio corpora. Such technology also assumes a single standardized phonetic orthography where speech can be directly mapped to text. Currently, most available resources for building speech recognition technology are based on languages where researchers are able to access a large pool of literate native speakers who are willing and able to record many hours of high quality audio, and where large volumes of accessible text already exist. For many endangered languages, these criteria cannot easily be fulfilled. This paper is focused on identifying the dimensions of resource challenges that affect building corpora for such languages, using Chickasaw as a case study. Furthermore, we identify techniques that we have used to create a corpus of speech data suitable for building an instructional speech recognition module for use in CALL software
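    One concrete corpus-building step is pairing recorded audio with its transcriptions into a training manifest. The sketch below shows one possible layout; the directory structure, file naming, and JSON fields are assumptions, not the Chickasaw project's actual data format.

    # Illustrative manifest builder pairing WAV recordings with transcript files.
    import json
    from pathlib import Path

    def build_manifest(audio_dir, transcript_dir, out_path):
        entries = []
        for wav in sorted(Path(audio_dir).glob("*.wav")):
            txt = Path(transcript_dir) / (wav.stem + ".txt")
            if txt.exists():   # keep only utterances that have a transcription
                entries.append({"audio": str(wav), "text": txt.read_text().strip()})
        with open(out_path, "w", encoding="utf-8") as f:
            for entry in entries:
                f.write(json.dumps(entry, ensure_ascii=False) + "\n")
        return len(entries)   # number of usable audio-transcript pairs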

    Development of a Yoruba Text-to-Speech System Using Festival

    This paper presents a Text-to-Speech (TTS) synthesis system for the Yorúbà language using the open-source Festival TTS engine. Yorúbà, being a resource-scarce language like most African languages, presents a major challenge to conventional speech synthesis approaches, which typically require large corpora for training such systems. Speech data were recorded in a quiet environment with a noise-cancelling microphone on a typical multimedia computer system using the Speech Filing System (SFS) software, then analysed and annotated using the PRAAT speech processing software. Evaluation of the system was done using intelligibility and naturalness metrics through mean opinion scores. The results show that the intelligibility and naturalness of the system at word level are 55.56% and 50% respectively, but the system performs poorly on both the intelligibility and naturalness tests at sentence level. Hence, there is a need for further research to improve the quality of the synthesized speech. Keywords: Text-to-Speech, Festival, Yorúbà, Syllable
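    Festival can be driven from a script through its text2wave command-line tool, as sketched below; the wrapper function, the example sentence, and the optional voice expression are assumptions, since the paper's Yoruba voice is not publicly distributed.

    # Hedged sketch of calling Festival's text2wave tool to synthesize a WAV file.
    import subprocess
    import tempfile

    def festival_tts(text, wav_path, voice=None):
        with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
            f.write(text)
            txt_path = f.name
        cmd = ["text2wave", txt_path, "-o", wav_path]
        if voice:
            cmd += ["-eval", f"({voice})"]   # e.g. "voice_kal_diphone"
        subprocess.run(cmd, check=True)

    if __name__ == "__main__":
        festival_tts("Welcome to the speech synthesizer.", "greeting.wav")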