Search CORE

2,313 research outputs found

Neologisms in Modern English: study of word-formation processes

Author: Gontšarova Julia
Publication venue: Tartu Ülikooli Narva Kolledž
Publication date: 01/01/2013

http://tartu.ester.ee/record=b2654513~S1*es

DSpace at Tartu University Library

Multilingual Text to Speech in embedded systems using RC8660

Author: McMeekin David
Murray Iain
Nazemi Azadeh
Publication venue: 'Research India Publications'
Publication date: 01/01/2014

Most multilingual Text to Speech (TTS) systems are software applications that allow people with visual impairments or reading disabilities to listen to written material using a computer. This paper describes an approach to building a multilingual TTS and embedding it in a portable, low-cost, standalone embedded system for accessing and reading electronic documents, particularly in developing countries. Several TTS engines, such as Doubletalk, DECtalk, and Dolphin, are available on the market, as are products that use TTS, such as Talking OCR, Bill Reader and Intel Reader, but these are not affordable or multilingual. In this design, the OMAP3530, an application processor board, serves as the hardware platform that processes the language-independent parts of the application, and the RC8660 is used as an integrated TTS processor

espace@Curtin

Use of Weighted Finite State Transducers in Part of Speech Tagging

Author: Radev Dragomir R.
Tzoukermann Evelyne
Publication date: 01/01/1997

This paper addresses issues in part of speech disambiguation using finite-state transducers and presents two main contributions to the field. One of them is the use of finite-state machines for part of speech tagging. Linguistic and statistical information is represented in terms of weights on transitions in weighted finite-state transducers. Another contribution is the successful combination of techniques -- linguistic and statistical -- for word disambiguation, compounded with the notion of word classes. Comment: uses psfig, ipamac

arXiv.org e-Print Archive

CiteSeerX

On the Fourth Edition of A Dictionary of South African English

Author: Gold David L.
Publication venue: 'African Journals Online (AJOL)'
Publication date: 01/11/2016

Having reviewed the first and third editions of A Dictionary of South African English in earlier publications, the author examines the fourth edition. He suggests a number of improvements with respect to several aspects of the dictionary, ranging from superficial, though important, matters (like layout and typography) to the most difficult aspects of lexicography (definition and etymology). Keywords: abbreviation, acronym, afrikaans, american english, black english, british english, consultant vs. researcher, contrastive linguistics, corpus-delimitation, dictionary, dutch, english, entry head, etymology, graphic illustration, hebrew, jewish english, judezmo, letterword, lexicography, new netherland dutch, northeastern yiddish, prescriptivism, south african english, western yiddish, yiddis

AJOL - African Journals Online

Brand names of Portuguese medication: understanding the importance of their linguistic structure and regulatory issues

Author: Afonso Cavaco
Aguiar J
Aronson JK
Balota DA
Blatt CR
Bruning MC
Carla Pires
Castelo A
Cavaco A
Cooper N
Ferrand L
Frota S
Gonçalves C
Handler SM
Hyönä J
Justi FR
Kenagy JW
Lambert BL
Marian V
Marina Vigário
Martins F
Mateus MH
Mezzomo CL
Nespor M
New B
Nobre PF
Pereira TA
Rohrmeier M
Soto-Faracoa S
Trancoso I
Ulrich B
Viana FL
Vigário M
Yap MJ
Publication venue: 'FapUNIFESP (SciELO)'

Crossref

The Production of Speech Corpora

Author: Baumann Angela
Draxler Christoph
Ellbogen Tania
Schiel Florian
Steffen Alexander
Publication date: 21/03/2012

Open Access LMU

Modelo acústico de língua inglesa falada por portugueses

Author: Simões Carla Alexandra Coelho
Publication date: 01/01/2007

Master's project work in Informatics Engineering, presented to the Universidade de Lisboa through the Faculdade de Ciências, 2007. In the context of robust speech recognition based on Hidden Markov Models (HMMs), this work describes several methodologies and experiments aimed at recognizing foreign speakers. Speaking of speech recognition necessarily means speaking of acoustic models as well. Acoustic models reflect the way we pronounce and articulate a language, modelling the sequence of sounds emitted during speech. This modelling rests on minimal speech segments, the phones, for which there is a set of symbols (alphabets) that represent their pronunciation. The representation of these symbols, and their articulation and pronunciation, is studied in the field of articulatory and acoustic phonetics. We can describe words by analysing the units that make them up, the phones. A speech recognizer interprets the input signal, speech, as a sequence of coded symbols. To do so, the signal is split into observations of roughly 10 milliseconds each, thereby reducing the unit of analysis to the time interval over which the characteristics of a sound segment do not vary. Acoustic models give us an idea of the probability that a given observation corresponds to a given entity. It is therefore through models of the entities of the vocabulary to be recognized that these sound fragments can be put back together. The models developed in this work are based on HMMs. They are so called because they build on the chains of Markov (1856 - 1922): sequences of states in which each state is conditioned on its predecessor. Placing this approach in our domain, a set of models must be built, one for each class of sounds to be recognized, which are then trained on training data.
The data are audio files and their transcriptions (at word level), so that each transcription can be decomposed into phones and aligned with each sound of the corresponding audio file. Using a state model, in which each state represents an observation or described speech segment, the data are progressively regrouped to create increasingly reliable statistical models that represent the speech entities of a given language. Recognizing foreign speakers whose pronunciation differs from that of the language for which the recognizer was designed can be a major problem for recognizer accuracy. This variation can be even more problematic than dialectal variation within a language, because it depends on each speaker's knowledge of the foreign language. Using a small amount of audio from foreign speakers to train new acoustic models, several experiments were carried out with corpora of Portuguese speakers speaking English, of European Portuguese, and of English. Initially, the behaviour of the native English and native Portuguese models was explored separately when tested on the test corpora (a test with native speakers and a test with non-native speakers). Next, another model was trained using, simultaneously as training corpus, the audio of Portuguese speakers speaking English and that of native English speakers. A further experiment considered adaptation techniques such as MLLR (Maximum Likelihood Linear Regression). This technique allows a given speaker characteristic, in this case the foreign accent, to be adapted to a given initial model. With a small amount of data representing the characteristic to be modelled, it computes a set of transformations that are applied to the model being adapted.
The field of phonetic modelling was also explored, studying how the foreign speaker pronounces the foreign language, in this case a Portuguese speaker speaking English. This study was carried out with the help of a linguist, who defined a set of phones, the result of mapping the English phone inventory onto the Portuguese one, representing English as spoken by Portuguese speakers of a particular prestige group. Given the great variability of pronunciations, this group had to be defined taking the speakers' literacy level into account. The study was later used to create a new model trained on the corpora of Portuguese speakers speaking English and of native Portuguese speakers. In this way we obtain a native Portuguese recognizer in which English terms can be recognized. Within the topic of speech recognition, this project also addressed the collection of corpora for European Portuguese and the compilation of a European Portuguese lexicon. In corpus acquisition, the author was involved in extracting and preparing telephone speech data for subsequent training of new European Portuguese acoustic models. The lexicon was compiled with a semi-automatic incremental method. This method consisted of automatically generating pronunciations for groups of 10 thousand words, with each group reviewed and corrected by a linguist. Each reviewed group was then used to improve the rules for automatic pronunciation generation.
The tremendous growth of technology has increased the need to integrate spoken language technologies into our daily applications, providing easy and natural access to information. These applications are of different natures, with different user interfaces.
Besides voice-enabled Internet portals or tourist information systems, automatic speech recognition systems can be used in home settings, where TVs and other appliances could be voice controlled, dispensing with keyboard or mouse interfaces, or in mobile phones and palm-sized computers for hands-free and eyes-free operation. The development of these systems raises several known difficulties. One of them concerns recognizer accuracy when dealing with non-native speakers whose pronunciation of a given language differs phonetically. A non-native accent can be more problematic than dialectal variation within the language. This mismatch depends on the individual’s speaking proficiency and mother tongue. Consequently, when the speaker’s native language is not the one used to train the recognizer, there is a considerable loss in recognition performance. In this thesis, we examine the problem of non-native speech in a speaker-independent, large-vocabulary recognizer in which a small amount of non-native data was used for training. Several experiments were performed using Hidden Markov Models trained on speech corpora containing European Portuguese native speakers, English native speakers, and English spoken by European Portuguese native speakers. Initially, the behaviour of a native English model and a non-native English speakers’ model was explored. Then, using different corpus weights for the native English speakers and the English spoken by Portuguese speakers, a model was trained as a pool of accents. As an adaptation technique, the Maximum Likelihood Linear Regression method was used. We also explored how European Portuguese speakers pronounce English by studying the correspondences between the phone sets of the foreign and target languages. The result was a new phone set, derived from the mapping between the English and Portuguese phone sets.
Then a new model was trained on data of English spoken by Portuguese speakers and on native Portuguese data. Within the subject of speech recognition, this work has two other purposes: collecting Portuguese corpora and supporting the compilation of a Portuguese lexicon, adopting methods and algorithms to generate phonetic pronunciations automatically. The collected corpora were processed in order to train acoustic models to be used in the Exchange 2007 domain, namely in Outlook Voice Access

Universidade de Lisboa: Repositório.UL

English as a Lingua Franca: Its use in the Field of Medicine

Author: CARRARO MARTINA
Publication date: 15/12/2022

The aim of this dissertation is to demonstrate how increasing contact between different communities has led more and more people to use English as a bridge language, enabling intelligible communication across boundaries. As a result of this phenomenon, English now serves as a Lingua Franca (ELF) in many countries around the world, playing a dominant role. ELF is also the main means of communication in various fields such as technology, economics, international politics and science. One focus of this thesis is the use of English as a Lingua Franca in medical science (MELF). English dominates almost all medical journals and medical registers, where its use is considered essential to avoid being regarded as less prestigious. Equally, it is widely used in oral communication at congresses, meetings or conferences as the language of international experts, and also between professionals or doctors in practice. The relationship between ELF and MELF is analysed from a linguistic point of view, in communication in healthcare settings, drawing on the opinions of students of Medicine in English at the University of Padua

Padua Thesis and Dissertation Archive

Speech synthesis : Developing a web application implementing speech technology

Author: Gebremariam Gudeta
Publication venue: Metropolia Ammattikorkeakoulu
Publication date: 01/01/2016

Speech is a natural medium of communication for humans. Text-to-speech (TTS) technology uses a computer to synthesize speech. There are three main techniques of TTS synthesis: formant-based, articulatory and concatenative. The application areas of TTS include accessibility, education, entertainment and communication aids in mass transit. A web application was developed to demonstrate the application of speech synthesis technology. Existing speech synthesis engines for the Finnish language were compared, and two open source text-to-speech engines, Festival and Espeak, were selected for use with the web application. The application uses a Linux-based speech server which communicates with client devices via HTTP GET requests. The application development successfully demonstrated the use of speech synthesis in language learning. One of the emerging sectors for speech technologies is the mobile market, owing to the limited input capabilities of mobile devices. Speech technologies are not equally available in all languages. Text in the Oromo language was tested using Finnish speech synthesizers; because the two orthographies follow similar rules for consonant gemination and vowel length, intelligible results were obtained

Theseus

7 kingdoms of the Litvaks

Author: Katz Dovid
Publication date: 17/11/2009

Hochschulschriftenserver - Universität Frankfurt am Main