Porting the galaxy system to Mandarin Chinese
Thesis (M.S.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1997. Includes bibliographical references (leaves 83-86). By Chao Wang.
Missed opportunities for negotiating cultural and personal meaning in language classroom : an ethnographic study of Chinese language classes.
There are hidden difficulties in teaching a foreign language in a classroom context that have not been examined. Using ethnographic research methods of participant observation, field notes, audio-taping of classroom conversational exchange, and interviews with participants of the interactions, the hidden issues were identified through data analysis focusing on the discourse between teachers and students of Chinese language. While many classroom interaction studies focus on teaching methods or content that should be taught, this research study examines language classroom interactions from a sociocultural perspective. It provides a description of the cultural and social factors that influence the communicative process in classroom interactions. The underlying assumption guiding this study is that effective foreign language teaching and learning is a communicative process that involves more than simply instruction about the formal features of language and cultural knowledge. The purpose of this process is to develop the individual learner's communicative competence. This competence includes not only language competence and cultural competence but also the openness and readiness of the mind and the flexibility of cognition to function in cross-cultural contexts. The study reveals that a central cause of language classroom miscommunication is the difficulty participants have in creating contextual coherence and meaning. This problem is the direct result of the participants' simplified assumptions of cultural and social stereotypes. The stereotyping of individual and power relationships in the classroom hinders the learning process and can lead to underdeveloped perspectives of cultural images and social roles of individuals. With stereotyped cultural images and the narrowly defined social roles of participants in the classroom, the teaching and learning process limits opportunities to actively develop the learners' communicative competence. The practice of teaching and learning thus may reinforce inflexibility in communicative negotiation and in dealing with the cultural, social, and individual diversities in the cross-cultural interactions outside the classroom. Therefore, cross-cultural openness--the awareness of sociocultural and individual diversity in cross-cultural interactions--is significant in language teaching and learning. The significance of cross-cultural openness is that it not only influences the process of language teaching and learning, but also the content of language teaching and learning.
Final FLaReNet deliverable: Language Resources for the Future - The Future of Language Resources
Language Technologies (LT), together with their backbone, Language Resources (LR), provide essential support for the challenge of Multilingualism and ICT of the future. The main task of language technologies is to bridge language barriers and to help create a new environment where information flows smoothly across frontiers and languages, whatever the country and language of origin. To achieve this goal, all players involved need to act as a community able to join forces on a set of shared priorities. However, the field of Language Resources and Technology has long suffered from an excess of individuality and fragmentation, with a lack of coherence concerning the priorities for the field and the direction in which to move, not to mention a common timeframe. The context encountered by the FLaReNet project was thus that of an active field needing a coherence that can only be given by sharing common priorities and endeavours. FLaReNet has contributed to the creation of this coherence by gathering a wide community of experts and involving them in the definition of an exhaustive set of recommendations.
Within China's orbit? China through the eyes of the Australian parliament
As the People’s Republic of China continues to develop as the subject of intense economic, political and cultural interest, this study examines the place ‘China’ has held in the parliamentary imagination by exploring the history of the Australian parliament’s dealings with China. The monograph’s period of historical focus is broad: it begins with an analysis of Federation debates over immigration restriction and concludes with a detailed assessment of the bilateral relationship during the 41st Parliament (November 2004–November 2007).
While the monograph provides extensive coverage of the changing nature of Australia–China relations, it does not attempt a full narrative history of the period with which it is concerned; rather, it offers an analysis of a series of foundational moments in the development of the relationship. Such a methodological approach enables the research to document the profound transformation that has taken place in Australian parliamentary attitudes towards China.
Automatic speech recognition for European Portuguese
Master's dissertation in Informatics Engineering.
Automatic Speech Recognition (ASR) opens the door to a vast range of possible improvements in customer experience. The use of this technology has increased significantly in recent years as a result of the recent evolution of ASR systems. The opportunities to use ASR are vast, covering areas such as medicine, industry, and business, among others. Of particular note is the use of these voice recognition systems in telecommunications companies, namely in automating consumer assistance operators: by recognizing the spoken utterances and detecting the matter to be dealt with, the service can be routed automatically to specialized operators. In recent years, ASR has seen major technological breakthroughs, achieving unprecedented accuracy that is comparable to that of humans. There is also a move from what is known as the traditional approach to ASR, based on Hidden Markov Models (HMMs), to newer End-to-End ASR systems that benefit from deep neural networks (DNNs), large amounts of data, and process parallelization.
The literature review showed that previous work has focused almost exclusively on English and Chinese, with little effort devoted to other languages such as Portuguese. In the research carried out, we did not find a model for the European Portuguese (EP) dialect that is freely available for general use. To address this problem, this work describes the development of an End-to-End ASR system for EP. To achieve this goal, we followed a set of procedures that allowed us to present the concepts, characteristics, and all the steps inherent to the construction of this type of system. Furthermore, since the transcribed speech needed to accomplish our goal is very limited for EP, we also describe the process of collecting and formatting data from a variety of sources, most of them freely available to the public. To further improve our results, a variety of data augmentation techniques were implemented and tested. The resulting models are based on a PyTorch implementation of the Deep Speech 2 model.
Our best model achieved a Word Error Rate (WER) of 40.5% on our main test corpus, slightly better than the results obtained by commercial systems on the same data. Around 150 hours of transcribed EP speech were collected, which can be used to train other ASR systems or models in different areas of investigation. We gathered a series of interesting results on the use of different batch sizes, as well as on the improvements provided by a large variety of data augmentation techniques. Nevertheless, the ASR theme is vast and there is still a variety of different methods and interesting concepts that we could
research in order to improve on the achieved results.
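The abstract above reports its headline result as a Word Error Rate (WER) but does not spell out how the metric is computed. As a minimal sketch, assuming the standard definition (word-level substitutions, deletions, and insertions divided by the number of reference words) and simple whitespace tokenization, it could be computed as follows. The function name `word_error_rate` and the sample sentences are illustrative only and are not taken from the dissertation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumes plain whitespace tokenization; real evaluations usually also
    normalize case and punctuation before scoring.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical example: one of five reference words is dropped.
    print(word_error_rate("o gato dorme no sofá", "o gato dorme sofá"))
```

Because insertions count as errors while the denominator is the reference length, WER can exceed 100% when the hypothesis is much longer than the reference.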