6,544 research outputs found

    Perceptual optimization of unit-selection text-to-speech synthesis systems by means of active interactive genetic algorithms

    Get PDF
    The tuning process of Unit Selection TTS (US-TTS) system is usually performed by an expert that typically conducts the task of weighting the cost function by hand. However, hand tuning is costly in terms of the required training time and inaccurate and ambiguous in terms of methodology. With the purpose of easing the task of properly tuning the weights of the cost function, this thesis make its contribution from a perceptual-based approach using of active interactive Genetic Algorithms (aiGAs). The thesis pursues four major guidelines: i) accuracy when tuning the weights, ii) robustness of the obtained weights, iii)real world applicability of the methodology to any cost function design, and iv)finding consensus of the different users when tuning the weights. The experimentation is carried out through a small and medium sized corpus (1.9h) applied to different configurations (type of features) of the US-TTS cost function. The thesis concludes that aiGAs are highly competitive in comparison to other weight tuning techniques from the state-of-the-artPeer ReviewedPostprint (published version

    Advances in Robotics, Automation and Control

    Get PDF
    The book presents an excellent overview of the recent developments in the different areas of Robotics, Automation and Control. Through its 24 chapters, this book presents topics related to control and robot design; it also introduces new mathematical tools and techniques devoted to improve the system modeling and control. An important point is the use of rational agents and heuristic techniques to cope with the computational complexity required for controlling complex systems. Through this book, we also find navigation and vision algorithms, automatic handwritten comprehension and speech recognition systems that will be included in the next generation of productive systems developed by man

    Automated Testing of Speech-to-Speech Machine Translation in Telecom Networks

    Get PDF
    Globalisoituvassa maailmassa kyky kommunikoida kielimuurien yli käy yhä tärkeämmäksi. Kielten opiskelu on työlästä ja siksi halutaan kehittää automaattisia konekäännösjärjestelmiä. Ericsson on kehittänyt prototyypin nimeltä Real-Time Interpretation System (RTIS), joka toimii mobiiliverkossa ja kääntää matkailuun liittyviä fraaseja puhemuodossa kahden kielen välillä. Nykyisten konekäännösjärjestelmien suorituskyky on suhteellisen huono ja siksi testauksella on suuri merkitys järjestelmien suunnittelussa. Testauksen tarkoituksena on varmistaa, että järjestelmä säilyttää käännösekvivalenssin sekä puhekäännösjärjestelmän tapauksessa myös riittävän puheenlaadun. Luotettavimmin testaus voidaan suorittaa ihmisten antamiin arviointeihin perustuen, mutta tällaisen testauksen kustannukset ovat suuria ja tulokset subjektiivisia. Tässä työssä suunniteltiin ja analysoitiin automatisoitu testiympäristö Real-Time Interpretation System -käännösprototyypille. Tavoitteina oli tutkia, voidaanko testaus suorittaa automatisoidusti ja pystytäänkö todellinen, käyttäjän havaitsema käännösten laatu mittaamaan automatisoidun testauksen keinoin. Tulokset osoittavat että mobiiliverkoissa puheenlaadun testaukseen käytetyt menetelmät eivät ole optimaalisesti sovellettavissa konekäännösten testaukseen. Nykytuntemuksen mukaan ihmisten suorittama arviointi on ainoa luotettava tapa mitata käännösekvivalenssia ja puheen ymmärrettävyyttä. Konekäännösten testauksen automatisointi vaatii lisää tutkimusta, jota ennen subjektiivinen arviointi tulisi säilyttää ensisijaisena testausmenetelmänä RTIS-testauksessa.In the globalizing world, the ability to communicate over language barriers is increasingly important. Learning languages is laborious, which is why there is a strong desire to develop automatic machine translation applications. Ericsson has developed a speech-to-speech translation prototype called the Real-Time Interpretation System (RTIS). The service runs in a mobile network and translates travel phrases between two languages in speech format. The state-of-the-art machine translation systems suffer from a relatively poor performance and therefore evaluation plays a big role in machine translation development. The purpose of evaluation is to ensure the system preserves the translational equivalence, and in case of a speech-to-speech system, the speech quality. The evaluation is most reliably done by human judges. However, human-conducted evaluation is costly and subjective. In this thesis, a test environment for Ericsson Real-Time Interpretation System prototype is designed and analyzed. The goals are to investigate if the RTIS verification can be conducted automatically, and if the test environment can truthfully measure the end-to-end performance of the system. The results conclude that methods used in end-to-end speech quality verification in mobile networks can not be optimally adapted for machine translation evaluation. With current knowledge, human-conducted evaluation is the only method that can truthfully measure translational equivalence and the speech intelligibility. Automating machine translation evaluation needs further research, until which human-conducted evaluation should remain the preferred method in RTIS verification

    Mobility and Aging: Older Drivers’ Visual Searching, Lane Keeping and Coordination

    Get PDF
    This thesis examined older drivers’ mobility and behaviour through comprehensive measurements of driver-vehicle-environment interaction and investigated the associations between driving behaviour and cognitive functions. Data were collected and analysed for 50 older drivers using eye tracking, GNSS tracking, and GIS. Results showed that poor selective attention, spatial ability and executive function in older drivers adversely affect lane keeping, visual search and coordination. Visual-motor coordination measure is sensitive and effective for driving assessment in older drivers

    Change blindness: eradication of gestalt strategies

    Get PDF
    Arrays of eight, texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task where there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation (Landman et al, 2003 Vision Research 43149–164]. Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this we changed the spatial position of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference seen in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This may suggest two things: (i) Gestalt grouping is not used as a strategy in these tasks, and (ii) it gives further weight to the argument that objects may be stored and retrieved from a pre-attentional store during this task
    corecore