Vysoké učení technické v Brně. Fakulta informačních technologií
Abstract
Most current automatic speaker recognition systems extract speaker-dependent features from short-term spectral information, an approach that ignores long-term information. I explored an approach that uses the fundamental frequency and energy trajectories of each speaker and models prosody dynamics over single phonemes or syllables. It is known from the literature that prosodic systems do not perform as well as acoustic ones, but that they improve the overall system when the two are fused. I verified this assumption by fusing my results with the state-of-the-art acoustic system from BUT. Data from standard evaluation campaigns organized by the National Institute of Standards and Technology (NIST) are used for all experiments.
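The fusion step mentioned in the abstract can be illustrated with a minimal sketch of score-level fusion. The abstract does not state which fusion method was used, so the weighted sum below is only an assumption chosen for simplicity; the function name `fuse_scores` and the parameter `weight` are hypothetical and do not come from the thesis.

```python
def fuse_scores(acoustic, prosodic, weight=0.8):
    """Combine per-trial scores from two systems with a weighted sum.

    Assumption: simple linear fusion, not necessarily the method used in
    the thesis. `weight` is the share given to the acoustic system, and
    the prosodic system receives the remaining (1 - weight).
    """
    if len(acoustic) != len(prosodic):
        raise ValueError("both systems must score the same trials")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    # One fused score per trial, in the same order as the inputs.
    return [weight * a + (1.0 - weight) * p for a, p in zip(acoustic, prosodic)]


if __name__ == "__main__":
    # Illustrative, made-up scores for two verification trials.
    print(fuse_scores([1.0, -1.0], [0.0, 2.0], weight=0.5))
```

In practice the weight would be tuned on held-out development trials, and the two score streams would normally be brought to a comparable scale before being combined.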