2 research outputs found
Expectation-Maximization for Speech Source Separation using Convolutive Transfer Function
This paper addresses the problem of under-determined speech source separation from multichannel microphone signals, i.e. convolutive mixtures of multiple sources. The time-domain signals are first transformed to the short-time Fourier transform (STFT) domain. To represent the room filters in the STFT domain, we propose to use the convolutive transfer function (CTF), a more accurate model than the widely used narrowband assumption. In each frequency band, the CTF coefficients of the mixing filters and the STFT coefficients of the sources are jointly estimated by maximizing the likelihood of the microphone signals, which is solved with an Expectation-Maximization (EM) algorithm. Experiments show that the proposed method provides very satisfactory performance in highly reverberant environments.
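The difference between the narrowband assumption and the CTF model mentioned in this abstract can be illustrated with a small sketch. It covers only the observation model for one source, one microphone, and one frequency band; it is not the paper's EM algorithm. The function names `narrowband_mix` and `ctf_mix` and the example values are hypothetical, chosen here for illustration.

```python
def narrowband_mix(source_stft, gain):
    """Narrowband assumption: each STFT frame is scaled by one complex gain."""
    return [gain * s for s in source_stft]


def ctf_mix(source_stft, ctf_taps):
    """CTF model: the source STFT frames of one frequency band are convolved
    with the CTF taps along the frame (time) axis.

    source_stft: list of complex STFT coefficients, one per frame
    ctf_taps:    list of complex CTF coefficients (hypothetical values)
    """
    mixed = []
    for p in range(len(source_stft)):
        acc = 0j
        for q, tap in enumerate(ctf_taps):
            if p - q >= 0:  # frames before the start of the signal are treated as zero
                acc += tap * source_stft[p - q]
        mixed.append(acc)
    return mixed


# Illustrative frames for a single frequency band (assumed values).
frames = [1 + 0j, 0j, 0j, 2j]
print(narrowband_mix(frames, 0.5))
print(ctf_mix(frames, [0.5, 0.25]))
```

With a single tap the CTF model reduces to the narrowband model. Additional taps let energy from earlier frames leak into later ones, which is how the model can represent reverberation longer than one STFT window.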
Research on Probabilistic Objective Functions for Deep-Learning-Based Sound Source Information Estimation
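This study concerns the estimation of "sound source information", that is, information related to sound such as the source signal and the type and state of a sound source, from acoustic signals observed with microphones. As subjects of sound source information estimation, it focuses on "sound source enhancement", which estimates a source signal from an observed signal in which the source signal and noise are superimposed, and "anomalous sound detection", which estimates the type and state of environmental sounds contained in the observed signal in order to predict or detect danger in the surroundings. If source enhancement can take latent source information such as the type and state of a source into account, the voice of a particular player or the sound of a ball being kicked can be estimated in a soccer stadium filled with loud cheering, making it possible to offer users a way of viewing content as if they had slipped into the stadium themselves. If anomalous sound detection is realized, whether a machine is operating normally or abnormally (its state) can be estimated from its operating sound, which makes manufacturing and maintenance work more efficient.
Approaches based on statistical machine learning have been studied as methods for estimating sound source information. In recent years, applying deep learning to this task has greatly improved estimation accuracy. In deep-learning-based sound source information estimation, a neural network is used as a nonlinear mapping function from the observed signal to the desired source information. The neural network is obtained by maximizing or minimizing the value of an "objective function" that evaluates the estimation accuracy of the source information. In most deep learning, deterministic objective functions such as the squared error or the cross-entropy are used.
In sound source information estimation, designing the objective function is equivalent to defining the properties of the desired source information and its estimation accuracy. For some kinds of source information, a deterministic objective function cannot define these properties or the accuracy, or it is not appropriate to define them that way. For example, a deterministic objective function cannot be adopted for estimating a source signal that maximizes human subjective assessments of sound quality, or for estimating the state of a source for which anomalous sounds (label data) cannot be collected. Solving this problem requires advancing not only the network architecture but also the objective function used to train the neural network.
To estimate source information for which the objective function cannot be designed as a deterministic function, this study investigates objective functions for deep-learning-based sound source information estimation. The approach starts from the idea of defining the properties and estimation accuracy of the desired source information as a probability distribution or a set of values that the inputs and outputs should take, according to the characteristics of the source information to be estimated and the problem to be solved, and of describing the statistical properties that the inputs and outputs of the neural network should satisfy as the objective function.
Chapter 3 proposes a method for enhancing source signals for which sufficient label data do not exist, such as the sounds of sports play. To train a neural network with a small amount of training data, acoustic features designed or selected in advance must be extracted from the observed signal, and source enhancement must be performed with a small-scale neural network. Chapter 3 examines a method for selecting acoustic features suitable for enhancing the desired source based on mutual information maximization. For feature selection with high-dimensional feature candidates, the chapter considers applying "kernel dimension reduction", which computes mutual information accurately, derives a differentiable objective function based on sparse regularization, and proposes an acoustic feature selection method that can select suitable features from a large number of candidates by a gradient method. Quantitative evaluation experiments show that the SDR improves over conventional acoustic feature selection methods, and subjective evaluation experiments show that selecting features with the proposed method improves the clarity of the source signal compared with conventional methods. As a result, it has become possible to estimate source signals that had been considered difficult to estimate because sufficient training data cannot be obtained, as well as source signals for which suitable acoustic features are unknown.
Chapter 4 aims to improve the subjective quality of the output of source enhancement and proposes a method for enhancing source signals for which label data cannot be uniquely determined and for which defining estimation accuracy with an objective function such as the squared error is not appropriate. Conventional deep-learning-based source enhancement takes quantities such as the amplitude spectrum of the source signal as label data and trains the neural network to minimize the squared error between its output and the label data. As a result, distortion arises in the output sound and subjective quality degrades. Instead of preparing label data, Chapter 4 therefore proposes an objective function for maximizing a sound quality score (perceptual score) that is highly correlated with subjective evaluation scores. Quantitative evaluation experiments confirm that the proposed objective function makes it possible to train a neural network so as to maximize the perceptual score. Subjective evaluation experiments show that the proposed method achieves source enhancement with higher subjective quality than source enhancement using the conventional objective function based on squared-error minimization. As a result, "higher-order" evaluation measures such as perceptual scores and human evaluations, which previously could not be used for training source enhancement, can now be used as objective functions, broadening the range of applications of neural-network-based source enhancement.
Chapter 5 aims to realize "anomalous sound detection", which detects machine failures by detecting sounds that do not normally occur (anomalous sounds), such as the abnormal rotation sound of a motor or the knocking sound of a bearing, and judging whether the machine's operating state is normal or anomalous. The difficulty of this problem is that machine failures are extremely rare, so anomalous operating sounds (label data) cannot be collected and the cross-entropy, the usual objective function of neural networks for classification, cannot be used. Chapter 5 therefore defines anomalous sounds as sounds that differ statistically from the probability distribution followed by normal sounds, which allows anomalous sound detection to be regarded as hypothesis testing, and derives the "Neyman-Pearson index" from the Neyman-Pearson lemma, the optimality criterion for hypothesis testing, as an objective function for optimizing the anomalous sound detector. In quantitative evaluation experiments, the harmonic mean improves over conventional methods, showing that the proposed method detects anomalous sounds more stably than conventional methods. Experiments in real environments show that sudden anomalous sounds of a 3D printer and a blower pump, as well as sustained anomalous sounds caused by, for example, scratches on a bearing, can be detected. As a result, state classification problems for which anomalous sound data cannot be gathered can be solved stably, and the approach can be applied to various kinds of sound source information estimation in which collecting negative-example data is difficult, such as source information estimation for security, including gunshot detection and unknown-speaker detection.
The University of Electro-Communications, 201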
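The hypothesis-testing view of anomalous sound detection described for Chapter 5 can be illustrated with a minimal sketch. It is not the thesis's objective function. It only shows the underlying Neyman-Pearson idea of bounding the false-positive rate on normal data and flagging everything above the resulting threshold. The anomaly scores, the function names, and the 10% bound are assumptions made for illustration.

```python
import math


def threshold_for_fpr(normal_scores, max_fpr):
    """Pick a threshold from scores of normal sounds only, so that the
    fraction of normal scores strictly above it is at most max_fpr."""
    if not normal_scores:
        raise ValueError("need at least one normal score")
    if not 0.0 <= max_fpr < 1.0:
        raise ValueError("max_fpr must be in [0, 1)")
    ranked = sorted(normal_scores)
    n = len(ranked)
    # Number of normal samples that may exceed the threshold.
    allowed = min(math.floor(max_fpr * n), n - 1)
    return ranked[n - allowed - 1]


def is_anomalous(score, threshold):
    """Reject the 'normal' hypothesis when the score exceeds the threshold."""
    return score > threshold


def false_positive_rate(normal_scores, threshold):
    """Empirical rate of normal samples that would be flagged as anomalous."""
    flagged = sum(1 for s in normal_scores if is_anomalous(s, threshold))
    return flagged / len(normal_scores)


# Hypothetical anomaly scores computed on normal machine sounds.
normal = [0.1, 0.2, 0.15, 0.3, 0.25, 0.05, 0.4, 0.35, 0.45, 0.9]
t = threshold_for_fpr(normal, max_fpr=0.1)
print(t, false_positive_rate(normal, t), is_anomalous(0.95, t))
```

Only normal data is needed to set the threshold, which matches the setting in which anomalous recordings cannot be collected. How the anomaly score itself is learned is outside the scope of this sketch.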