Sound generation by unsteady flow ejecting from the vibrating glottis based on a distributed parameter model of the vocal cords
Graduate School of Natural Science and Technology, Kanazawa University. The purpose of the present paper is to clarify the effects of unsteady glottal flow on phonation. We perform numerical experiments on vocal cord vibrations in order to verify the validity of the proposed model for a glottal sound source. In addition, we attempt to predict the pressure waves induced by unsteady glottal jets. Good agreement between the numerical results and the measured data on the properties of the glottal source indicates that the proposed model is a good tool for the analysis of speech production. Simulated pulsatile glottal jets show the generation of high-frequency noise in the pressure wave at the glottis and the unsteady, asymmetric motion of vortices. These vortices cause amplitude fluctuations in the pressure wave downstream near the glottis, although pressure waves far from the glottis are not greatly affected. In conclusion, the unsteady glottal flow affects only the area near the glottis and does not greatly affect speech waves radiating from the mouth. © 2007 The Acoustical Society of Japan
Effects of the false vocal folds on sound generation by an unsteady glottal jet through rigid wall model of the larynx
Graduate School of Natural Science and Technology, Kanazawa University. In the present paper, the effects of the false vocal folds (FVFs) on sound generation induced by an unsteady glottal jet through a two-dimensional rigid wall model of the larynx are investigated by conducting numerical experiments. The glottal jets are simulated by solving the basic equations for a compressible viscous fluid based on the larynx model with and without the FVFs. The presence of the FVFs increases the amplitude of noise-like pressure fluctuations both at the glottis and far away from it. Furthermore, the FVFs broaden the pressure spectrum throughout the fluid domain. These results indicate that the FVFs have a profound effect on the generation of broadband noise components in a speech wave. © 2007 The Acoustical Society of Japan
Pitch extraction and voiced/unvoiced detection of speech by cross-coupling multi-layered neural network with feedback architecture
Pitch frequency is one of the most important voice characteristics, and its accurate extraction is important not only in speech analysis and synthesis, but also in speech coding, speech recognition, speaker recognition, and the like. Existing methods of improving extraction accuracy include waveform processing, correlative processing, and spectral processing. This paper describes the use of a neural network to extract pitch from voice features delivered from the bandpass filter pairs (BPFPs) proposed by Fonda et al. Three types of multi-layered neural networks are tested, each with a recurrent structure and the ability to learn time continuity and high-accuracy discrimination functions. The cross-coupling multi-layered neural network with feedback architecture gives the best improvement over conventional neural networks, and exhibits superior ability for learning the time continuity of pitch and voiced/unvoiced (U/V) information. © 1997 Scripta Technica, Inc. Electron Comm Jpn Pt 3, 80(9): 48–58, 1997
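For orientation, the "correlative processing" family of conventional methods mentioned in the abstract above can be sketched as a normalized autocorrelation search. This is not the paper's neural network or its BPFP front end. The function name `estimate_pitch`, the search range (`fmin`, `fmax`) and the `voicing_threshold` value are assumptions chosen only for illustration.

```python
import math


def estimate_pitch(frame, sample_rate, fmin=60.0, fmax=400.0, voicing_threshold=0.3):
    """Return (f0_hz, voiced) for one frame via normalized autocorrelation.

    Illustrative baseline only; all default parameters are assumed values.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]          # remove DC offset
    energy = sum(s * s for s in x)
    if energy == 0.0:                      # silent frame: treat as unvoiced
        return 0.0, False

    lag_min = max(1, int(sample_rate / fmax))
    lag_max = min(int(sample_rate / fmin), n - 1)

    best_lag, best_r = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Autocorrelation at this lag, normalized by the frame energy.
        r = sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        if r > best_r:
            best_lag, best_r = lag, r

    # A weak correlation peak is taken as an unvoiced (U) decision.
    if best_lag == 0 or best_r < voicing_threshold:
        return 0.0, False
    return sample_rate / best_lag, True


# Usage: a 100 Hz sine sampled at 8 kHz (50 ms frame).
sine = [math.sin(2 * math.pi * 100 * i / 8000) for i in range(400)]
f0, voiced = estimate_pitch(sine, 8000)
```

A frame-by-frame estimator like this has no memory of neighbouring frames, which is the time-continuity gap that the recurrent architectures in the abstract are meant to address.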
Speech Analysis/Synthesis/Conversion by Using Sequential Processing
This paper presents a method for speech analysis/synthesis/conversion by using sequential processing. The aims of this method are to improve the quality of synthesized speech and to convert the original speech into speech with different characteristics. We apply the Kalman filter to estimate the auto-regressive coefficients of the 'time varying AR model with unknown input (ARUI model)', which we have proposed to improve the conventional AR model, and we use a band-pass filter to make 'a guide signal' for extracting the pitch period from the residual signal. These signals are utilized to make the driving source signal in speech synthesis. We also use them for speech conversion, such as in pitch and utterance length. Moreover, we show experimentally that this method can analyze/synthesize/convert speech without causing instability by using the smoothed auto-regressive coefficients.
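The sequential estimation idea in the abstract above can be sketched with a Kalman filter that tracks the coefficients of a plain time-varying AR model. This is a simplified stand-in, not the authors' ARUI model: the unknown-input term is omitted. The coefficients are assumed to follow a random walk, with the convention y[t] = a1*y[t-1] + ... + ap*y[t-p] + e[t]. The function name `kalman_ar_track` and the noise settings `q` and `r` are hypothetical.

```python
def kalman_ar_track(signal, order=2, q=1e-5, r=1e-2):
    """Track AR coefficients sample by sample with a Kalman filter.

    State: coefficient vector a (random walk with variance q per step).
    Observation: y[t] = h^T a + e[t], h = [y[t-1], ..., y[t-order]],
    with excitation variance r. Returns the estimate after each sample.
    """
    p = order
    a = [0.0] * p
    P = [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]
    history = []
    for t in range(p, len(signal)):
        h = [signal[t - 1 - k] for k in range(p)]
        # Prediction step: the random walk only inflates the covariance.
        for i in range(p):
            P[i][i] += q
        # Update step.
        Ph = [sum(P[i][j] * h[j] for j in range(p)) for i in range(p)]
        s = sum(h[i] * Ph[i] for i in range(p)) + r      # innovation variance
        gain = [v / s for v in Ph]
        err = signal[t] - sum(h[i] * a[i] for i in range(p))
        a = [a[i] + gain[i] * err for i in range(p)]
        # P <- P - gain * (P h)^T, valid because P is symmetric.
        P = [[P[i][j] - gain[i] * Ph[j] for j in range(p)] for i in range(p)]
        history.append(list(a))
    return history


# Usage: impulse response of a known AR(2) filter (a1 = 1.5, a2 = -0.8).
sig = [1.0, 1.5]
for _ in range(58):
    sig.append(1.5 * sig[-1] - 0.8 * sig[-2])
estimates = kalman_ar_track(sig, order=2, q=0.0, r=1e-2)
a1, a2 = estimates[-1]   # should approach the true coefficients
```

Because the estimate is updated one sample at a time, smoothing the coefficient trajectory afterwards, as the abstract describes, is a natural follow-on step.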
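Development of a speech recognition and display system to support dialogue with people with mild hearing loss (軽度難聴者との対話支援用音声認識・提示システムの開発)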
This research aims to build a system that supports people with mild hearing impairment, for example due to age-related hearing loss, in spoken dialogue: by displaying text alongside the speech, it reduces requests for repetition and helps the conversation proceed smoothly. Given this purpose, the system should be small and light. We initially intended to build it from a custom LSI and a microdisplay, but because general-purpose mobile PCs have become smaller, we decided to build the system on a mobile PC with a built-in microphone. The speech recognition implemented in fiscal 2006 basically worked word by word, so unregistered words could not be recognized. Although still limited to a specific speaker, the system was therefore extended to recognize arbitrary syllables (kana). In fiscal 2007, we experimented with kana-to-kanji conversion of syllable sequences by voice command. As a result, we developed a system that can present arbitrary sentences on a mobile display device without keyboard operation. To deal with syllable misrecognition, the system presents up to the second candidate of each syllable recognition result and lets the user adopt the second candidate by voice command, which reduced the annoyance caused by misrecognition. However, there is a trade-off between recognition time per syllable and the misrecognition rate, and devising a technique that shortens recognition time while keeping misrecognition low remains a challenge for practical use. Some of these results were presented at the International Workshop on Communication and Information Technologies (ISCIT07) and at the speech technical committee of the Institute of Electronics, Information and Communication Engineers.
This research aims to develop a dialog support system for persons with conditions such as age-related hearing loss, using speech recognition and character display. For this purpose, the system should be small and light. At the beginning of this research, we intended to build the system by combining a custom LSI and micro-displays. However, since mobile computers with a built-in microphone have recently come onto the market, we switched to implementing the system on a mobile computer. At the beginning, in 2006, the system was designed to recognize only pre-registered words, with the inconvenience that unregistered words could not be recognized. Therefore, we improved the system so that it recognizes all 101 Japanese syllables (kana). In this case, the dialog partner has to speak syllable by syllable, and the system displays the recognized syllables one after another. A further inconvenience is that the system can display the contents of the dialog only as a sequence of kana characters, which is difficult for the hard of hearing to read. In 2007, we improved the system in the following three points. The first is to decrease recognition errors by incorporating a new algorithm that we proposed for extracting noise-robust speech features. The second is to incorporate "kana to Chinese character conversion" into the system, the command for which is also given by speech. The third is to make the system display the second recognition candidate, so that it can be selected when the first candidate is misrecognized. This selection command is also accepted by speech. Because the system can now be operated by speech commands alone, it has become convenient to use. Further improvement is desired in recognition speed and recognition error. Parts of these results have been presented at the technical committee on speech of the Acoustical Society of Japan and the Institute of Electronics, Information and Communication Engineers, and at the International Workshop on Communication and Information Technologies (ISCIT07).
Research project/area number: 18500127. Research period (fiscal years): 2006-2007. Source: research results report for "Development of a speech recognition and display system to support dialogue with people with mild hearing loss", project number 18500127 (KAKEN: Grants-in-Aid for Scientific Research Database, National Institute of Informatics). The full-text data was prepared from the author's version of the report.
Impact of functional studies on exome sequence variant interpretation in early-onset cardiac conduction system diseases
Aims
The genetic cause of cardiac conduction system disease (CCSD) has not been fully elucidated. Whole-exome sequencing (WES) can detect various genetic variants; however, the identification of pathogenic variants remains a challenge. We aimed to identify pathogenic or likely pathogenic variants in CCSD patients by using WES and the 2015 American College of Medical Genetics and Genomics (ACMG) standards and guidelines, and to evaluate the usefulness of functional studies for determining pathogenicity.
Methods and Results
We performed WES of 23 probands diagnosed with early-onset (<65 years) CCSD and analyzed 117 genes linked to arrhythmogenic diseases or cardiomyopathies. We focused on rare variants (minor allele frequency < 0.1%) that were absent from population databases. Five probands had protein truncating variants in EMD and LMNA which were classified as “pathogenic” by the 2015 ACMG standards and guidelines. To evaluate the functional changes brought about by these variants, we generated knock-out zebrafish with CRISPR-mediated insertions or deletions of the EMD or LMNA homologs. The mean heart rate and conduction velocities in the CRISPR/Cas9-injected embryos and in F2 generation embryos with homozygous deletions were significantly decreased. Twenty-one variants of uncertain significance were identified in 11 probands. A cellular electrophysiological study and an in vivo zebrafish cardiac assay showed that 2 variants in KCNH2 and SCN5A, 4 variants in SCN10A, and 1 variant in MYH6 were damaging to the respective genes, which changed the clinical significance of these variants from “Uncertain significance” to “Likely pathogenic” in 6 probands.
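The rare-variant filter described above (a gene panel plus a minor allele frequency cutoff of 0.1%) can be sketched as follows. This is a minimal illustration, not the study's pipeline: the record layout, the `is_candidate` helper and the toy records are hypothetical, and `GENE_PANEL` lists only the six genes named in the abstract rather than the full 117-gene panel.

```python
# Subset of the panel, limited to genes named in the abstract (assumption).
GENE_PANEL = {"EMD", "LMNA", "KCNH2", "SCN5A", "SCN10A", "MYH6"}
MAF_CUTOFF = 0.001  # minor allele frequency < 0.1%


def is_candidate(variant):
    """Keep panel-gene variants that are rare or absent from population data.

    `maf` is None when the variant is not found in a population database.
    """
    maf = variant.get("maf")
    in_panel = variant["gene"] in GENE_PANEL
    is_rare = maf is None or maf < MAF_CUTOFF
    return in_panel and is_rare


# Toy records for illustration only; they are not data from the study.
variants = [
    {"id": "toy-1", "gene": "LMNA", "maf": None},
    {"id": "toy-2", "gene": "SCN5A", "maf": 0.0004},
    {"id": "toy-3", "gene": "SCN10A", "maf": 0.02},    # too common
    {"id": "toy-4", "gene": "GENE_X", "maf": None},    # outside the panel
]
candidates = [v["id"] for v in variants if is_candidate(v)]
```

A filter like this only narrows the list; classification under the ACMG criteria and the functional assays described in the abstract are separate steps.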
Conclusions
Of 23 CCSD probands, we successfully identified pathogenic or likely pathogenic variants in 11 probands (48%). Functional analyses of a cellular electrophysiological study and in vivo zebrafish cardiac assay might be useful for determining the pathogenicity of rare variants in patients with CCSD. SCN10A may be one of the major genes responsible for CCSD.
Translational Perspective
Whole-exome sequencing (WES) may be helpful in determining the causes of cardiac conduction system disease (CCSD); however, the identification of pathogenic variants remains a challenge. We performed WES of 23 probands diagnosed with early-onset CCSD, and identified 12 pathogenic or likely pathogenic variants in 11 of these probands (48%) according to the 2015 ACMG standards and guidelines. In this context, functional analyses of a cellular electrophysiological study and in vivo zebrafish cardiac assay might be useful for determining the pathogenicity of rare variants, and SCN10A may be one of the major development factors in CCSD.