5 research outputs found

    From hybrid adjustable neuro-fuzzy systems to adaptive connectionist-based systems for phoneme and word recognition

    No full text
    This paper discusses the problem of adaptation in automatic speech recognition systems (ASRS) and suggests several strategies for adaptation in a modular architecture for speech recognition. The architecture allows for adaptation at different levels of the recognition process, where modules can be adapted individually based on their performance and the performance of the whole system. Two realisations of this architecture are presented along with experimental results from small-scale experiments. The first realisation is a hybrid system for speaker-independent phoneme-based spoken word recognition, consisting of neural networks for recognising English phonemes and fuzzy systems for modelling acoustic and linguistic knowledge. This system is adjustable by additional training of individual neural network modules and tuning the fuzzy systems. The increased accuracy of the recognition through appropriate adjustment is also discussed. The second realisation of the architecture is a connectionist system that uses fuzzy neural networks (FuNNs) to accommodate both a priori linguistic knowledge and data from a speech corpus. A method for on-line adaptation of FuNNs is also presented.
    Unpublished

    Modelling the emergence of speech sound categories in evolving connectionist systems

    We report on the clustering of nodes in internally represented acoustic space. Learners of different languages partition perceptual space distinctly. Here, an Evolving Connectionist-Based System (ECOS) is used to model the perceptual space of New Zealand English. Currently, the system evolves in an unsupervised, self-organising manner. The perceptual space can be visualised, and the important features of the input patterns analysed. Additionally, the path of the internal representations can be seen. The results here will be used to develop a supervised system that can be used for speech recognition based on the evolved, internal sub-word units.
    Unpublished

    Modelling the emergence of speech sound categories in evolving connectionist systems

    No full text
    We report on the clustering of nodes in internally represented acoustic space. Learners of different languages partition perceptual space distinctly. Here, an Evolving Connectionist-Based System (ECOS) is used to model the perceptual space of New Zealand English. Currently, the system evolves in an unsupervised, self-organising manner. The perceptual space can be visualised, and the important features of the input patterns analysed. Additionally, the path of the internal representations can be seen. The results here will be used to develop a supervised system that can be used for speech recognition based on the evolved, internal sub-word units.
    Unpublished

    Evolving systems for connectionist-based speech recognition

    No full text
    xv, 519 p. ; 30 cm. Includes bibliographical references. University of Otago department: Information Science. "June 18, 2003".
    Although studied for several years, speech recognition is still a field that is developing. Recently several important researchers have pointed out areas within the field that need to be addressed. These include robustness to various environments, large or expandable vocabularies, user-friendliness, high recognition accuracy and the ability to recognise continuous speech. The ability to adapt is an important component of a speech recognition system. People new to the system should have the benefits mentioned above. The system should also manage recognition of different speaking rates. Also, novel environments may cause a drop in the system's performance if it lacks robustness or the ability to adapt. A common target for speech recognition algorithms is to detect the presence of speech units, commonly phonemes. This approach involves grouping speech sounds, or phones, into abstract groups that reflect meaning. Recently artificial neural networks have been applied to this task. Nevertheless, uncertainty and ambiguity are inherent in the neural network recognition process. Several novel techniques are proposed to aid in the recognition process, and to help to fulfil the requirements of a successful speech recognition system. The goal of this research is to investigate theories of speech and language processing that are relevant to speech recognition and spoken language understanding. These theories have their foundations in fields such as engineering, computer science, linguistics, natural language processing, psycholinguistics and psychology. An adaptive system is implemented to test the validity and usefulness of such work to the fields of speech recognition and spoken language understanding. For example, the development of abstract structures of the human auditory system and the auditory cortex is investigated, and applied towards better engineering methods for building adaptive speech and language systems. For the implementation of an adaptive speech recognition system, parameters are introduced that can be adjusted either manually or automatically. In this manner, the system can adapt to new speakers and environments. The architecture of the system is modular and hierarchical. Different methods are applied at various levels. For example, artificial neural networks are best suited for low-level processing. A discussion of how errors and uncertainty may be resolved in an unsupervised manner concludes the work. Ideally, the system will adapt to the situation, and the future occurrences of such phenomena may be reduced or eliminated.
    Unpublished

    Evolving systems for connectionist-based speech recognition

    No full text
    xv, 519 p. ; 30 cm. Includes bibliographical references. University of Otago department: Information Science. "June 18, 2003". Although it has been studied for many years, speech recognition is still a developing field. Several prominent researchers have recently pointed out areas within the field that need to be addressed. These include robustness to various environments, large or expandable vocabularies, user-friendliness, high recognition accuracy and the ability to recognise continuous speech. The ability to adapt is an important component of a speech recognition system. People new to the system should have the benefits mentioned above. The system should also cope with different speaking rates. In addition, novel environments may cause a drop in the system's performance if it lacks robustness or the ability to adapt. A common target for speech recognition algorithms is to detect the presence of speech units, commonly phonemes. This approach involves grouping speech sounds, or phones, into abstract groups that reflect meaning. Recently, artificial neural networks have been applied to this task. Nevertheless, uncertainty and ambiguity are inherent in the neural network recognition process. Several novel techniques are proposed to aid the recognition process and to help fulfil the requirements of a successful speech recognition system. The goal of this research is to investigate theories of speech and language processing that are relevant to speech recognition and spoken language understanding. These theories have their foundations in fields such as engineering, computer science, linguistics, natural language processing, psycholinguistics and psychology. An adaptive system is implemented to test the validity and usefulness of such work to the fields of speech recognition and spoken language understanding.
For example, the development of abstract structures of the human auditory system and the auditory cortex is investigated and applied towards better engineering methods for building adaptive speech and language systems. For the implementation of an adaptive speech recognition system, parameters are introduced that can be adjusted either manually or automatically. In this manner, the system can adapt to new speakers and environments. The architecture of the system is modular and hierarchical. Different methods are applied at various levels. For example, artificial neural networks are best suited for low-level processing. A discussion of how errors and uncertainty may be resolved in an unsupervised manner concludes the work. Ideally, the system will adapt to the situation, and future occurrences of such phenomena may be reduced or eliminated.
    Unpublished