2,590 research outputs found

    SCREEN: Learning a Flat Syntactic and Semantic Spoken Language Analysis Using Artificial Neural Networks

    Get PDF
    In this paper, we describe a so-called screening approach for learning robust processing of spontaneously spoken language. A screening approach is a flat analysis which uses shallow sequences of category representations for analyzing an utterance at various syntactic, semantic and dialog levels. Rather than using a deeply structured symbolic analysis, we use a flat connectionist analysis. This screening approach aims at supporting speech and language processing by using (1) data-driven learning and (2) robustness of connectionist networks. In order to test this approach, we have developed the SCREEN system which is based on this new robust, learned and flat analysis. In this paper, we focus on a detailed description of SCREEN's architecture, the flat syntactic and semantic analysis, the interaction with a speech recognizer, and a detailed evaluation analysis of the robustness under the influence of noisy or incomplete input. The main result of this paper is that flat representations allow more robust processing of spontaneous spoken language than deeply structured representations. In particular, we show how the fault-tolerance and learning capability of connectionist networks can support a flat analysis for providing more robust spoken-language processing within an overall hybrid symbolic/connectionist framework.Comment: 51 pages, Postscript. To be published in Journal of Artificial Intelligence Research 6(1), 199

    Recent Trends in Computational Intelligence

    Get PDF
    Traditional models struggle to cope with complexity, noise, and the existence of a changing environment, while Computational Intelligence (CI) offers solutions to complicated problems as well as reverse problems. The main feature of CI is adaptability, spanning the fields of machine learning and computational neuroscience. CI also comprises biologically-inspired technologies such as the intellect of swarm as part of evolutionary computation and encompassing wider areas such as image processing, data collection, and natural language processing. This book aims to discuss the usage of CI for optimal solving of various applications proving its wide reach and relevance. Bounding of optimization methods and data mining strategies make a strong and reliable prediction tool for handling real-life applications

    Spoken content retrieval: A survey of techniques and technologies

    Get PDF
    Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight on how these fields are integrated to support research and development, thus addressing the core challenges of SCR

    CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap

    Get PDF
    After addressing the state-of-the-art during the first year of Chorus and establishing the existing landscape in multimedia search engines, we have identified and analyzed gaps within European research effort during our second year. In this period we focused on three directions, notably technological issues, user-centred issues and use-cases and socio- economic and legal aspects. These were assessed by two central studies: firstly, a concerted vision of functional breakdown of generic multimedia search engine, and secondly, a representative use-cases descriptions with the related discussion on requirement for technological challenges. Both studies have been carried out in cooperation and consultation with the community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our Think-Tank, presentations in international conferences, and surveys addressed to EU projects coordinators as well as National initiatives coordinators. Based on the obtained feedback we identified two types of gaps, namely core technological gaps that involve research challenges, and “enablers”, which are not necessarily technical research challenges, but have impact on innovation progress. New socio-economic trends are presented as well as emerging legal challenges

    Integration of Action and Language Knowledge: A Roadmap for Developmental Robotics

    Get PDF
    “This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder." “Copyright IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.”This position paper proposes that the study of embodied cognitive agents, such as humanoid robots, can advance our understanding of the cognitive development of complex sensorimotor, linguistic, and social learning skills. This in turn will benefit the design of cognitive robots capable of learning to handle and manipulate objects and tools autonomously, to cooperate and communicate with other robots and humans, and to adapt their abilities to changing internal, environmental, and social conditions. Four key areas of research challenges are discussed, specifically for the issues related to the understanding of: 1) how agents learn and represent compositional actions; 2) how agents learn and represent compositional lexica; 3) the dynamics of social interaction and learning; and 4) how compositional action and language representations are integrated to bootstrap the cognitive system. The review of specific issues and progress in these areas is then translated into a practical roadmap based on a series of milestones. These milestones provide a possible set of cognitive robotics goals and test scenarios, thus acting as a research roadmap for future work on cognitive developmental robotics.Peer reviewe

    Precis of neuroconstructivism: how the brain constructs cognition

    Get PDF
    Neuroconstructivism: How the Brain Constructs Cognition proposes a unifying framework for the study of cognitive development that brings together (1) constructivism (which views development as the progressive elaboration of increasingly complex structures), (2) cognitive neuroscience (which aims to understand the neural mechanisms underlying behavior), and (3) computational modeling (which proposes formal and explicit specifications of information processing). The guiding principle of our approach is context dependence, within and (in contrast to Marr [1982]) between levels of organization. We propose that three mechanisms guide the emergence of representations: competition, cooperation, and chronotopy; which themselves allow for two central processes: proactivity and progressive specialization. We suggest that the main outcome of development is partial representations, distributed across distinct functional circuits. This framework is derived by examining development at the level of single neurons, brain systems, and whole organisms. We use the terms encellment, embrainment, and embodiment to describe the higher-level contextual influences that act at each of these levels of organization. To illustrate these mechanisms in operation we provide case studies in early visual perception, infant habituation, phonological development, and object representations in infancy. Three further case studies are concerned with interactions between levels of explanation: social development, atypical development and within that, developmental dyslexia. We conclude that cognitive development arises from a dynamic, contextual change in embodied neural structures leading to partial representations across multiple brain regions and timescales, in response to proactively specified physical and social environment

    A Survey of Available Corpora For Building Data-Driven Dialogue Systems: The Journal Version

    Get PDF
    During the past decade, several areas of speech and language understanding have witnessed substantial breakthroughs from the use of data-driven models. In the area of dialogue systems, the trend is less obvious, and most practical systems are still built through significant engineering and expert knowledge. Nevertheless, several recent results suggest that data-driven approaches are feasible and quite promising. To facilitate research in this area, we have carried out a wide survey of publicly available datasets suitable for data-driven learning of dialogue systems. We discuss important characteristics of these datasets, how they can be used to learn diverse dialogue strategies, and their other potential uses. We also examine methods for transfer learning between datasets and the use of external knowledge. Finally, we discuss appropriate choice of evaluation metrics for the learning objective

    Towards an automatic speech recognition system for use by deaf students in lectures

    Get PDF
    According to the Royal National Institute for Deaf people there are nearly 7.5 million hearing-impaired people in Great Britain. Human-operated machine transcription systems, such as Palantype, achieve low word error rates in real-time. The disadvantage is that they are very expensive to use because of the difficulty in training operators, making them impractical for everyday use in higher education. Existing automatic speech recognition systems also achieve low word error rates, the disadvantages being that they work for read speech in a restricted domain. Moving a system to a new domain requires a large amount of relevant data, for training acoustic and language models. The adopted solution makes use of an existing continuous speech phoneme recognition system as a front-end to a word recognition sub-system. The subsystem generates a lattice of word hypotheses using dynamic programming with robust parameter estimation obtained using evolutionary programming. Sentence hypotheses are obtained by parsing the word lattice using a beam search and contributing knowledge consisting of anti-grammar rules, that check the syntactic incorrectness’ of word sequences, and word frequency information. On an unseen spontaneous lecture taken from the Lund Corpus and using a dictionary containing "2637 words, the system achieved 815% words correct with 15% simulated phoneme error, and 73.1% words correct with 25% simulated phoneme error. The system was also evaluated on 113 Wall Street Journal sentences. The achievements of the work are a domain independent method, using the anti- grammar, to reduce the word lattice search space whilst allowing normal spontaneous English to be spoken; a system designed to allow integration with new sources of knowledge, such as semantics or prosody, providing a test-bench for determining the impact of different knowledge upon word lattice parsing without the need for the underlying speech recognition hardware; the robustness of the word lattice generation using parameters that withstand changes in vocabulary and domain
    corecore