881 research outputs found

    Neural Mechanisms for Information Compression by Multiple Alignment, Unification and Search

    Get PDF
    This article describes how an abstract framework for perception and cognition may be realised in terms of neural mechanisms and neural processing. This framework — called information compression by multiple alignment, unification and search (ICMAUS) — has been developed in previous research as a generalized model of any system for processing information, either natural or artificial. It has a range of applications including the analysis and production of natural language, unsupervised inductive learning, recognition of objects and patterns, probabilistic reasoning, and others. The proposals in this article may be seen as an extension and development of Hebb’s (1949) concept of a ‘cell assembly’. The article describes how the concept of ‘pattern’ in the ICMAUS framework may be mapped onto a version of the cell assembly concept and the way in which neural mechanisms may achieve the effect of ‘multiple alignment’ in the ICMAUS framework. By contrast with the Hebbian concept of a cell assembly, it is proposed here that any one neuron can belong in one assembly and only one assembly. A key feature of present proposals, which is not part of the Hebbian concept, is that any cell assembly may contain ‘references’ or ‘codes’ that serve to identify one or more other cell assemblies. This mechanism allows information to be stored in a compressed form, it provides a robust mechanism by which assemblies may be connected to form hierarchies and other kinds of structure, it means that assemblies can express abstract concepts, and it provides solutions to some of the other problems associated with cell assemblies. Drawing on insights derived from the ICMAUS framework, the article also describes how learning may be achieved with neural mechanisms. This concept of learning is significantly different from the Hebbian concept and appears to provide a better account of what we know about human learning

    Automatic differentiation in machine learning: a survey

    Get PDF
    Derivatives, mostly in the form of gradients and Hessians, are ubiquitous in machine learning. Automatic differentiation (AD), also called algorithmic differentiation or simply "autodiff", is a family of techniques similar to but more general than backpropagation for efficiently and accurately evaluating derivatives of numeric functions expressed as computer programs. AD is a small but established field with applications in areas including computational fluid dynamics, atmospheric sciences, and engineering design optimization. Until very recently, the fields of machine learning and AD have largely been unaware of each other and, in some cases, have independently discovered each other's results. Despite its relevance, general-purpose AD has been missing from the machine learning toolbox, a situation slowly changing with its ongoing adoption under the names "dynamic computational graphs" and "differentiable programming". We survey the intersection of AD and machine learning, cover applications where AD has direct relevance, and address the main implementation techniques. By precisely defining the main differentiation techniques and their interrelationships, we aim to bring clarity to the usage of the terms "autodiff", "automatic differentiation", and "symbolic differentiation" as these are encountered more and more in machine learning settings.Comment: 43 pages, 5 figure

    An overview of artificial intelligence and robotics. Volume 1: Artificial intelligence. Part B: Applications

    Get PDF
    Artificial Intelligence (AI) is an emerging technology that has recently attracted considerable attention. Many applications are now under development. This report, Part B of a three part report on AI, presents overviews of the key application areas: Expert Systems, Computer Vision, Natural Language Processing, Speech Interfaces, and Problem Solving and Planning. The basic approaches to such systems, the state-of-the-art, existing systems and future trends and expectations are covered

    An exploration of the rhythm of Malay

    Get PDF
    In recent years there has been a surge of interest in speech rhythm. However we still lack a clear understanding of the nature of rhythm and rhythmic differences across languages. Various metrics have been proposed as means for measuring rhythm on the phonetic level and making typological comparisons between languages (Ramus et al, 1999; Grabe & Low, 2002; Dellwo, 2006) but the debate is ongoing on the extent to which these metrics capture the rhythmic basis of speech (Arvaniti, 2009; Fletcher, in press). Furthermore, cross linguistic studies of rhythm have covered a relatively small number of languages and research on previously unclassified languages is necessary to fully develop the typology of rhythm. This study examines the rhythmic features of Malay, for which, to date, relatively little work has been carried out on aspects rhythm and timing. The material for the analysis comprised 10 sentences produced by 20 speakers of standard Malay (10 males and 10 females). The recordings were first analysed using rhythm metrics proposed by Ramus et. al (1999) and Grabe & Low (2002). These metrics (∆C, %V, rPVI, nPVI) are based on durational measurements of vocalic and consonantal intervals. The results indicated that Malay clustered with other so-called syllable-timed languages like French and Spanish on the basis of all metrics. However, underlying the overall findings for these metrics there was a large degree of variability in values across speakers and sentences, with some speakers having values in the range typical of stressed-timed languages like English. Further analysis has been carried out in light of Fletcher’s (in press) argument that measurements based on duration do not wholly reflect speech rhythm as there are many other factors that can influence values of consonantal and vocalic intervals, and Arvaniti’s (2009) suggestion that other features of speech should also be considered in description of rhythm to discover what contributes to listeners’ perception of regularity. Spectrographic analysis of the Malay recordings brought to light two parameters that displayed consistency and regularity for all speakers and sentences: the duration of individual vowels and the duration of intervals between intensity minima. This poster presents the results of these investigations and points to connections between the features which seem to be consistently regulated in the timing of Malay connected speech and aspects of Malay phonology. The results are discussed in light of current debate on the descriptions of rhythm

    Image categorisation using parallel network constructs: an emulation of early human colour processing and context evaluation

    Get PDF
    PhD ThesisTraditional geometric scene analysis cannot attempt to address the understanding of human vision. Instead it adopts an algorithmic approach, concentrating on geometric model fitting. Human vision, however, is both quick and accurate but very little is known about how the recognition of objects is performed with such speed and efficiency. It is thought that there must be some process both for coding and storage which can account for these characteristics. In this thesis a more strict emulation of human vision, based on work derived from medical psychology and other fields, is proposed. Human beings must store perceptual information from which to make comparisons, derive structures and classify objects. It is widely thought by cognitive psychologists that some form of symbolic representation is inherent in this storage. Here a mathematical syntax is defined to perform this kind of symbolic description. The symbolic structures must be capable of manipulation and a set of operators is defined for this purpose. The early visual cortex and geniculate body are both inherently parallel in operation and simple in structure. A broadly connectionist emulation of this kind of structure is described, using independent computing elements, which can perform segmentation, re-colouring and generation of the base elements of the description syntax. Primal colour information is then collected by a second network which forms the visual topology, colouring and position information of areas in the image as well as a full description of the scene in terms of a more complex symbolic set. The idea of different visual contexts is introduced and a model is proposed for the accumulation of context rules. This model is then applied to a database of natural images.EPSRC CASE award: Neural Computer Sciences,Southampton

    Sparse machine learning methods with applications in multivariate signal processing

    Get PDF
    This thesis details theoretical and empirical work that draws from two main subject areas: Machine Learning (ML) and Digital Signal Processing (DSP). A unified general framework is given for the application of sparse machine learning methods to multivariate signal processing. In particular, methods that enforce sparsity will be employed for reasons of computational efficiency, regularisation, and compressibility. The methods presented can be seen as modular building blocks that can be applied to a variety of applications. Application specific prior knowledge can be used in various ways, resulting in a flexible and powerful set of tools. The motivation for the methods is to be able to learn and generalise from a set of multivariate signals. In addition to testing on benchmark datasets, a series of empirical evaluations on real world datasets were carried out. These included: the classification of musical genre from polyphonic audio files; a study of how the sampling rate in a digital radar can be reduced through the use of Compressed Sensing (CS); analysis of human perception of different modulations of musical key from Electroencephalography (EEG) recordings; classification of genre of musical pieces to which a listener is attending from Magnetoencephalography (MEG) brain recordings. These applications demonstrate the efficacy of the framework and highlight interesting directions of future research

    Automatic facial recognition based on facial feature analysis

    Get PDF

    Random Projection in Deep Neural Networks

    Get PDF
    This work investigates the ways in which deep learning methods can benefit from random projection (RP), a classic linear dimensionality reduction method. We focus on two areas where, as we have found, employing RP techniques can improve deep models: training neural networks on high-dimensional data and initialization of network parameters. Training deep neural networks (DNNs) on sparse, high-dimensional data with no exploitable structure implies a network architecture with an input layer that has a huge number of weights, which often makes training infeasible. We show that this problem can be solved by prepending the network with an input layer whose weights are initialized with an RP matrix. We propose several modifications to the network architecture and training regime that makes it possible to efficiently train DNNs with learnable RP layer on data with as many as tens of millions of input features and training examples. In comparison to the state-of-the-art methods, neural networks with RP layer achieve competitive performance or improve the results on several extremely high-dimensional real-world datasets. The second area where the application of RP techniques can be beneficial for training deep models is weight initialization. Setting the initial weights in DNNs to elements of various RP matrices enabled us to train residual deep networks to higher levels of performance

    Pre-processing, classification and semantic querying of large-scale Earth observation spaceborne/airborne/terrestrial image databases: Process and product innovations.

    Get PDF
    By definition of Wikipedia, “big data is the term adopted for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. The big data challenges typically include capture, curation, storage, search, sharing, transfer, analysis and visualization”. Proposed by the intergovernmental Group on Earth Observations (GEO), the visionary goal of the Global Earth Observation System of Systems (GEOSS) implementation plan for years 2005-2015 is systematic transformation of multisource Earth Observation (EO) “big data” into timely, comprehensive and operational EO value-adding products and services, submitted to the GEO Quality Assurance Framework for Earth Observation (QA4EO) calibration/validation (Cal/Val) requirements. To date the GEOSS mission cannot be considered fulfilled by the remote sensing (RS) community. This is tantamount to saying that past and existing EO image understanding systems (EO-IUSs) have been outpaced by the rate of collection of EO sensory big data, whose quality and quantity are ever-increasing. This true-fact is supported by several observations. For example, no European Space Agency (ESA) EO Level 2 product has ever been systematically generated at the ground segment. By definition, an ESA EO Level 2 product comprises a single-date multi-spectral (MS) image radiometrically calibrated into surface reflectance (SURF) values corrected for geometric, atmospheric, adjacency and topographic effects, stacked with its data-derived scene classification map (SCM), whose thematic legend is general-purpose, user- and application-independent and includes quality layers, such as cloud and cloud-shadow. Since no GEOSS exists to date, present EO content-based image retrieval (CBIR) systems lack EO image understanding capabilities. Hence, no semantic CBIR (SCBIR) system exists to date either, where semantic querying is synonym of semantics-enabled knowledge/information discovery in multi-source big image databases. In set theory, if set A is a strict superset of (or strictly includes) set B, then A B. This doctoral project moved from the working hypothesis that SCBIR computer vision (CV), where vision is synonym of scene-from-image reconstruction and understanding EO image understanding (EO-IU) in operating mode, synonym of GEOSS ESA EO Level 2 product human vision. Meaning that necessary not sufficient pre-condition for SCBIR is CV in operating mode, this working hypothesis has two corollaries. First, human visual perception, encompassing well-known visual illusions such as Mach bands illusion, acts as lower bound of CV within the multi-disciplinary domain of cognitive science, i.e., CV is conditioned to include a computational model of human vision. Second, a necessary not sufficient pre-condition for a yet-unfulfilled GEOSS development is systematic generation at the ground segment of ESA EO Level 2 product. Starting from this working hypothesis the overarching goal of this doctoral project was to contribute in research and technical development (R&D) toward filling an analytic and pragmatic information gap from EO big sensory data to EO value-adding information products and services. This R&D objective was conceived to be twofold. First, to develop an original EO-IUS in operating mode, synonym of GEOSS, capable of systematic ESA EO Level 2 product generation from multi-source EO imagery. EO imaging sources vary in terms of: (i) platform, either spaceborne, airborne or terrestrial, (ii) imaging sensor, either: (a) optical, encompassing radiometrically calibrated or uncalibrated images, panchromatic or color images, either true- or false color red-green-blue (RGB), multi-spectral (MS), super-spectral (SS) or hyper-spectral (HS) images, featuring spatial resolution from low (> 1km) to very high (< 1m), or (b) synthetic aperture radar (SAR), specifically, bi-temporal RGB SAR imagery. The second R&D objective was to design and develop a prototypical implementation of an integrated closed-loop EO-IU for semantic querying (EO-IU4SQ) system as a GEOSS proof-of-concept in support of SCBIR. The proposed closed-loop EO-IU4SQ system prototype consists of two subsystems for incremental learning. A primary (dominant, necessary not sufficient) hybrid (combined deductive/top-down/physical model-based and inductive/bottom-up/statistical model-based) feedback EO-IU subsystem in operating mode requires no human-machine interaction to automatically transform in linear time a single-date MS image into an ESA EO Level 2 product as initial condition. A secondary (dependent) hybrid feedback EO Semantic Querying (EO-SQ) subsystem is provided with a graphic user interface (GUI) to streamline human-machine interaction in support of spatiotemporal EO big data analytics and SCBIR operations. EO information products generated as output by the closed-loop EO-IU4SQ system monotonically increase their value-added with closed-loop iterations
    corecore