404 research outputs found

    Relational Data Mining Through Extraction of Representative Exemplars

    With the growing interest in Network Analysis, Relational Data Mining is becoming a prominent domain of Data Mining. This paper addresses the problem of extracting representative elements from a relational dataset. After defining the notion of degree of representativeness, computed using the Borda aggregation procedure, we present the extraction of exemplars, which are the representative elements of the dataset. We use these concepts to build a network on the dataset. We set out the main properties of these notions and propose two typical applications of our framework. The first application consists in summarizing and structuring a set of binary images, and the second in mining co-authoring relations in a research team.
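
    The abstract does not say how the Borda procedure is applied, so the following is only a minimal sketch under stated assumptions: every element acts as a "voter" that ranks all other elements by increasing dissimilarity, and a candidate ranked at position r among m others receives m - 1 - r points. The function name `borda_representativeness`, the tie-breaking by index, and the toy data are hypothetical and not taken from the paper.

```python
def borda_representativeness(dist):
    """Borda-style score per element from a square dissimilarity matrix.

    Assumption (not from the paper): each element ranks every other
    element by increasing dissimilarity; ties are broken by index.
    """
    n = len(dist)
    scores = [0] * n
    for voter in range(n):
        # Candidates are all elements except the voter itself.
        others = [j for j in range(n) if j != voter]
        others.sort(key=lambda j: (dist[voter][j], j))
        # Closest candidate gets the most points, farthest gets zero.
        for rank, candidate in enumerate(others):
            scores[candidate] += len(others) - 1 - rank
    return scores


# Toy example: four points on a line, dissimilarity = absolute difference.
xs = [0, 1, 2, 10]
dist = [[abs(a - b) for b in xs] for a in xs]
scores = borda_representativeness(dist)
exemplar = max(range(len(xs)), key=lambda i: scores[i])
print(scores, exemplar)  # [3, 5, 4, 0] 1
```

    In this toy case the point at 1 scores highest because it is ranked near the top by most other points, while the outlier at 10 scores zero.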

    A Model of the Network Architecture of the Brain that Supports Natural Language Processing

    For centuries, neuroscience has proposed models of the neurobiology of language processing that are static and localised to few temporal and inferior frontal regions. Although existing models have offered some insight into the processes underlying lower-level language features, they have largely overlooked how language operates in the real world. Here, we aimed at investigating the network organisation of the brain and how it supports language processing in a naturalistic setting. We hypothesised that the brain is organised in a multiple core-periphery and dynamic modular architecture, with canonical language regions forming high-connectivity hubs. Moreover, we predicted that language processing would be distributed to much of the rest of the brain, allowing it to perform more complex tasks and to share information with other cognitive domains. To test these hypotheses, we collected the Naturalistic Neuroimaging Database of people watching full length movies during functional magnetic resonance imaging. We computed network algorithms to capture the voxel-wise architecture of the brain in individual participants and inspected variations in activity distribution over different stimuli and over more complex language features. Our results confirmed the hypothesis that the brain is organised in a flexible multiple core-periphery architecture with large dynamic communities. Here, language processing was distributed to much of the rest of the brain, together forming multiple communities. Canonical language regions constituted hubs, explaining why they consistently appear in various other neurobiology of language models. Moreover, language processing was supported by other regions such as visual cortex and episodic memory regions, when processing more complex context-specific language features. 
Overall, our flexible and distributed model of language comprehension and the brain points to additional brain regions and pathways that could be exploited for novel and more individualised therapies for patients suffering from speech impairments.

    Representing 3D shape in sparse range images for urban object classification

    This thesis develops techniques for interpreting 3D range images acquired in outdoor environments at a low resolution. It focuses on the task of robustly capturing the shapes that comprise objects, in order to classify them. With the recent development of 3D sensors such as the Velodyne, it is now possible to capture range images at video frame rates, allowing mobile robots to observe dynamic scenes in 3D. To classify objects in these scenes, features are extracted from the data, which allows different regions to be matched. However, range images acquired at this speed are of low resolution, and there are often significant changes in sensor viewpoint and occlusion. In this context, existing methods for feature extraction do not perform well. This thesis contributes algorithms for the robust abstraction from 3D points to object classes. Efficient region-of-interest and surface normal extraction are evaluated, resulting in a keypoint algorithm that provides stable orientations. These build towards a novel feature, called the ‘line image,’ that is designed to consistently capture local shape, regardless of sensor viewpoint. It does this by explicitly reasoning about the difference between known empty space and space that has not been measured due to occlusion or sparse sensing. A dataset of urban objects scanned with a Velodyne was collected and hand labelled, in order to compare this feature with several others on the task of classification. First, a simple k-nearest neighbours approach was used, where the line image showed improvements. Second, more complex classifiers were applied, requiring the features to be clustered. The clusters were used in topic modelling, allowing specific sub-parts of objects to be learnt across multiple scales, improving accuracy by 10%. This work is applicable to any range image data. In general, it demonstrates the advantages of using the inherent density and occupancy information in a range image during 3D point cloud processing.

    Optimizing Data Stream Representation: An Extensive Survey on Stream Clustering Algorithms

    Analyzing data streams has received considerable attention over the past decades due to the widespread usage of sensors, social media and other streaming data sources. A core research area in this field is stream clustering, which aims to recognize patterns in an unordered, infinite and evolving stream of observations. Clustering can be a crucial support in decision making, since it aims for an optimized aggregated representation of a continuous data stream over time and makes it possible to identify patterns in large and high-dimensional data. A multitude of algorithms and approaches has been developed that are able to find and maintain clusters over time in the challenging streaming scenario. This survey explores, summarizes and categorizes a total of 51 stream clustering algorithms and identifies core research threads over the past decades. In particular, it identifies categories of algorithms based on distance thresholds, density grids and statistical models, as well as algorithms for high-dimensional data. Furthermore, it discusses application scenarios, available software and how to configure stream clustering algorithms. This survey is considerably more extensive than comparable studies, more up-to-date, and highlights how concepts are interrelated and have been developed over time.
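
    To make the "distance threshold" category concrete, here is a minimal single-pass sketch. It is a generic leader-style scheme and not any specific algorithm from the survey: each arriving point joins the nearest centroid if it lies within a threshold, otherwise it starts a new cluster. The class name `ThresholdStreamClusterer` and its parameters are hypothetical, and real stream clusterers typically add decay and cluster merging, which this sketch omits.

```python
import math


class ThresholdStreamClusterer:
    """One-pass clustering: join the nearest centroid or open a new cluster."""

    def __init__(self, threshold):
        self.threshold = threshold  # maximum distance for joining a cluster
        self.centroids = []         # running mean of each cluster
        self.counts = []            # number of points absorbed per cluster

    def update(self, point):
        """Process one observation and return its cluster index."""
        best, best_d = None, math.inf
        for i, centroid in enumerate(self.centroids):
            d = math.dist(centroid, point)  # Euclidean distance (Python 3.8+)
            if d < best_d:
                best, best_d = i, d
        if best is None or best_d > self.threshold:
            # No centroid is close enough: start a new cluster at this point.
            self.centroids.append(list(point))
            self.counts.append(1)
            return len(self.centroids) - 1
        # Incremental mean update, so no past observations need to be stored.
        self.counts[best] += 1
        n = self.counts[best]
        centroid = self.centroids[best]
        for k in range(len(centroid)):
            centroid[k] += (point[k] - centroid[k]) / n
        return best


clusterer = ThresholdStreamClusterer(threshold=1.0)
stream = [(0.0, 0.0), (0.5, 0.0), (10.0, 10.0), (10.0, 10.5)]
labels = [clusterer.update(p) for p in stream]
print(labels, clusterer.centroids)
```

    Only the centroids and counts are kept in memory, which reflects the constraint that an unbounded stream cannot be stored and revisited.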

    Proceedings of Abstracts, School of Physics, Engineering and Computer Science Research Conference 2022

    © 2022 The Author(s). This is an open-access work distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. For further details please see https://creativecommons.org/licenses/by/4.0/. Plenary by Prof. Timothy Foat, ‘Indoor dispersion at Dstl and its recent application to COVID-19 transmission’ is © Crown copyright (2022), Dstl. This material is licensed under the terms of the Open Government Licence except where otherwise stated. To view this licence, visit http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3 or write to the Information Policy Team, The National Archives, Kew, London TW9 4DU, or email: [email protected]. The present proceedings record the abstracts submitted and accepted for presentation at SPECS 2022, the second edition of the School of Physics, Engineering and Computer Science Research Conference, which took place online on 12th April 2022.

    The application of time encoded signals to automated machine condition classification using neural networks

    This thesis considers the classification of physical states in a simplified gearbox using acoustical data and simple time domain signal shape characterisation techniques allied to a basic feedforward multi-layer perceptron neural network. A novel extension to the signal coding scheme (TES), involving the application of energy based shape descriptors, was developed. This sought specifically to improve the technique's suitability to the identification of mechanical states and was evaluated against the more traditional minima based TES descriptors. The application of learning based identification techniques offers potential advantages over more traditional programmed techniques both in terms of greater noise immunity and in the reduced requirement for highly skilled operators. The practical advantages accrued by using these networks are studied together with some of the problems associated with their use within safety critical monitoring systems. Practical trials were used as a means of developing the TES conversion mechanism and were used to evaluate the requirements of the neural networks being used to classify the data. These assessed the effects upon performance of the acquisition and digital signal processing phases as well as the subsequent training requirements of networks used for accurate condition classification. Both random data selection and more operator intensive performance based selection processes were evaluated for training. Some rudimentary studies were performed on the internal architectural configuration of the neural networks in order to quantify its influence on the classification process, specifically its effect upon fault resolution enhancement. The techniques have proved to be successful in separating several unique physical states without the necessity for complex state definitions to be identified in advance. 
Both the computational demands and the practical constraints arising from the use of these techniques fall within the bounds of a realisable system.

    Local movement: agent-based models of pedestrian flows

    Modelling movement within the built environment has hitherto been focused on rather coarse spatial scales where the emphasis has been upon simulating flows of traffic between origins and destinations. Models of pedestrian movement have been sporadic, based largely on finding statistical relationships between volumes and the accessibility of streets, with no sustained efforts at improving such theories. The development of object-orientated computing and agent-based models which have followed in this wake, promise to change this picture radically. It is now possible to develop models simulating the geometric motion of individual agents in small-scale environments using theories of traffic flow to underpin their logic. In this paper, we outline such a model which we adapt to simulate flows of pedestrians between fixed points of entry - gateways - into complex environments such as city centres, and points of attraction based on the location of retail and leisure facilities which represent the focus of such movements. The model simulates the movement of each individual in terms of five components; these are based on motion in the direction of the most attractive locations, forward movement, the avoidance of local geometric obstacles, thresholds which constrain congestion, and movement which is influenced by those already moving towards various locations. The model has elements which enable walkers to self-organise as well as learn from their geometric experiences so far. We first outline the structure of the model, present a computable form, and illustrate how it can be programmed as a variant of cellular automata. 
We illustrate it using three examples: its application to an idealised mall where we show how two key components - local navigation of obstacles and movement towards points of global locational attraction - can be parameterised, an application to the more complex town centre of Wolverhampton (in the UK West Midlands) where the paths of individual walkers are used to explore the veracity of the model, and finally its application to the Tate Gallery complex in central London where the focus is on calibrating the model by letting individual agents learn from their experience of walking within the environment.

    Detection and prediction of urban archetypes at the pedestrian scale: computational toolsets, morphological metrics, and machine learning methods

    Granular, dense, and mixed-use urban morphologies are hallmarks of walkable and vibrant streets. However, urban systems are notoriously complex and planned urban development, which grapples with varied interdependent and oft conflicting criteria, may — despite best intentions — yield aberrant morphologies fundamentally at odds with the needs of pedestrians and the resiliency of neighbourhoods. This work addresses the measurement, detection, and prediction of pedestrian-friendly urban archetypes by developing techniques for high-resolution urban analytics at the pedestrian scale. A spatial-analytic computational toolset, the cityseer-api Python package, is created to assess localised centrality, land-use, and statistical metrics using contextually sensitive workflows applied directly over the street network. cityseer-api subsequently facilitates a review of mixed-use and street network centrality methods to improve their utility concerning granular urban analysis. Unsupervised machine learning methods are applied to recover ‘signatures’ — urban archetypes — using Principal Component Analysis, Variational Autoencoders, and clustering methods from a high-resolution multi-variable and multi-scalar dataset consisting of centralities, land-uses, and population densities for Greater London. Supervised deep-learning methods applied to a similar dataset developed for 931 towns and cities in Great Britain demonstrate how, with the aid of domain knowledge, machine-learning classifiers can learn to discriminate between ‘artificial’ and ‘historical’ urban archetypes. These methods use complex systems thinking as a departure point and illustrate how high-resolution spatial-analytic quantitative methods can be combined with machine learning to extrapolate benchmarks in keeping with more qualitatively framed urban morphological conceptions. 
Such tools may aid urban design professionals in better anticipating the outcomes of varied design scenarios as part of iterative and scalable workflows. These techniques may likewise provide robust and demonstrable feedback as part of planning review and approvals processes.