3,545 research outputs found

    Finding Outliers in Satellite Patterns by Learning Pattern Identities

    Spacecraft provide a large volume of information about their on-board components, such as temperature, power and pressure. This information is constantly monitored by engineers, who capture the outliers and determine whether the situation is abnormal. However, due to the large quantity of information, only a small part of the data is processed or used for anomaly prediction. A commonly accepted research concept for anomaly prediction described in the literature relies on probability-based projections estimated from patterns learned from past data (Fujimaki et al., 2005) and on data mining methods that enhance the conventional diagnosis approach (Li et al., 2010). Most of these studies conclude that a status vector needs to be built. We propose an algorithm for efficient outlier detection that builds an identity chart of the patterns from past data, based on their curve-fitting information. It detects the functional units of the patterns without a priori knowledge, with the intent to learn their structure and to reconstruct the sequence of events described by the signal. In addition to statistical elements, each pattern is allotted a characteristics chart. This pattern identity enables fast pattern matching across the data. The extracted features allow classification with standard machine learning methods such as support vector machines (SVM). The algorithm has been tested and evaluated using real satellite telemetry data. The outcome and performance show promising results for faster anomaly prediction

    AI perceives like a local: predicting citizen deprivation perception using satellite imagery

    Deprived urban areas, commonly referred to as ‘slums,’ are the consequence of unprecedented urbanisation. Previous studies have highlighted the potential of Artificial Intelligence (AI) and Earth Observation (EO) in capturing physical aspects of urban deprivation. However, little research has explored AI’s ability to predict how locals perceive deprivation. This research aims to develop a method to predict citizens’ perception of deprivation using satellite imagery, citizen science, and AI. A deprivation perception score was computed from slum-citizens’ votes. Then, AI was used to model this score, and results indicate that it can effectively predict perception, with deep learning outperforming conventional machine learning. By leveraging AI and EO, policymakers can comprehend the underlying patterns of urban deprivation, enabling targeted interventions based on citizens’ needs. As over a quarter of the global urban population resides in slums, this tool can help prioritise citizens’ requirements, providing evidence for implementing urban upgrading policies aligned with SDG-11.

    On computational models of animal movement behaviour

    Finding structures and patterns in animal movement data is essential to understanding a variety of behavioural phenomena, as well as to shedding light on the relationships between animals, among conspecifics and across different taxa, with respect to their environments. Recent advances in computational intelligence, coupled with the proliferation of low-cost telemetry devices, have made it possible to gather and analyse behavioural data from animals in their natural habitat and in a wide range of contexts, with the aid of devices such as the Global Positioning System (GPS). The sensory input that animals receive from their environment, the corresponding motor output, and the neural basis of this relationship, especially as it affects movement, encode a great deal of information about the welfare and survival of these animals and other organisms in nature's ecosystem. This has major implications for biodiversity monitoring, global health and the understanding of disease progression. Encoding, decoding and quantifying these functional relationships, however, can be challenging, tedious and labour-intensive. Artificial intelligence holds promise for solving some of these problems and even stands to benefit, since understanding natural intelligence can, for instance, aid the advancement of artificial intelligence. In this thesis, I investigate and propose several computational methods that leverage information-theoretic metrics and modern machine learning methods, including supervised learning, unsupervised learning and a novel combination of both, for understanding, predicting, forecasting and quantifying a variety of animal movement phenomena at different time scales across different taxa and species. Most importantly, the models proposed in this thesis tackle important problems bordering on human and animal welfare as well as their intersection.
Crucially, I investigate several information-theoretic metrics for mining animal movement data, after which I propose machine learning and statistical techniques for automatically quantifying abnormal movement behaviour in sheep with Batten disease using unsupervised methods. In addition, I propose a predictive model, based on bidirectional recurrent neural networks, capable of forecasting migration patterns in turkey vultures as well as their stop-over decisions. Finally, I propose a model of sheep movement behaviour in a flock that combines insights from cognitive neuroscience with modern deep learning models. Overall, the models of animal movement behaviour developed in this thesis are useful to a wide range of scientists in the fields of neuroscience, ethology, veterinary science, conservation and public health. Although these models have been designed for understanding and predicting animal movement behaviour, in many cases they scale easily to other domains, such as human behaviour modelling, with little modification. I highlight the importance of continued research in developing computational models of animal movement behaviour towards improving our understanding of nature in relation to the interaction between animals and their environments

    Artificial Intelligence Based Classification for Urban Surface Water Modelling

    Estimations and predictions of surface water runoff can provide very useful insights, regarding flood risks in urban areas. To automatically predict the flow behaviour of the rainfall-runoff water, in real-world satellite images, it is important to precisely identify permeable and impermeable areas. This identification indicates and helps to calculate the amount of surface water, by taking into account the amount of water being absorbed in a permeable area and what remains on the impermeable area. In this research, a model of surface water has been established, to predict the behavioural flow of rainfall-runoff water. This study employs a combination of image processing, artificial intelligence and machine learning techniques, for automatic segmentation and classification of permeable and impermeable areas, in satellite images. These techniques investigate the image classification approaches for classifying three land-use categories (roofs, roads, and pervious areas), commonly found in satellite images of the earth’s surface. Three different classification scenarios are investigated, to select the best classification model. The first scenario involves pixel by pixel classification of images, using Classification Tree and Random Forest classification techniques, in 2 different settings of sequential and parallel execution of algorithms. In the second classification scenario, the image is divided into objects, by using Superpixels (SLIC) segmentation method, while three kinds of feature sets are extracted from the segmented objects. The performance of eight different supervised machine learning classifiers is probed, using 5-fold cross-validation, for multiple SLIC values, while detailed performance comparisons lead to conclusions about the classification into different classes, regarding Object-based and Pixel-based classification schemes. 
Pareto analysis and Knee point selection are used to select SLIC value and the suitable type of classification, among the aforementioned two. Furthermore, a new diversity and weighted sum-based ensemble classification model, called ParetoEnsemble, is proposed, in this classification scenario. The weights are applied to selected component classifiers of an ensemble, creating a strong classifier, where classification is done based on multiple votes from candidate classifiers of the ensemble, as opposed to individual classifiers, where classification is done based on a single vote, from only one classifier. Unbalanced and balanced data-based classification results are also evaluated, to determine the most suitable mode, for satellite image classifications, in this study. Convolutional Neural Networks, based on semantic segmentation, are also employed in the classification phase, as a third scenario, to evaluate the strength of deep learning model SegNet, in the classification of satellite imaging. The best results, from the three classification scenarios, are compared and the best classification method, among the three scenarios, is used in the next phase of water modelling, with the InfoWorks ICM software, to explore the potential of modelling process, regarding a partially automated surface water network. By using the parameter settings, with a specified amount of simulated rain falling, onto the imaged area, the amount of surface water flow is estimated, to get predictions about runoff situations in urban areas, since runoff, in such a situation, can be high enough to pose a dangerous flood risk. The area of Feock, in Cornwall, is used as a simulation area of study, in this research, where some promising results have been derived, regarding classification and modelling of runoff. The correlation coefficient estimation, between classification and runoff accuracy, provides useful insight, regarding the dependence of runoff performance on classification performance. 
The trained system was also tested on images of some unknown areas, demonstrating reasonable performance given the training and classification limitations and conditions. Furthermore, reasonable estimations of surface water runoff were derived for these unknown area images. An analysis of unbalanced and balanced data-based classification and runoff estimations, for multiple parameter configurations, aids the selection of classification and modelling parameter values to be used in future predictions on unknown data. This research is founded on the incorporation of satellite imaging into water modelling, using selected images for analysis and assessment of results. The system can be further improved, and more precise runoff predictions achieved, by adding more high-resolution images to the classifiers' training. The added variety in the trained model can lead to even better classification of any unknown image, which could eventually provide better modelling and better insights into surface water modelling. Moreover, the modelling phase can be extended, in future research, to deal with real-time parameters, by calibrating the model after the classification phase, in order to observe the impact of classification on the actual calibration
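
The weighted voting described in the abstract above can be illustrated with a minimal sketch. This is not the ParetoEnsemble method itself, whose weighting and classifier selection are not specified in the abstract; the function name `weighted_vote`, the labels and the weights below are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight.

    predictions: one predicted label per component classifier.
    weights: one non-negative weight per component classifier.
    Both names are hypothetical; the abstract does not define them.
    """
    if len(predictions) != len(weights):
        raise ValueError("one weight per classifier is required")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    # Break ties by label order so the result is deterministic.
    return min(totals, key=lambda label: (-totals[label], label))


# Three component classifiers vote on one image object.
votes = ["road", "roof", "roof"]
weights = [0.5, 0.3, 0.3]
print(weighted_vote(votes, weights))  # prints "roof"
```

With these illustrative weights, the two lower-weighted classifiers together outvote the single higher-weighted one, which is the sense in which the ensemble decides by multiple votes rather than by one classifier.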

    Automatic human behaviour anomaly detection in surveillance video

    This thesis work focusses upon developing the capability to automatically evaluate and detect anomalies in human behaviour from surveillance video. We work with static monocular cameras in crowded urban surveillance scenarios, particularly airports and commercial shopping areas. Typically a person is 100 to 200 pixels high in a scene ranging from 10 - 20 meters width and depth, populated by 5 to 40 people at any given time. Our procedure evaluates human behaviour unobtrusively to determine outlying behavioural events, flagging abnormal events to the operator. In order to achieve automatic human behaviour anomaly detection we address the challenge of interpreting behaviour within the context of the social and physical environment. We develop and evaluate a process for measuring social connectivity between individuals in a scene using motion and visual attention features. To do this we use mutual information and Euclidean distance to build a social similarity matrix which encodes the social connection strength between any two individuals. We develop a second contextual basis which acts by segmenting a surveillance environment into behaviourally homogeneous subregions which represent high traffic slow regions and queuing areas. We model the heterogeneous scene in homogeneous subgroups using both contextual elements. We bring the social contextual information, the scene context, the motion, and visual attention features together to demonstrate a novel human behaviour anomaly detection process which finds outlier behaviour from a short sequence of video. The method, Nearest Neighbour Ranked Outlier Clusters (NN-RCO), is based upon modelling behaviour as a time independent sequence of behaviour events, can be trained in advance or set upon a single sequence.
We find that in a crowded scene the application of Mutual Information-based social context permits the ability to prevent self-justifying groups and propagate anomalies in a social network, granting a greater anomaly detection capability. Scene context uniformly improves the detection of anomalies in all the datasets we test upon. We additionally demonstrate that our work is applicable to other data domains. We demonstrate upon the Automatic Identification Signal data in the maritime domain. Our work is capable of identifying abnormal shipping behaviour using joint motion dependency as analogous for social connectivity, and similarly segmenting the shipping environment into homogeneous regions

    Tools and algorithms to advance interactive intrusion analysis via Machine Learning and Information Retrieval

    We consider typical tasks that arise in the intrusion analysis of log data from the perspectives of Machine Learning and Information Retrieval, and we study a number of data organization and interactive learning techniques to improve the analyst's efficiency. In doing so, we attempt to translate intrusion analysis problems into the language of the above-mentioned disciplines and to offer metrics to evaluate the effect of proposed techniques. The Kerf toolkit contains prototype implementations of these techniques, as well as data transformation tools that help bridge the gap between the real world log data formats and the ML and IR data models. We also describe the log representation approach that Kerf prototype tools are based on. In particular, we describe the connection between decision trees, automatic classification algorithms and log analysis techniques implemented in Kerf

    Incorporating plant community structure in species distribution modelling: a species co-occurrence based composite approach

    Species distribution modelling (SDM) with remotely sensed (RS) imagery is widely used in ecological studies and conservation planning, but its performance is frequently limited by factors including small plant size, small numbers of observations, and scattered distribution patterns. The focus of my thesis was to develop and evaluate alternative SDM methodologies to deal with such challenges. I used a record of nine endemic species occurrences from the Athabasca Sand Dunes in northern Saskatchewan to assess five different modelling algorithms, including modern regression and machine learning techniques, to understand how species distribution characteristics influence model prediction accuracies. All modelling algorithms showed robust performance (>0.5 AUC), with the best performance in most cases from generalized linear models (GLM). Threshold selection for presence-absence analysis shows that actively selecting the optimum level is preferable to the standard high-threshold approach, because the latter can deliver predictions that are inconsistent with observed patterns of occurrence frequency. The development of the composite-SDM framework used small-scale plant occurrence data and UAV imagery from Kernen Prairie, a remnant Fescue prairie in Saskatoon, Saskatchewan. The evaluation of the five algorithms clearly showed that each method was capable of handling a wide range of low to high-frequency species, with strong GLM performance irrespective of the species distribution pattern. It is critical to highlight that, although GLM is computationally efficient, the method does not compromise accuracy for simplicity. Including plant community structure through image clustering methods yielded similar accuracy patterns, indicating limited advantages of using high-resolution images.
The study found that, for high-frequency species, prediction accuracy declines to levels as low as those expected for low-frequency species. Higher prediction confidence was often observed with low-frequency species when the species occurred in a habitat that was visually and spectrally distinct from its surroundings. Such a pattern contrasts with species widespread across different grassland habitats, where distinct spectral signatures were lacking. The study provides substantial evidence that optimal algorithmic performance is tied to a balanced number of presences and absences in the data. The co-occurrence analysis also revealed that significant co-occurrence patterns are most common at moderate levels of species occurrence frequency. The research does not indicate any consistent accuracy changes between the baseline direct reflectance models and the composite-SDM framework. Although accuracy changes were marginal with the composite-SDM framework, the method is well capable of influencing the associated type 1 and type 2 error rates of the classification
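
The idea of actively selecting a presence-absence threshold, mentioned in the abstract above, can be sketched as follows. The abstract does not say which criterion was optimised, so this sketch assumes one common choice: maximising sensitivity + specificity - 1. The function name `best_threshold` and the toy data are hypothetical.

```python
def best_threshold(probabilities, observed, candidates):
    """Pick the candidate threshold that maximises
    sensitivity + specificity - 1 (an assumed criterion).

    probabilities: predicted probability of presence per site.
    observed: 1 for an observed presence, 0 for an absence.
    candidates: thresholds to evaluate.
    """
    best, best_score = None, float("-inf")
    for threshold in candidates:
        tp = fp = tn = fn = 0
        for p, y in zip(probabilities, observed):
            predicted_present = p >= threshold
            if predicted_present and y:
                tp += 1
            elif predicted_present and not y:
                fp += 1
            elif not predicted_present and y:
                fn += 1
            else:
                tn += 1
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        score = sensitivity + specificity - 1
        if score > best_score:
            best, best_score = threshold, score
    return best, best_score


# Toy data: a fixed high threshold (0.7) would miss two presences.
probs = [0.1, 0.2, 0.35, 0.4, 0.8, 0.9]
obs = [0, 0, 1, 1, 1, 1]
print(best_threshold(probs, obs, [0.3, 0.5, 0.7]))  # prints (0.3, 1.0)
```

In this toy case the lowest candidate separates presences from absences perfectly, whereas the two higher thresholds each miss half of the presences.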
