958 research outputs found

    Bayesian joint models with INLA exploring marine mobile predator-prey and competitor species habitat overlap

    ACKNOWLEDGMENTS: We would like to thank the associate editor and the anonymous reviewers for their useful and constructive suggestions, which led to a considerable improvement of the manuscript. The authors would also like to thank the following people/organizations for making large datasets available for use in this paper: Mark Lewis (Joint Nature Conservation Committee), Philip Hammond (Scottish Oceans Institute, University of St. Andrews), Susan Lusseau (Marine Scotland Science), Darren Stevens (The Sir Alister Hardy Foundation for Ocean Science, PML), and Yuri Artioli (Plymouth Marine Laboratory). This work was supported by the Engineering and Physical Sciences Research Council (EcoWatt2050; EPSRC EP/K012851/1). Peer reviewed.

    Features extraction using random matrix theory.

    Representing complex data concisely and accurately is a key stage in data mining methodology. Redundant and noisy data weakens the generalization power of any classification algorithm, undermines the results of any clustering algorithm, and encumbers the monitoring of large dynamic systems. This work provides several efficient approaches to all of these aspects of the analysis. We established that a notable difference can be made if results from the theory of ensembles of random matrices are employed. A particularly important result of our study is a family of methods based on projecting the data set onto different subsets of the correlation spectrum. Generally, we start with the traditional correlation matrix of a given data set. We perform a singular value decomposition and establish boundaries between essential and unimportant eigen-components of the spectrum. Then, depending on the nature of the problem at hand, we use either the former or the latter part for the projection. Projecting onto the spectrum of interest is a common technique in linear and non-linear spectral methods such as Principal Component Analysis, Independent Component Analysis, and Kernel Principal Component Analysis. Usually the part of the spectrum to project onto is defined by the amount of variance of the overall data, or of the feature space in the non-linear case. The applicability of these spectral methods is limited by the assumption that larger variance carries the important dynamics, i.e. that the data has a high signal-to-noise ratio. If this holds, projection onto principal components targets two problems in data mining: reduction in the number of features and selection of the more important features. Our methodology does not assume a high signal-to-noise ratio; instead, using the rigorous instruments of Random Matrix Theory (RMT), it identifies the presence of noise and establishes its boundaries.
Knowledge of the structure of the spectrum makes more insightful projections possible. For instance, in the application to router network traffic, the reconstruction-error procedure for anomaly detection is based on projection onto the noisy part of the spectrum, whereas in the bioinformatics application of clustering different types of leukemia, implicit denoising of the correlation matrix is achieved by decomposing the spectrum into random and non-random parts. For temporal high-dimensional data, the spectrum and eigenvectors of the correlation matrix are another representation of the data. Thus eigenvalues, components of the eigenvectors, the inverse participation ratio of eigenvector components, and other operators of eigen-analysis are spectral features of a dynamic system. In our work we proposed extracting spectral features using RMT. We demonstrated that the extracted spectral features can be used to monitor the changing dynamics of network traffic. By experimenting with delayed correlation matrices of network traffic and extracting their spectral features, we visualized the delayed processes in the system. We demonstrated that a broad range of applications in feature extraction can benefit from the novel RMT-based approach to the spectral representation of the data.
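
The abstract above does not say which random-matrix result sets the boundary between noise and signal. As a minimal sketch, assuming the Marchenko-Pastur law for the correlation matrix of uncorrelated data, the eigenvalues can be split at the upper Marchenko-Pastur edge and the data projected onto either part. The function names (`mp_bounds`, `split_spectrum`, `inverse_participation_ratio`) and the synthetic data are illustrative assumptions, not taken from the work itself.

```python
import numpy as np

def mp_bounds(n_samples, n_features):
    """Marchenko-Pastur eigenvalue edges for a pure-noise correlation matrix."""
    q = n_features / n_samples
    return (1 - np.sqrt(q)) ** 2, (1 + np.sqrt(q)) ** 2

def split_spectrum(X):
    """Standardize X (samples x features) and flag eigenvalues above the noise edge."""
    T, N = X.shape
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    C = (Z.T @ Z) / T                      # sample correlation matrix
    eigvals, eigvecs = np.linalg.eigh(C)   # eigenvalues in ascending order
    _, lam_max = mp_bounds(T, N)
    signal = eigvals > lam_max             # True where an eigenvalue exceeds the noise band
    return Z, eigvals, eigvecs, signal

def inverse_participation_ratio(eigvecs):
    """IPR per eigenvector (columns): near 1/N if delocalized, near 1 if localized."""
    return np.sum(eigvecs ** 4, axis=0)

# Synthetic example (assumption): 50 noisy features, 10 of which share one factor.
rng = np.random.default_rng(0)
T, N = 2000, 50
factor = rng.standard_normal((T, 1))
X = rng.standard_normal((T, N))
X[:, :10] += 2.0 * factor

Z, eigvals, eigvecs, signal = split_spectrum(X)
signal_proj = Z @ eigvecs[:, signal]    # projection onto the non-random part
noise_proj = Z @ eigvecs[:, ~signal]    # projection onto the remaining part
ipr = inverse_participation_ratio(eigvecs)
```

Which projection to keep depends on the task: the abstract uses the noisy part for reconstruction-error anomaly detection and the non-random part for denoised clustering.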

    Defense by Deception against Stealthy Attacks in Power Grids

    Cyber-physical Systems (CPSs) and the Internet of Things (IoT) are converging towards a hybrid platform that is becoming ubiquitous in all modern infrastructures. The integration of complex and heterogeneous systems creates an enormous space for adversaries to get into the network and inject cleverly crafted false data into measurements, misleading the control center into making erroneous decisions. Besides, the attacker can make a critical part of the system unavailable by compromising the availability of sensor data. To obfuscate and mislead the attackers, we propose DDAF, a deceptive data acquisition framework for CPSs' hierarchical communication network. Each switch in the hierarchical communication network generates a random pattern of addresses/IDs by shuffling the original sensor IDs reported through it. During data acquisition from remotely located sensors to the central controller, the switches craft the network packets by replacing a few sensors' associated addresses/IDs with the generated deceptive IDs and by adding decoy data for the rest. While the attackers are misled, the control center must still retrieve the actual data to operate the system correctly. We propose three remapping mechanisms (seed-based, prediction-based, and hybrid) and compare their robustness against different stealthy attacks. Due to the deception, artfully altered measurements turn into random data injections, making it easy to remove them as outliers. As the outliers and the estimated residuals contain the potential attack vectors, DDAF can detect and localize the attack points and the targeted sensors by analyzing this information. DDAF is generic and scalable, and can be implemented in any hierarchical CPS network. Experimental results on the standard IEEE 14, 57, and 300 bus power systems show that DDAF can detect, mitigate, and localize up to 100% of the stealthy cyberattacks.
To the best of our knowledge, this is the first framework that implements complete randomization in the data acquisition of hierarchical CPSs.
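
The abstract does not describe how the seed-based remapping works internally. One plausible reading, sketched below under that assumption, is that a switch and the control center share a secret seed, derive the same ID permutation for each acquisition round, and the control center inverts it. The helper names (`deceptive_mapping`, `obfuscate`, `recover`), the seed format, and the sensor IDs are hypothetical. For brevity the sketch shuffles every ID and omits the decoy data that DDAF adds.

```python
import random

def deceptive_mapping(sensor_ids, seed, round_no):
    """Derive a per-round permutation of sensor IDs from a shared seed (assumed scheme)."""
    rng = random.Random(f"{seed}-{round_no}")  # same seed and round give the same shuffle
    shuffled = list(sensor_ids)
    rng.shuffle(shuffled)
    return dict(zip(sensor_ids, shuffled))

def obfuscate(readings, seed, round_no):
    """Switch side: relabel each reading with its deceptive ID."""
    mapping = deceptive_mapping(sorted(readings), seed, round_no)
    return {mapping[sid]: value for sid, value in readings.items()}

def recover(packet, seed, round_no):
    """Control-center side: rebuild the permutation and invert it."""
    mapping = deceptive_mapping(sorted(packet), seed, round_no)
    inverse = {fake: real for real, fake in mapping.items()}
    return {inverse[fake]: value for fake, value in packet.items()}

# Hypothetical readings from four sensors behind one switch.
readings = {"s1": 1.02, "s2": 0.98, "s3": 1.05, "s4": 0.97}
packet = obfuscate(readings, seed="shared-secret", round_no=7)
restored = recover(packet, seed="shared-secret", round_no=7)
```

An attacker who alters the value labeled with a particular ID in `packet` cannot tell which physical sensor it belongs to, which is consistent with the abstract's remark that crafted injections degrade into random ones.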

    Separating internal and externally-forced contributions to global temperature variability using a Bayesian stochastic energy balance framework

    Earth's temperature variability can be partitioned into internal and externally-forced components. Yet, underlying mechanisms and their relative contributions remain insufficiently understood, especially on decadal to centennial timescales. Important reasons for this are difficulties in isolating internal and externally-forced variability. Here, we provide a physically-motivated emulation of global mean surface temperature (GMST) variability, which allows for the separation of internal and external variations. To this end, we introduce the "ClimBayes" software package, which infers climate parameters from a stochastic energy balance model (EBM) with a Bayesian approach. We apply our method to GMST data from temperature observations and 20 last millennium simulations from climate models of intermediate to high complexity. This yields the best estimates of the EBM's forced and forced + internal response, which we refer to as emulated variability. The timescale-dependent variance is obtained from spectral analysis. In particular, we contrast the emulated forced and forced + internal variance on interannual to centennial timescales with that of the GMST target. Our findings show that a stochastic EBM closely approximates the power spectrum and timescale-dependent variance of GMST as simulated by modern climate models. This demonstrates the potential of combining Bayesian inference with conceptual climate models to emulate statistics of climate variables across timescales. Comment: The following article has been submitted to Chaos: An Interdisciplinary Journal of Nonlinear Science. After it is published, it will be found at https://aip.scitation.org/journal/ch
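
To make the forced versus forced + internal distinction concrete, here is a minimal sketch of a one-box stochastic energy balance model, C dT/dt = -λT + F(t) + σξ(t), integrated with the Euler-Maruyama scheme. It is not the ClimBayes interface and does no Bayesian inference. The one-box form, the function name `simulate_ebm`, and all parameter values are illustrative assumptions.

```python
import numpy as np

def simulate_ebm(forcing, heat_capacity=8.0, feedback=1.2,
                 noise_sd=0.0, dt=1.0, seed=0):
    """Euler-Maruyama integration of a one-box EBM (assumed form).

    forcing       : radiative forcing per time step (W m^-2)
    heat_capacity : C, in W yr m^-2 K^-1 (illustrative value)
    feedback      : lambda, in W m^-2 K^-1 (illustrative value)
    noise_sd      : sigma, strength of the white-noise forcing; 0 gives the forced response
    """
    rng = np.random.default_rng(seed)
    temp = np.zeros(len(forcing) + 1)
    for k, f in enumerate(forcing):
        drift = (f - feedback * temp[k]) / heat_capacity
        shock = noise_sd * np.sqrt(dt) * rng.standard_normal() / heat_capacity
        temp[k + 1] = temp[k] + drift * dt + shock
    return temp[1:]

# Constant forcing of 3.7 W m^-2 for 1000 years (illustrative).
forcing = np.full(1000, 3.7)
forced = simulate_ebm(forcing)                       # forced response only
forced_internal = simulate_ebm(forcing, noise_sd=1.0)  # forced + internal
internal = forced_internal - forced                  # internal part, since the model is linear
```

Because the model is linear, subtracting the deterministic run from the stochastic run isolates the internally generated variability, whose spectrum can then be compared with that of the forced response.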

    Development of life prediction capabilities for liquid propellant rocket engines. Post-fire diagnostic system for the SSME system architecture study

    This system architecture task (1) analyzed the current process used to assess engine and component health after each test or flight firing of an SSME, (2) developed an approach and a specific set of objectives and requirements for automated diagnostics during post-fire health assessment, and (3) listed and described the software applications required to implement this system. The diagnostic system described is a distributed system with a database management system to store diagnostic information and test data, a CAE package for visual data analysis and preparation of plots of hot-fire data, a set of procedural applications for routine anomaly detection, and an expert system for advanced anomaly detection and evaluation.

    nanoHUB Database Analysis: Using Anomaly Detection Method and Principal Component Analysis

    This thesis analyzes usage data from nanoHUB.org, a web-based infrastructure for e-collaboration within the nanotechnology simulation community. Previous analysis of the nanoHUB database showed that nanoHUB usage data follows an unknown, heavy-tailed distribution. This thesis extends the analysis and develops an automatic anomaly detection method based on piecewise linear approximation. An anomaly here refers to collective user behavior that differs from that of other users. The results show that the method can accurately detect anomalies in the unknown, heavy-tailed distribution. This thesis also applies the anomaly detection method and principal component analysis to other databases in nanoHUB and successfully reveals differences between categories.