
    Anomaly detection for machine learning redshifts applied to SDSS galaxies

    We present an analysis of anomaly detection for machine learning redshift estimation. Anomaly detection allows the removal of poor training examples, which can adversely influence redshift estimates. Anomalous training examples may be photometric galaxies with incorrect spectroscopic redshifts, or galaxies with one or more poorly measured photometric quantities. We select 2.5 million 'clean' SDSS DR12 galaxies with reliable spectroscopic redshifts, and 6730 'anomalous' galaxies with spectroscopic redshift measurements which are flagged as unreliable. We contaminate the clean base galaxy sample with galaxies with unreliable redshifts and attempt to recover the contaminating galaxies using the Elliptical Envelope technique. We then train four machine learning architectures for redshift analysis on both the contaminated sample and on the preprocessed 'anomaly-removed' sample and measure redshift statistics on a clean validation sample generated without any preprocessing. We find an improvement on all measured statistics of up to 80% when training on the anomaly-removed sample as compared with training on the contaminated sample for each of the machine learning routines explored. We further describe a method to estimate the contamination fraction of a base data sample. Comment: 13 pages, 8 figures, 1 table, minor text updates to match MNRAS accepted version
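
The contaminate-then-recover step described in this abstract can be illustrated with a small sketch. It is not the authors' pipeline: it fits a single Gaussian to synthetic 2-D data and flags the points with the largest Mahalanobis distance, which is a simplified, non-robust stand-in for the Elliptical Envelope technique (library implementations typically use a robust covariance estimate instead). The two features, the sample sizes, and the function name `flag_anomalies` are all assumptions made for illustration.

```python
import random


def flag_anomalies(points, contamination):
    """Flag the `contamination` fraction of 2-D points that lie farthest
    (in Mahalanobis distance) from the sample mean."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy * sxy

    def dist2(p):
        # Squared Mahalanobis distance using the closed-form 2x2 inverse.
        dx, dy = p[0] - mx, p[1] - my
        return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det

    d = [dist2(p) for p in points]
    k = int(round(contamination * n))  # number of points to flag
    worst = set(sorted(range(n), key=lambda i: d[i], reverse=True)[:k])
    return [i in worst for i in range(n)]


# Synthetic data (hypothetical): 500 correlated "clean" points followed by
# 10 contaminating points placed far from the bulk of the distribution.
rng = random.Random(42)
clean = []
for _ in range(500):
    x = rng.gauss(0.0, 1.0)
    clean.append((x, 0.8 * x + 0.6 * rng.gauss(0.0, 1.0)))
anomalous = [(6.0 + 0.1 * rng.gauss(0.0, 1.0), -6.0 + 0.1 * rng.gauss(0.0, 1.0))
             for _ in range(10)]
sample = clean + anomalous

flags = flag_anomalies(sample, contamination=len(anomalous) / len(sample))
recovered = sum(flags[len(clean):])
print(f"recovered {recovered} of {len(anomalous)} contaminating points")
```

Because the mean and covariance here are estimated from the contaminated sample itself, heavy contamination can mask anomalies; that is one reason a robust estimator is usually preferred in practice.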

    Robust Inference in Wireless Sensor Networks

    This dissertation presents a systematic approach to obtain robust statistical inference schemes in unreliable networks. Statistical inference offers mechanisms for deducing the statistical properties of unknown parameters from the data. In Wireless Sensor Networks (WSNs), sensor outputs are transmitted across a wireless communication network to the fusion center (FC) for final decision-making. The sensor data are not always reliable. Several factors can cause anomalies in network operation, such as malfunction, corruption, or compromise by an unknown source of contamination or an adversarial attack. Two standard component failure models are adopted in this study to describe the system vulnerability: the probabilistic and static models. In probabilistic models, we consider a widely known ε-contamination model, where each node has probability ε of malfunctioning or being compromised. In contrast, the static model assumes there are up to a certain number of malfunctioning nodes. It is assumed that the decision center/network operator is aware of the presence of anomaly nodes and can adjust the operation rule to counter the impact of the anomaly. The anomaly node is assumed to know that the network operator is taking some defensive actions to improve its performance. Considering both the decision center (network operator) and compromised (anomalous) nodes and their possible actions, the problem is formulated as a two-player zero-sum game. Under this setting, we attempt to discover the worst possible failure models and best possible operating strategies. First, the effect of sensor unreliability on detection performance is investigated, and robust detection schemes are proposed. The aim is to design robust detectors when some observation nodes malfunction. The detection problem is relatively well known under the probabilistic model in simple binary hypothesis testing with known saddle-point solutions. The detection problem is investigated under the mini-max framework for the static settings as no such saddle point solutions are shown to exist under these settings. In the robust estimation, results in estimation theory are presented to measure system robustness and performance. The estimation theory covers probabilistic and static component failure models. Besides the standard approaches of robust estimation under the frequentist settings where the interesting parameters are fixed but unknown, the estimation problem under the Bayes settings is considered where the prior probability distribution is known. After first establishing the general framework, comprehensive results on the particular case of a single node network are presented under the probabilistic settings. Based on the insights from the single node network, we investigate the robust estimation problem for the general network for both failure models. A few robust localization methods are presented as an extension of robust estimation theory at the end.
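
To make the ε-contamination model concrete, the following minimal simulation may help. It is a sketch under stated assumptions, not the dissertation's scheme: each node is compromised independently with probability ε and then reports a fixed bogus value, honest nodes report the true parameter plus unit-variance Gaussian noise, and the fusion center compares a plain average with the median. The function name, the bogus value, and all numeric settings are hypothetical.

```python
import random
import statistics


def simulate_readings(theta, n_nodes, eps, rng, bad_value=100.0):
    """Return one reading per node under an epsilon-contamination model."""
    readings = []
    for _ in range(n_nodes):
        if rng.random() < eps:
            # Compromised node: reports an arbitrary (here fixed) value.
            readings.append(bad_value)
        else:
            # Honest node: true parameter plus Gaussian measurement noise.
            readings.append(rng.gauss(theta, 1.0))
    return readings


rng = random.Random(0)
theta = 5.0  # true parameter (assumed for the demo)
readings = simulate_readings(theta, n_nodes=200, eps=0.1, rng=rng)

mean_est = statistics.mean(readings)      # non-robust fusion rule
median_est = statistics.median(readings)  # simple robust fusion rule

print(f"mean estimate error:   {abs(mean_est - theta):.3f}")
print(f"median estimate error: {abs(median_est - theta):.3f}")
```

The average is pulled toward the bogus value roughly in proportion to ε, whereas the median shifts only slightly, which illustrates why the choice of fusion rule matters when some nodes are unreliable.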

    Graph Laplacian for Image Anomaly Detection

    The Reed-Xiaoli detector (RXD) is recognized as the benchmark algorithm for image anomaly detection; however, it has known limitations, namely the dependence on the image following a multivariate Gaussian model, the estimation and inversion of a high-dimensional covariance matrix, and the inability to effectively include spatial awareness in its evaluation. In this work, a novel graph-based solution to the image anomaly detection problem is proposed; leveraging the graph Fourier transform, we are able to overcome some of RXD's limitations while reducing computational cost at the same time. Tests over both hyperspectral and medical images, using both synthetic and real anomalies, prove the proposed technique is able to obtain significant gains in performance over other state-of-the-art algorithms. Comment: Published in Machine Vision and Applications (Springer)
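
For reference, the RXD score mentioned in this abstract is usually written as a squared Mahalanobis distance. The notation below is an assumption (the abstract gives no formulas): $\mathbf{x}$ is a pixel's spectral vector, $\hat{\boldsymbol{\mu}}$ and $\hat{\boldsymbol{\Sigma}}$ are the estimated mean and covariance of the image, and $\eta$ is a decision threshold.

```latex
\delta_{\mathrm{RXD}}(\mathbf{x})
  = (\mathbf{x} - \hat{\boldsymbol{\mu}})^{\top}
    \hat{\boldsymbol{\Sigma}}^{-1}
    (\mathbf{x} - \hat{\boldsymbol{\mu}}),
\qquad
\text{flag } \mathbf{x} \text{ as anomalous if } \delta_{\mathrm{RXD}}(\mathbf{x}) > \eta
```

Written this way, the limitations listed in the abstract are visible directly: the score presumes a Gaussian background, requires inverting $\hat{\boldsymbol{\Sigma}}$, and treats each pixel independently of its spatial neighbours.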

    Computer Vision Applications for Autonomous Aerial Vehicles

    Undoubtedly, unmanned aerial vehicles (UAVs) have experienced a great leap forward over the last decade. It is not surprising anymore to see a UAV being used to accomplish a certain task, which was previously carried out by humans or a former technology. The proliferation of special vision sensors, such as depth cameras, lidar sensors and thermal cameras, and major breakthroughs in computer vision and machine learning fields accelerated the advance of UAV research and technology. However, due to certain unique challenges imposed by UAVs, such as limited payload capacity, unreliable communication link with the ground stations and data safety, UAVs are compelled to perform many tasks on their onboard embedded processing units, which makes it difficult to readily implement the most advanced algorithms on UAVs. This thesis focuses on computer vision and machine learning applications for UAVs equipped with onboard embedded platforms, and presents algorithms that utilize data from multiple modalities. The presented work covers a broad spectrum of algorithms and applications for UAVs, such as indoor UAV perception, 3D understanding with deep learning, UAV localization, and structural inspection with UAVs. Visual guidance and scene understanding without relying on pre-installed tags or markers is the desired approach for fully autonomous navigation of UAVs in conjunction with the global positioning systems (GPS), or especially when GPS information is either unavailable or unreliable. Thus, semantic and geometric understanding of the surroundings becomes vital to utilize vision as guidance in the autonomous navigation pipelines. In this context, first, robust altitude measurement, safe landing zone detection and doorway detection methods are presented for autonomous UAVs operating indoors. These approaches are implemented on the Google Project Tango platform, which is an embedded platform equipped with various sensors including a depth camera. Next, a modified capsule network for 3D object classification is presented with weight optimization so that the network can be fit and run on memory-constrained platforms. Then, a semantic segmentation method for 3D point clouds is developed for a more general visual perception on a UAV equipped with a 3D vision sensor. Next, this thesis presents algorithms for structural health monitoring applications involving UAVs. First, a 3D point cloud-based, drift-free and lightweight localization method is presented for depth camera-equipped UAVs that perform bridge inspection, where the GPS signal is unreliable. Next, a thermal leakage detection algorithm is presented for detecting thermal anomalies on building envelopes using aerial thermography from UAVs. Then, building on our thermal anomaly identification expertise gained on the previous task, a novel performance anomaly identification metric (AIM) is presented for more reliable performance evaluation of thermal anomaly identification methods.

    Lifeguard: Local Health Awareness for More Accurate Failure Detection

    SWIM is a peer-to-peer group membership protocol with attractive scaling and robustness properties. However, slow message processing can cause SWIM to mark healthy members as failed (so-called false positive failure detection), despite inclusion of a mechanism to avoid this. We identify the properties of SWIM that lead to the problem, and propose Lifeguard, a set of extensions to SWIM which consider that the local failure detector module may be at fault, via the concept of local health. We evaluate this approach in a precisely controlled environment and validate it in a real-world scenario, showing that it drastically reduces the rate of false positives. The false positive rate and detection time for true failures can be reduced simultaneously, compared to the baseline levels of SWIM.
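
One way to picture the local-health idea is a bounded counter that a member raises when its own probes go badly and lowers when they succeed, and that stretches its probe timeout accordingly. The sketch below is only an illustration under that assumption; the class name, the bound of 8, and the `base * (score + 1)` scaling are hypothetical choices and are not taken from the abstract.

```python
class LocalHealth:
    """Bounded counter tracking how much a member distrusts its own
    failure detector (0 means the member believes it is healthy)."""

    def __init__(self, max_score=8):
        self.score = 0
        self.max_score = max_score

    def record(self, delta):
        # Saturating update: the score never leaves [0, max_score].
        self.score = min(self.max_score, max(0, self.score + delta))

    def probe_timeout(self, base_timeout):
        # A member that suspects it is slow waits longer before it
        # concludes that a peer has failed.
        return base_timeout * (self.score + 1)


health = LocalHealth()
for _ in range(3):
    health.record(+1)  # e.g. three probes in a row went unanswered
print(health.probe_timeout(0.5))

health.record(-1)      # e.g. a probe was acknowledged in time
print(health.probe_timeout(0.5))
```

Scaling the timeout by the member's own health score means a slow member becomes more cautious about accusing others, which is the intuition behind reducing false positives without slowing down healthy members.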