9 research outputs found

    A framework for automated anomaly detection in high frequency water-quality data from in situ sensors

    River water-quality monitoring is increasingly conducted using automated in situ sensors, enabling timelier identification of unexpected values. However, anomalies caused by technical issues confound these data, while the volume and velocity of data prevent manual detection. We present a framework for automated anomaly detection in high-frequency water-quality data from in situ sensors, using turbidity, conductivity and river level data. After identifying end-user needs and defining anomalies, we ranked their importance and selected suitable detection methods. High-priority anomalies included sudden isolated spikes and level shifts, most of which were classified correctly by regression-based methods such as autoregressive integrated moving average models. However, using other water-quality variables as covariates reduced performance due to complex relationships among variables. Classification of drift and periods of anomalously low or high variability improved when we replaced anomalous measurements with forecasts, but this inflated false positive rates. Feature-based methods also performed well on high-priority anomalies, but were less proficient at detecting lower-priority anomalies, resulting in high false negative rates. Unlike regression-based methods, all feature-based methods produced low false positive rates and did not require training or optimization. Rule-based methods successfully detected impossible values and missing observations. Thus, we recommend using a combination of methods to improve anomaly detection performance, whilst minimizing false detection rates. Furthermore, our framework emphasizes the importance of communication between end-users and analysts for optimal outcomes with respect to both detection performance and end-user needs. Our framework is applicable to other types of high-frequency time-series data and anomaly detection applications.
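
The rule-based step mentioned in the abstract above (flagging impossible values and missing observations) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `rule_based_flags`, the physical limits, and the expected 15-minute sampling step are all assumptions chosen for illustration.

```python
import math


def rule_based_flags(timestamps, values, lower, upper, expected_step):
    """Apply two simple rules to a sensor series (hypothetical helper).

    Rule 1: a reading that is NaN/None is "missing"; a reading outside
            [lower, upper] is "impossible"; anything else is "ok".
    Rule 2: a gap between consecutive timestamps larger than
            expected_step indicates missing observations.
    """
    flags = []
    for v in values:
        if v is None or math.isnan(v):
            flags.append("missing")
        elif v < lower or v > upper:
            flags.append("impossible")
        else:
            flags.append("ok")

    # Collect (start, end) pairs where the logger skipped one or more readings.
    gaps = [
        (timestamps[i - 1], timestamps[i])
        for i in range(1, len(timestamps))
        if timestamps[i] - timestamps[i - 1] > expected_step
    ]
    return flags, gaps


# Example: turbidity readings (assumed units: NTU) at minute offsets.
# Negative turbidity is physically impossible; the 30 -> 60 jump skips a reading.
times = [0, 15, 30, 60]
turbidity = [5.0, -1.0, float("nan"), 7.0]
flags, gaps = rule_based_flags(times, turbidity, lower=0.0, upper=1000.0, expected_step=15)
```

Rules of this kind need no training, which is why they pair naturally with the regression-based and feature-based methods the abstract recommends combining.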

    Water Data Science: Data Driven Techniques, Training, and Tools for Improved Management of High Frequency Water Resources Data

    Electronic sensors can measure water and climate conditions at high frequency and generate large quantities of observed data. This work addresses data management challenges associated with the volume and complexity of high frequency water data. We developed techniques for automatically reviewing data, created materials for training water data managers, and explored existing and emerging technologies for sensor data management. Data collected by sensors often include errors due to sensor failure or environmental conditions that need to be removed, labeled, or corrected before the data can be used for analysis. Manual review and correction of these data can be tedious and time consuming. To help automate these tasks, we developed a computer program that automatically checks the data for mistakes and attempts to fix them. This tool has the potential to save time and effort and is available to scientists and practitioners who use sensors to monitor water. Scientists may lack skillsets for working with sensor data because traditional engineering or science courses do not address how to work with complex data using modern technology. We surveyed and interviewed instructors who teach courses related to “hydroinformatics” or “water data science” to understand challenges in incorporating data science techniques and tools into water resources teaching. Based on their feedback, we created educational materials that demonstrate how the articulated challenges can be effectively addressed to provide high-quality instruction. These materials are available online for students and teachers. In addition to skills for working with sensor data, scientists and engineers need tools for storing, managing, and sharing these data. Hydrologic information systems (HIS) help manage the data collected using sensors.
HIS make sure that data can be effectively used by providing the computer infrastructure to get data from sensors in the field to secure data storage and then into the hands of scientists and others who use them. This work describes the evolution of software and standards that comprise HIS. We present the main components of HIS, describe currently available systems and gaps in technology or functionality, and then discuss opportunities for improved infrastructure that would make sensor data easier to collect, manage, and use. In short, we are trying to make sure that sensor data are good and useful; we’re helping instructors teach prospective data collectors and users about water and data; and we are making sure that the systems that enable collection, storage, management, and use of the data work smoothly.

    Electronic Warfare Receiver Resource Management and Optimization

    Optimization of electronic warfare (EW) receiver scan strategies is critical to improving the probability of surviving military missions in hostile environments. The problem is the limited understanding of how dynamic variations in radar and EW receiver characteristics influence the response time to detect enemy threats. The dependent variable was the EW receiver response time and the 4 independent variables were EW receiver revisit interval, EW receiver dwell time, radar scan time, and radar illumination time. Previous researchers have not explained how dynamic variations of the independent variables affect response time. The purpose of this experimental study was to develop a model to understand how dynamic variations of the independent variables influenced response time. Queuing theory provided the theoretical foundation for the study; Little's formula, which states the mathematical relationship among the variables, was used to determine the ideal EW receiver revisit interval. Findings from a simulation that produced 17,000 data points indicated that Little's formula was valid for use in EW receivers. Findings also demonstrated that variation of the independent variables had a small but statistically significant effect on the average response time. The most significant finding was the sensitivity of the variance of response time to minor differences in the test conditions, which can lead to unexpectedly long response times. Military users and designers of EW systems benefit most from this study by optimizing system response time, thus improving survivability. Additionally, this research demonstrated a method that may improve EW product development times and reduce the cost to taxpayers through more efficient test and evaluation techniques.
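
Little's formula, referenced in the abstract above, states that the time-average number of items in a system equals the arrival rate multiplied by the mean time each item spends in the system (L = λW). The sketch below checks that identity on a small, generic queue trace. It is not the study's EW receiver model: the function name `little_law_check` and the arrival and departure times are made up for illustration, and the trace is assumed to start and end with an empty system.

```python
def little_law_check(arrivals, departures, horizon):
    """Return (arrival rate, mean time in system, time-average number in system).

    arrivals[i] and departures[i] are the entry and exit times of item i;
    horizon is the length of the observation window, which is assumed to
    begin and end with the system empty.
    """
    n = len(arrivals)
    lam = n / horizon                                        # arrival rate (lambda)
    w = sum(d - a for a, d in zip(arrivals, departures)) / n  # mean time in system (W)

    # Sweep through events in time order and integrate the number in system.
    events = sorted([(a, 1) for a in arrivals] + [(d, -1) for d in departures])
    area, count, prev = 0.0, 0, 0.0
    for t, delta in events:
        area += count * (t - prev)
        count += delta
        prev = t
    l_avg = area / horizon                                   # time-average number (L)
    return lam, w, l_avg


# Three items observed over a window of length 6 (arbitrary time units).
lam, w, l_avg = little_law_check([0, 1, 4], [2, 3, 5], horizon=6)
# Little's formula: l_avg equals lam * w (both are 5/6 for this trace).
```

Note that the formula constrains only averages; it says nothing about the variance of response time, which the abstract identifies as the most sensitive quantity.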


    Cancer occurs when normal cells grow and multiply without normal control. As the cells multiply, they form an area of abnormal cells, known as a tumour. Many tumours exhibit abnormal chromosomal segregation at cell division. These anomalies play an important role in detecting molar pregnancy cancer. Molar pregnancy, also known as hydatidiform mole, can be categorised into partial (PHM) and complete (CHM) mole, persistent gestational trophoblastic disease and choriocarcinoma. Hydatidiform moles are most commonly found in women under the age of 17 or over the age of 35. Hydatidiform moles can be detected by morphological and histopathological examination. Even experienced pathologists cannot easily distinguish between complete and partial hydatidiform moles. However, this distinction is important in order to recommend the appropriate treatment method. Therefore, research into molar pregnancy image analysis and understanding is critical. The hypothesis of this research project is that an anomaly detection approach to analysing molar pregnancy images can improve image analysis and classification of normal, PHM and CHM villi. The primary aim of this research project is to develop a novel method, based on anomaly detection, to identify and classify anomalous villi in molar pregnancy stained images. The novel method is developed to simulate expert pathologists’ approach to the diagnosis of anomalous villi. The knowledge and heuristics elicited from two expert pathologists are combined with the morphological domain knowledge of molar pregnancy to develop a heuristic multi-neural network architecture designed to classify the villi into their appropriate anomalous types. This study confirmed that a single feature cannot give enough discriminative power for villi classification.
Whereas expert pathologists consider size and shape before textural features, this thesis demonstrated that textural features have a higher discriminative power than size and shape. The first heuristic-based multi-neural network, which was based on 15 elicited features, achieved an improved average accuracy of 81.2%, compared to the traditional multi-layer perceptron (80.5%); however, the recall of the CHM villi class was still low (64.3%). Two further textural features, which were elicited and added to the second heuristic-based multi-neural network, improved the average accuracy from 81.2% to 86.1% and the recall of the CHM villi class from 64.3% to 73.5%. The precision of multi-neural network II also increased from 82.7% to 89.5% for the normal villi class, from 81.3% to 84.7% for the PHM villi class and from 80.8% to 86% for the CHM villi class. To help pathologists visualise the results of the segmentation, a software tool, the Hydatidiform Mole Analysis Tool (HYMAT), was developed to compile the morphological and pathological data for each villus analysis.