
    A statistical model for isolated convective precipitation events

    To study the diurnal evolution of the convective cloud field, we develop a precipitation cell tracking algorithm that records the merging and fragmentation of convective cells during their life cycles, and apply it to large eddy simulation (LES) data. Conditioning on the area covered by each cell, the algorithm can analyze an arbitrary number of auxiliary fields, such as the anomalies of temperature and moisture, convective available potential energy (CAPE), and convective inhibition (CIN). For tracks that neither merge nor split (termed "solitary"), many of these quantities show generic, often nearly linear relations that hardly depend on the forcing conditions of the simulations, such as surface temperature. This finding allows us to propose a highly idealized model of rain events in which the surface precipitation area is circular and a cell's precipitation intensity falls off linearly with distance from the cell center. The drop-off gradient is nearly independent of track duration and cell size, which allows for a generic description of such solitary tracks, with the peak intensity as the only remaining parameter. In contrast to the simple and robust behavior of solitary tracks, tracks that result from the merging of two or more cells behave in a much more complicated way. Indeed, the most intense, longest-lasting, and largest tracks stem from multi-mergers, i.e., tracks involved in repeated merging. Another interesting finding is that the precipitation intensity of tracks does not strongly depend on the absolute amount of local initial CAPE, which is only partially consumed by most rain events. Rather, our results point to boundary layer cooling, induced by rain re-evaporation, as the cause of CAPE reduction, CIN increase, and the shutdown of precipitation cells.
    Comment: Manuscript under review in Journal of Advances in Modeling Earth Systems
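
    The idealized rain-event model lends itself to a few lines of code. The sketch below encodes the linear intensity drop-off inside a circular cell and checks the implied area-integrated rain rate against its closed form; the parameter values are illustrative, not taken from the paper.

```python
import numpy as np

def cell_intensity(r, peak, radius):
    """Precipitation intensity at distance r from the cell center.

    Intensity falls off linearly from `peak` at the center to zero at the
    cell edge, as in the idealized solitary-track model; it is zero outside.
    """
    return np.maximum(peak * (1.0 - r / radius), 0.0)

def total_rain_rate(peak, radius, n=200_000):
    """Area-integrated rain rate of one circular cell (midpoint rule)."""
    dr = radius / n
    r = (np.arange(n) + 0.5) * dr
    return float(np.sum(2.0 * np.pi * r * cell_intensity(r, peak, radius)) * dr)

# Illustrative values (not from the paper): peak in mm/h, radius in metres.
peak, radius = 50.0, 2000.0
numeric = total_rain_rate(peak, radius)
analytic = np.pi * peak * radius**2 / 3.0   # closed form of the same integral
```

    With the peak intensity as the only free parameter, the whole cell is characterized by a single number, which is the point the abstract makes about solitary tracks.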

    Machine Learning and Integrative Analysis of Biomedical Big Data.

    Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., the genome) are analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to perform integrative analysis of biomedical data acquired from diverse modalities effectively and efficiently. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: the curse of dimensionality, data heterogeneity, missing data, class imbalance, and scalability.
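
    As a toy illustration of two of the five challenges, missing data and heterogeneity of scale across modalities, the sketch below imputes and standardises each omics block before concatenating them; all matrices, feature counts, and scales are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two toy "omics" blocks for the same 6 samples: a small expression-like
# matrix and a metabolite-like matrix on a very different scale.
expr = rng.normal(loc=8.0, scale=2.0, size=(6, 4))
metab = rng.normal(loc=500.0, scale=120.0, size=(6, 3))
expr[1, 2] = np.nan                      # one missing measurement

def impute_and_scale(X):
    """Column-wise mean imputation followed by z-scoring.

    Putting each modality on a common scale before concatenation is one
    simple way to mitigate the heterogeneity and missing-data issues.
    """
    col_mean = np.nanmean(X, axis=0)
    X = np.where(np.isnan(X), col_mean, X)
    return (X - X.mean(axis=0)) / X.std(axis=0)

# Early (concatenation-based) integration of the two modalities.
fused = np.hstack([impute_and_scale(expr), impute_and_scale(metab)])
```

    Real integrative pipelines are of course far more involved; this only shows why per-modality preprocessing has to precede naive concatenation.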

    Time Series Outlier Detection Based on Sliding Window Prediction

    To detect outliers in hydrological time series data, and thereby improve data quality and the quality of decisions on the design, operation, and management of water resources, this research develops a time series outlier detection method for hydrologic data that identifies values deviating from historical patterns. The method first builds a forecasting model on the historical data and then uses it to predict future values. Anomalies are assumed to occur when observed values fall outside a given prediction confidence interval (PCI), which is calculated from the predicted value and a confidence coefficient. The PCI is used as the threshold mainly because it accounts for the uncertainty in the data-series parameters of the forecasting model, which addresses the problem of selecting a suitable threshold. The method performs fast, incremental evaluation of data as they become available, scales to large quantities of data, and requires no preclassification of anomalies. Experiments with several real-world hydrologic time series showed that the proposed method is fast, correctly identifies abnormal data, and can be used for hydrologic time series analysis.

    Techniques for clustering gene expression data

    Many clustering techniques have been proposed for the analysis of gene expression data obtained from microarray experiments. However, choosing suitable method(s) for a given experimental dataset is not straightforward. Common approaches do not translate well and fail to take the data profile into account. This review surveys state-of-the-art applications that recognise these limitations and implement procedures to overcome them. It provides a framework for the evaluation of clustering in gene expression analyses. The nature of microarray data is discussed briefly, and selected examples are presented for the clustering methods considered.
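
    As a concrete instance of the simplest family of methods such surveys cover, here is a minimal k-means run on a made-up expression matrix containing two groups of co-expressed genes; the data and cluster count are synthetic assumptions, not from any real experiment.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Minimal k-means: rows of X are gene expression profiles."""
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Assign each gene to its nearest centre, then update the centres.
        dists = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centres[j] = members.mean(axis=0)
    return labels

# Two groups of five genes with opposite expression trends over 4 conditions.
rng = np.random.default_rng(1)
up = np.tile([0.0, 1.0, 2.0, 3.0], (5, 1))
down = np.tile([3.0, 2.0, 1.0, 0.0], (5, 1))
X = np.vstack([up, down]) + rng.normal(0.0, 0.1, (10, 4))
labels = kmeans(X, k=2)
```

    The difficulty the review addresses starts exactly here: on real microarray data the right k, distance measure, and preprocessing are rarely this obvious.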

    Spectrum Anomaly Detection for Optical Network Monitoring using Deep Unsupervised Learning

    Accurate and efficient anomaly detection is a key enabler for the cognitive management of optical networks, but traditional anomaly detection algorithms are computationally complex and do not scale well with the amount of monitoring data. We therefore propose an optical spectrum anomaly detection scheme that exploits computer vision and deep unsupervised learning to perform optical network monitoring relying only on constellation diagrams of received signals. The proposed scheme achieves 100% detection accuracy even without prior knowledge of the anomalies. Furthermore, operating on encoded images of the constellation diagrams reduces the runtime by a factor of up to 200.
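
    The idea can be sketched with a linear stand-in for the deep autoencoder: learn a low-dimensional reconstruction of "normal" constellation-diagram images only, and flag diagrams whose reconstruction error exceeds what was seen in training. Everything below (4-QAM symbols, noise levels, histogram encoding, bottleneck size) is an assumption for illustration, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def diagram(noise, n=2000, bins=16):
    """Toy constellation diagram: a normalised 2-D histogram of received
    4-QAM symbols in the I/Q plane, flattened into an image vector."""
    pts = rng.choice([-1.0, 1.0], size=(n, 2)) + rng.normal(0.0, noise, (n, 2))
    hist, _, _ = np.histogram2d(pts[:, 0], pts[:, 1],
                                bins=bins, range=[[-2, 2], [-2, 2]])
    return (hist / n).ravel()

# Train only on normal (low-noise) diagrams: no anomaly labels are needed.
train = np.stack([diagram(0.1) for _ in range(50)])
mean = train.mean(axis=0)
# Linear "autoencoder": encode/decode with the top principal components.
_, _, vt = np.linalg.svd(train - mean, full_matrices=False)
code = vt[:8]                                   # 8-dimensional bottleneck

def reconstruction_error(x):
    recon = mean + code.T @ (code @ (x - mean))  # encode, then decode
    return float(np.linalg.norm(x - recon))

threshold = 1.1 * max(reconstruction_error(x) for x in train)
anomalous = diagram(0.6)                        # heavily degraded signal
```

    A degraded signal reconstructs poorly from a basis learned on clean diagrams, so its error clears the threshold; a deep autoencoder applies the same principle with a nonlinear encoder and decoder.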

    Predicting Pilot Misperception of Runway Excursion Risk Through Machine Learning Algorithms of Recorded Flight Data

    The research used predictive models to determine pilot misperception of runway excursion risk associated with unstable approaches. The Federal Aviation Administration defined runway excursion as a veer-off or overrun of the runway surface. The Federal Aviation Administration also defined a stable approach as an aircraft meeting the following criteria: (a) on target approach airspeed, (b) correct attitude, (c) landing configuration, (d) nominal descent angle/rate, and (e) on a straight flight path to the runway touchdown zone. Continuing an unstable approach to landing was defined as Unstable Approach Risk Misperception in this research. A review of the literature revealed that an unstable approach followed by the failure to execute a rejected landing was a common contributing factor in runway excursions. Flight Data Recorder data were archived and made available by the National Aeronautics and Space Administration for public use. These data were collected over a four-year period from the flight data recorders of a fleet of 35 regional jets operating in the National Airspace System. The archived data were processed and explored for evidence of unstable approaches and to determine whether a rejected landing was executed. Once identified, those data revealing evidence of unstable approaches were processed for the purposes of building predictive models. SAS Enterprise Miner was used to explore the data, as well as to build and assess predictive models. The advanced machine learning algorithms utilized included: (a) support vector machine, (b) random forest, (c) gradient boosting, (d) decision tree, (e) logistic regression, and (f) neural network. The models were evaluated and compared to determine the best prediction model. Based on the model comparison, the decision tree model was determined to have the highest predictive value.
The Flight Data Recorder data were then analyzed to determine predictive accuracy of the target variable and to determine important predictors of the target variable, Unstable Approach Risk Misperception. Results of the study indicated that the predictive accuracy of the best performing model, the decision tree, was 99%. Findings indicated that six variables stood out in the prediction of Unstable Approach Risk Misperception: (1) glideslope deviation, (2) selected approach speed deviation, (3) localizer deviation, (4) flaps not extended, (5) drift angle, and (6) approach speed deviation. These variables are listed in order of importance based on the results of the decision tree predictive model analysis. The results of the study are of interest to aviation researchers as well as airline pilot training managers. It is suggested that the ability to predict the probability of pilot misperception of runway excursion risk could influence the development of new pilot simulator training scenarios and strategies. The research aids avionics providers in the development of predictive runway excursion alerting display technologies.
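
    The appeal of the winning tree model can be illustrated with a depth-1 tree (a decision stump) on a synthetic stand-in for the top-ranked predictor, glideslope deviation; the data, threshold, and "ground truth" rule below are invented for the example, not the study's.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for one recorded predictor: glideslope deviation in
# dots. The "ground truth" rule is invented: in this toy data a crew
# continues an unstable approach whenever |deviation| exceeds 1 dot.
glideslope_dev = rng.normal(0.0, 0.7, 500)
continued = (np.abs(glideslope_dev) > 1.0).astype(int)

def best_stump(x, y):
    """Fit a depth-1 decision tree on |x| by exhaustive threshold search.

    Returns the split threshold and its training accuracy: a single
    interpretable split is part of why tree models suit this kind of data.
    """
    best_t, best_acc = None, 0.0
    for t in np.unique(np.abs(x)):
        acc = float(((np.abs(x) > t).astype(int) == y).mean())
        if acc > best_acc:
            best_t, best_acc = float(t), acc
    return best_t, best_acc

threshold, accuracy = best_stump(glideslope_dev, continued)
```

    Full decision trees grow such splits recursively over many predictors, which is how variable-importance rankings like the six listed above are obtained.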

    Exploring the impact of data poisoning attacks on machine learning model reliability

    Recent years have seen the widespread adoption of Artificial Intelligence techniques in several domains, including healthcare, justice, assisted driving, and Natural Language Processing (NLP) based applications (e.g., fake news detection). These are just a few examples of domains that are particularly critical and sensitive to the reliability of the adopted machine learning systems. Several Artificial Intelligence approaches have accordingly been adopted to realize simple and reliable solutions aimed at improving early diagnosis, personalized treatment, remote patient monitoring, and decision-making, with a consequent reduction of healthcare costs. Recent studies have shown that these techniques are vulnerable to attacks by adversaries at various phases of the Artificial Intelligence workflow. Poisoned data sets are the most common attack on the reliability of Artificial Intelligence approaches; noise, for example, can have a significant impact on the overall performance of a machine learning model. This study examines how strongly noise impacts classification algorithms. In detail, we evaluated the ability of several machine learning techniques to correctly distinguish pathological from healthy voices in the presence of poisoned data. Voice samples from the Saarbruecken Voice Database, a database widely used in research, were processed and analysed to evaluate the resilience and classification accuracy of these techniques. All analyses are evaluated in terms of accuracy, specificity, sensitivity, F1-score, and ROC area.
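
    A toy label-flipping poisoning experiment makes the degradation concrete. The two-dimensional "voice features", the nearest-centroid classifier, and the flip rate below are all deliberate simplifications for illustration and do not mirror the paper's actual features or models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 2-D "voice features" for healthy (class 0) and pathological
# (class 1) samples; the feature values are invented for this example.
healthy = rng.normal([0.5, 1.0], 0.3, size=(100, 2))
pathological = rng.normal([1.5, 2.0], 0.3, size=(100, 2))
X = np.vstack([healthy, pathological])
y = np.r_[np.zeros(100), np.ones(100)]

def nearest_centroid_accuracy(X, y_train, y_true):
    """Train a nearest-centroid classifier on y_train, score against y_true."""
    c0 = X[y_train == 0].mean(axis=0)
    c1 = X[y_train == 1].mean(axis=0)
    pred = np.linalg.norm(X - c1, axis=1) < np.linalg.norm(X - c0, axis=1)
    return float((pred.astype(float) == y_true).mean())

clean_acc = nearest_centroid_accuracy(X, y, y)

# Poisoning: flip the labels of 70 healthy training samples to "pathological",
# dragging the class-1 centroid toward the healthy region.
y_poisoned = y.copy()
y_poisoned[rng.choice(100, size=70, replace=False)] = 1.0
poisoned_acc = nearest_centroid_accuracy(X, y_poisoned, y)
```

    The poisoned centroid shifts the decision boundary into the healthy region, so accuracy on the same data drops; the paper measures the analogous effect on real voice classifiers with the full metric set listed above.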

    A Proposal for a Three Detector Short-Baseline Neutrino Oscillation Program in the Fermilab Booster Neutrino Beam

    A Short-Baseline Neutrino (SBN) physics program of three LAr-TPC detectors located along the Booster Neutrino Beam (BNB) at Fermilab is presented. This new SBN Program will deliver a rich and compelling physics opportunity, including the ability to resolve a class of experimental anomalies in neutrino physics and to perform the most sensitive search to date for sterile neutrinos at the eV mass scale through both appearance and disappearance oscillation channels. Using data sets of 6.6e20 protons on target (P.O.T.) in the LAr1-ND and ICARUS T600 detectors plus 13.2e20 P.O.T. in the MicroBooNE detector, we estimate that a search for muon neutrino to electron neutrino appearance can be performed with ~5 sigma sensitivity over the LSND allowed (99% C.L.) parameter region. In this proposal for the SBN Program, we describe the physics analysis, the conceptual design of the LAr1-ND detector, the design and refurbishment of the T600 detector, the infrastructure required to execute the program, and a possible reconfiguration of the BNB target and horn system to improve its performance for oscillation searches.
    Comment: 209 pages, 129 figures
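
    For context (this is the standard two-flavour approximation, not a result of the proposal), the appearance probability searched for in a 3+1 sterile-neutrino scenario takes the short-baseline form

```latex
P_{\nu_\mu \rightarrow \nu_e} \simeq \sin^2(2\theta_{\mu e})\,
  \sin^2\!\left( \frac{1.27\,\Delta m^2_{41}\,[\mathrm{eV}^2]\; L\,[\mathrm{km}]}{E_\nu\,[\mathrm{GeV}]} \right)
```

    so sensitivity to a given mass splitting is set by the ratio L/E, which is why placing detectors at different baselines along the same beam constrains both oscillation parameters.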
    • 

    corecore