1,243 research outputs found

    The Statistics of Bulk Segregant Analysis Using Next Generation Sequencing

    Get PDF
    We describe a statistical framework for QTL mapping using bulk segregant analysis (BSA) based on high throughput, short-read sequencing. Our proposed approach is based on a smoothed version of the standard statistic, and takes into account variation in allele frequency estimates due to sampling of segregants to form bulks as well as variation introduced during the sequencing of bulks. Using simulation, we explore the impact of key experimental variables such as bulk size and sequencing coverage on the ability to detect QTLs. Counterintuitively, we find that relatively large bulks maximize the power to detect QTLs even though this implies weaker selection and less extreme allele frequency differences. Our simulation studies suggest that with large bulks and sufficient sequencing depth, the methods we propose can be used to detect even weak effect QTLs and we demonstrate the utility of this framework by application to a BSA experiment in the budding yeast Saccharomyces cerevisiae

    Estimation of productivity in Korean electric power plants : a semiparametric smooth coefficient model

    Get PDF
    This paper analyzes the impact of load factor, facility and generator types on the productivity of Korean electric power plants. In order to capture important differences in the effect of load policy on power output, we use a semiparametric smooth coefficient (SPSC) model that allows us to model heterogeneous performances across power plants and over time by allowing underlying technologies to be heterogeneous. The SPSC model accommodates both continuous and discrete covariates. Various specification tests are conducted to assess the performance of the SPSC model. Using a unique generator level panel dataset spanning the period 1995–2006, we find that the impact of load factor, generator and facility types on power generation varies substantially in terms of magnitude and significance across different plant characteristics. The results have strong implications for generation policy in Korea as outlined in this study

    Machine Learning Approach for Risk-Based Inspection Screening Assessment

    Get PDF
    Risk-based inspection (RBI) screening assessment is used to identify equipment that makes a significant contribution to the system's total risk of failure (RoF), so that the RBI detailed assessment can focus on analyzing higher-risk equipment. Due to its qualitative nature and high dependency on sound engineering judgment, screening assessment is vulnerable to human biases and errors, and thus subject to output variability and threatens the integrity of the assets. This paper attempts to tackle these challenges by utilizing a machine learning approach to conduct screening assessment. A case study using a dataset of RBI assessment for oil and gas production and processing units is provided, to illustrate the development of an intelligent system, based on a machine learning model for performing RBI screening assessment. The best performing model achieves accuracy and precision of 92.33% and 84.58%, respectively. A comparative analysis between the performance of the intelligent system and the conventional assessment is performed to examine the benefits of applying the machine learning approach in the RBI screening assessment. The result shows that the application of the machine learning approach potentially improves the quality of the conventional RBI screening assessment output by reducing output variability and increasing accuracy and precision.acceptedVersio

    Rails Quality Data Modelling via Machine Learning-Based Paradigms

    Get PDF

    A Systematic Review for Transformer-based Long-term Series Forecasting

    Full text link
    The emergence of deep learning has yielded noteworthy advancements in time series forecasting (TSF). Transformer architectures, in particular, have witnessed broad utilization and adoption in TSF tasks. Transformers have proven to be the most successful solution to extract the semantic correlations among the elements within a long sequence. Various variants have enabled transformer architecture to effectively handle long-term time series forecasting (LTSF) tasks. In this article, we first present a comprehensive overview of transformer architectures and their subsequent enhancements developed to address various LTSF tasks. Then, we summarize the publicly available LTSF datasets and relevant evaluation metrics. Furthermore, we provide valuable insights into the best practices and techniques for effectively training transformers in the context of time-series analysis. Lastly, we propose potential research directions in this rapidly evolving field

    Projection pursuit random forest using discriminant feature analysis model for churners prediction in telecom industry

    Get PDF
    A major and demanding issue in the telecommunications industry is the prediction of churn customers. Churn describes the customer who is attrite from one Telecom service provider to competitors searching for better services offers. Companies from the Telco sector frequently have customer relationship management offices it is the main objective in how to win back defecting clients because preserve long-term customers can be much more beneficial to a company than gain newly recruited customers. Researchers and practitioners are paying great attention and investing more in developing a robust customer churn prediction model, especially in the telecommunication business by proposed numerous machine learning approaches. Many approaches of Classification are established, but the most effective in recent times is a tree-based method. The main contribution of this research is to predict churners/non-churners in the Telecom sector based on project pursuit Random Forest (PPForest) that uses discriminant feature analysis as a novelty extension of the conventional Random Forest approach for learning oblique Project Pursuit tree (PPtree). The proposed methodology leverages the advantage of two discriminant analysis methods to calculate the project index used in the construction of PPtree. The first method used Support Vector Machines (SVM) as a classifier in the construction of PPForest to differentiate between churners and non-churners customers. The second method is a Linear Discriminant Analysis (LDA) to achieve linear splitting of variables node during oblique PPtree construction to produce individual classifiers that are robust and more diverse than classical Random Forest. It found that the proposed methods enjoy the best performance measurements e.g. Accuracy, hit rate, ROC curve, Gini coefficient, Kolmogorov-Smirnov statistic and lift coefficient, H-measure, AUC. Moreover, PPForest based on direct applied of LDA on the raw data delivers an effective evaluator for the customer churn prediction model

    Model-Based Environmental Visual Perception for Humanoid Robots

    Get PDF
    The visual perception of a robot should answer two fundamental questions: What? and Where? In order to properly and efficiently reply to these questions, it is essential to establish a bidirectional coupling between the external stimuli and the internal representations. This coupling links the physical world with the inner abstraction models by sensor transformation, recognition, matching and optimization algorithms. The objective of this PhD is to establish this sensor-model coupling
    • …
    corecore