1,051 research outputs found

    GumDrop at the DISRPT2019 Shared Task: A Model Stacking Approach to Discourse Unit Segmentation and Connective Detection

    In this paper we present GumDrop, Georgetown University's entry at the DISRPT 2019 Shared Task on automatic discourse unit segmentation and connective detection. Our approach relies on model stacking, creating a heterogeneous ensemble of classifiers, which feed into a metalearner for each final task. The system encompasses three trainable component stacks: one for sentence splitting, one for discourse unit segmentation and one for connective detection. The flexibility of each ensemble allows the system to generalize well to datasets of different sizes and with varying levels of homogeneity. Comment: Proceedings of Discourse Relation Parsing and Treebanking (DISRPT2019)
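
The stacking idea in this abstract (heterogeneous classifiers feeding a metalearner) can be illustrated with a small sketch. Everything below is an assumption made for illustration: the three rule-based base classifiers, the perceptron metalearner, and the toy tokens are hypothetical and are not GumDrop's actual components.

```python
# Toy stacking sketch (hypothetical names; not GumDrop's actual components).
# Three heterogeneous base classifiers each guess whether a discourse unit
# starts at a token; a perceptron metalearner learns how to weigh them.

def rule_punct(tok):
    # Base classifier 1: fires when the previous token is punctuation.
    return 1.0 if tok["prev"] in {".", ",", ";"} else 0.0

def rule_connective(tok):
    # Base classifier 2: fires on a small, made-up connective list.
    return 1.0 if tok["word"].lower() in {"because", "but", "although"} else 0.0

def rule_capital(tok):
    # Base classifier 3: fires on capitalized words (a noisy cue).
    return 1.0 if tok["word"][:1].isupper() else 0.0

BASE_CLASSIFIERS = [rule_punct, rule_connective, rule_capital]

def meta_features(tok):
    # The metalearner only sees the base classifiers' outputs.
    return [clf(tok) for clf in BASE_CLASSIFIERS]

def score(weights, bias, feats):
    return sum(w * f for w, f in zip(weights, feats)) + bias

def train_metalearner(tokens, labels, max_epochs=1000):
    # Plain perceptron: update only on mistakes, stop after a clean epoch.
    weights, bias = [0.0] * len(BASE_CLASSIFIERS), 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for tok, y in zip(tokens, labels):
            feats = meta_features(tok)
            pred = 1 if score(weights, bias, feats) > 0 else 0
            if pred != y:
                mistakes += 1
                weights = [w + (y - pred) * f for w, f in zip(weights, feats)]
                bias += y - pred
        if mistakes == 0:
            break
    return weights, bias

def predict(model, tok):
    weights, bias = model
    return 1 if score(weights, bias, meta_features(tok)) > 0 else 0

train_tokens = [
    {"prev": ".", "word": "The"},
    {"prev": "the", "word": "cat"},
    {"prev": ",", "word": "because"},
    {"prev": "sat", "word": "Mary"},
    {"prev": "it", "word": "but"},
    {"prev": "on", "word": "mat"},
]
train_labels = [1, 0, 1, 0, 1, 0]
model = train_metalearner(train_tokens, train_labels)
```

On this toy data the metalearner learns to discount the noisy capitalization cue. With trainable base classifiers, the metalearner would normally be fit on out-of-fold base predictions to avoid leakage; the fixed rules used here sidestep that step.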

    A Literature Review of Fault Diagnosis Based on Ensemble Learning

    Fault diagnosis accuracy is an important indicator of the reliability of key equipment systems. Ensemble learning integrates different weak learning methods to obtain stronger learning and has achieved remarkable results in the field of fault diagnosis. This paper reviews recent research on ensemble learning from both technical and field application perspectives. It covers a total of 209 papers from 87 journals in Web of Science and other academic resources. It summarizes 78 different ensemble learning based fault diagnosis methods, involving 18 public datasets and more than 20 different equipment systems. In detail, the paper summarizes the accuracy rates, fault classification types, fault datasets, data signals used, learners (traditional machine learning or deep learning-based learners), and ensemble learning methods (bagging, boosting, stacking and other ensemble models) of these fault diagnosis models. The paper uses fault diagnosis accuracy as the main evaluation metric, supplemented by generalization and imbalanced data processing ability, to evaluate the performance of those ensemble learning methods. The discussion and evaluation of these methods provide valuable research references for identifying and developing appropriate intelligent fault diagnosis models for various equipment. This paper also discusses the technical challenges, lessons learned from the review and future development directions in the field of ensemble learning based fault diagnosis and intelligent maintenance.

    Prediction of Rebound Amount in Dry Mix Shotcrete by a Fast Adaboosting Neural Network

    This study proposes a new machine learning approach, based on ensemble learning, to predict rebound, which causes loss of material in shotcrete. The amount of rebound material was measured during shotcrete application to build a dataset. Shotcrete mixes containing fly ash, silica fume, and polypropylene fiber additives were produced in addition to plain shotcrete. Each mix was sprayed onto 2 wooden panels measuring 45 × 45 × 15 cm. The rebound material resulting from the spraying process was collected, weighed and recorded as data. The highest rebound was observed for the plain sample and the lowest for samples with substituted silica fume. Dependent and independent parameters were identified in the dataset produced by the experimental studies. Hyperparameters producing optimum results in training were identified for the model and the boosting method. The dataset was split into training and testing sets of 80% and 20%, respectively. As a result, the model achieved a prediction performance of 84.25%. To test the performance of the proposed model, traditional machine learning algorithms were compared on the same dataset. The proposed model was observed to have the highest accuracy.
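
For readers unfamiliar with boosting, the following is a minimal sketch of discrete AdaBoost. It deliberately swaps in one-dimensional decision stumps and a binary classification toy problem in place of the paper's neural network and rebound data, which the abstract does not describe in enough detail to reproduce. All function names and the toy data are hypothetical.

```python
import math

# Discrete AdaBoost with 1-D decision stumps (illustrative stand-in; the
# paper boosts a neural network, which is not reproduced here).

def stump_predict(x, thr, sign):
    # Predict `sign` above the threshold and `-sign` at or below it.
    return sign if x > thr else -sign

def best_stump(xs, ys, w):
    # Exhaustively pick the stump with the lowest weighted error.
    best = None
    for thr in sorted(set(xs)):
        for sign in (1, -1):
            err = sum(wi for xi, yi, wi in zip(xs, ys, w)
                      if stump_predict(xi, thr, sign) != yi)
            if best is None or err < best[0]:
                best = (err, thr, sign)
    return best

def adaboost(xs, ys, rounds=3):
    n = len(xs)
    w = [1.0 / n] * n                      # start with uniform sample weights
    model = []
    for _ in range(rounds):
        err, thr, sign = best_stump(xs, ys, w)
        err = max(err, 1e-10)              # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, thr, sign))
        # Up-weight misclassified samples, down-weight correct ones.
        w = [wi * math.exp(-alpha * yi * stump_predict(xi, thr, sign))
             for xi, yi, wi in zip(xs, ys, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, x):
    score = sum(alpha * stump_predict(x, thr, sign) for alpha, thr, sign in model)
    return 1 if score >= 0 else -1

# Toy labels that no single stump can fit: +1 at both ends, -1 in the middle.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
model = adaboost(xs, ys, rounds=3)
```

No single stump separates this data, but the weighted vote of three stumps classifies all six training points correctly, which is the core idea behind combining weak learners by boosting.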

    Simulating soil salinity dynamics, cotton yield and evapotranspiration under drip irrigation by ensemble machine learning

    We thank the China Scholarship Council (CSC) for providing a scholarship (202206710073) to Zewei Jiang. This work was supported by the Fundamental Research Funds for the Central Universities (B220203009), the Postgraduate Research & Practice Program of Jiangsu Province (KYCX22_0669), the Water Conservancy Science and Technology Project of Jiangxi Province (201921ZDKT06, 202124ZDKT09), the National Natural Science Foundation of China (51879076), the Fundamental Research Funds for the Central Universities (B210204016), Science & Technology Specific Projects in Agricultural High-tech Industrial Demonstration Area of the Yellow River Delta, Grant No: 2022SZX01. Peer reviewed

    Empirical evaluation of optimized stacking configurations

    Proceeding of: 16th IEEE International Conference on Tools with Artificial Intelligence, 15-17 Nov. 2004, Boca Ratón, Florida. Stacking is one of the most used techniques for combining classifiers and improves prediction accuracy. Early research in stacking showed that selecting the right classifiers, their parameters and the metaclassifiers was the main bottleneck for its use. Most of the research on this topic selects the right combination of classifiers and their parameters by hand. Instead of starting from these initial strong assumptions, our approach uses genetic algorithms to search for good stacking configurations. Since this can lead to overfitting, one of the goals of this work is to evaluate empirically the overall efficiency of the approach. A second goal is to compare our approach with the current best techniques for building stacking systems. The results show that our approach finds stacking configurations that, in the worst case, perform as well as the best techniques, with the advantage of not having to set up the structure of the stacking system manually.
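
A genetic search over stacking configurations, as described in this abstract, might be sketched as follows. This is a minimal illustration under stated assumptions: the classifier pools, the encoding (a bit mask over base classifiers plus a metaclassifier index), the operators, and `toy_fitness` are all hypothetical. In a real setting the fitness would be a validation estimate of the stacked system's accuracy, which is where the overfitting risk mentioned above arises.

```python
import random

# Hypothetical pools of candidate learners; names are placeholders only.
BASE_POOL = ["naive_bayes", "decision_tree", "knn", "rule_learner"]
META_POOL = ["naive_bayes", "linear"]

def random_config(rng):
    # A configuration = (which base classifiers are used, which metaclassifier).
    return ([rng.random() < 0.5 for _ in BASE_POOL], rng.randrange(len(META_POOL)))

def mutate(cfg, rng, rate=0.2):
    bits, meta = cfg
    bits = [(not b) if rng.random() < rate else b for b in bits]
    if rng.random() < rate:
        meta = rng.randrange(len(META_POOL))
    return (bits, meta)

def crossover(a, b, rng):
    # One-point crossover on the bit mask; metaclassifier from either parent.
    cut = rng.randrange(1, len(BASE_POOL))
    return (a[0][:cut] + b[0][cut:], rng.choice([a[1], b[1]]))

def evolve(fitness, generations=20, pop_size=10, seed=0):
    rng = random.Random(seed)
    pop = [random_config(rng) for _ in range(pop_size)]
    history = []                            # best fitness per generation
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        elite = pop[: pop_size // 2]        # elitism: the top half survives
        children = [mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rng)
                    for _ in range(pop_size - len(elite))]
        pop = elite + children
    return max(pop, key=fitness), history

def toy_fitness(cfg):
    # Placeholder objective (NOT a real accuracy): prefers exactly two base
    # classifiers and the second metaclassifier.
    bits, meta = cfg
    return -abs(sum(bits) - 2) + 0.1 * meta

best, history = evolve(toy_fitness)
```

Because the top half of each generation is carried over unchanged, the best fitness never decreases from one generation to the next. Replacing `toy_fitness` with a cross-validated score would turn the sketch into an actual configuration search.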

    Data sparsity in highly inflected languages: the case of morphosyntactic tagging in Polish

    In morphologically complex languages, many high-level tasks in natural language processing rely on accurate morphosyntactic analyses of the input. However, in light of the risk of error propagation in present-day pipeline architectures for basic linguistic pre-processing, the state of the art for morphosyntactic tagging is still not satisfactory. The main obstacle here is data sparsity inherent to natural language in general and highly inflected languages in particular. In this work, we investigate whether semi-supervised systems may alleviate the data sparsity problem. Our approach uses word clusters obtained from large amounts of unlabelled text in an unsupervised manner in order to provide a supervised probabilistic tagger with morphologically informed features. Our evaluations on a number of datasets for the Polish language suggest that this simple technique improves tagging accuracy, especially with regard to out-of-vocabulary words. This may prove useful to increase cross-domain performance of taggers, and to alleviate the dependency on large amounts of supervised training data, which is especially important from the perspective of less-resourced languages.
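
The way cluster identifiers can enter a supervised tagger as features is sketched below. The cluster table, the cluster IDs, and the feature names are hypothetical; the abstract does not specify the clustering algorithm or the tagger's feature templates.

```python
# Hypothetical cluster IDs, standing in for clusters induced from a large
# unlabelled corpus. Words with similar distributions share an ID.
WORD_CLUSTERS = {
    "kot": "C08", "pies": "C08",      # made-up cluster of base noun forms
    "kota": "C17", "psa": "C17",      # made-up cluster of inflected forms
    "widzę": "C42", "lubię": "C42",   # made-up cluster of verb forms
}

def tagger_features(tokens, i, clusters=WORD_CLUSTERS):
    """Feature dict for token i, as a supervised tagger might consume it."""
    word = tokens[i].lower()
    prev = tokens[i - 1].lower() if i > 0 else "<s>"
    return {
        "word": word,
        "suffix2": word[-2:],
        # A word absent from the labelled training data can still receive
        # an informative cluster feature learned from unlabelled text.
        "cluster": clusters.get(word, "UNK"),
        "prev_cluster": clusters.get(prev, "UNK"),
    }

feats = tagger_features(["Widzę", "psa"], 1)
```

If "psa" never occurred in the labelled data, the `word` feature would be uninformative, but the `cluster` feature would still tie it to "kota", which may have been seen with a tag. That is the intuition behind the reported gains on out-of-vocabulary words.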

    Spartan Face Mask Detection and Facial Recognition System

    According to the World Health Organization (WHO), wearing a face mask is one of the most effective protections from airborne infectious diseases such as COVID-19. Since the spread of COVID-19, affected countries have been enforcing strict mask regulations for indoor businesses and public spaces. While wearing a mask is a requirement, the position and type of the mask should also be considered in order to increase the effectiveness of face masks, especially at specific public locations. However, masks make it difficult for conventional facial recognition technology to identify individuals for security checks. To solve this problem, the Spartan Face Detection and Facial Recognition System with stacking ensemble deep learning algorithms is proposed to cover four major issues: Mask Detection, Mask Type Classification, Mask Position Classification and Identity Recognition. CNN, AlexNet, VGG16, and a Facial Recognition Pipeline with FaceNet are the deep learning algorithms used to classify the features in each scenario. The system is powered by five components: training platform, server, supporting frameworks, hardware, and user interface. Complete unit tests, use cases, and results analytics are used to evaluate and monitor the performance of the system. The system provides enterprises and schools with cost-efficient solutions for face detection and facial recognition with masks that can be easily deployed on edge devices.