6 research outputs found

    Modified balanced random forest for improving imbalanced data prediction

    Get PDF
    This paper proposes a Modified Balanced Random Forest (MBRF) algorithm as a classification technique to address imbalanced data. The MBRF process changes the process in a Balanced Random Forest by applying an under-sampling strategy based on clustering techniques for each data bootstrap decision tree in the Random Forest algorithm. To find the optimal performance of our proposed method compared with four clustering techniques, like: K-MEANS, Spectral Clustering, Agglomerative Clustering, and Ward Hierarchical Clustering. The experimental result show the Ward Hierarchical Clustering Technique achieved optimal performance, also the proposed MBRF method yielded better performance compared to the Balanced Random Forest (BRF) and Random Forest (RF) algorithms, with a sensitivity value or true positive rate (TPR) of 93.42%, a specificity or true negative rate (TNR) of 93.60%, and the best AUC accuracy value of 93.51%. Moreover, MBRF also reduced process running time

    A SLR on Customer Dropout Prediction

    Get PDF
    Dropout prediction is a problem that is being addressed with machine learning algorithms; thus, appropriate approaches to address the dropout rate are needed. The selection of an algorithm to predict the dropout rate is only one problem to be addressed. Other aspects should also be considered, such as which features should be selected and how to measure accuracy while considering whether the features are appropriate according to the business context in which they are employed. To solve these questions, the goal of this paper is to develop a systematic literature review to evaluate the development of existing studies and to predict the dropout rate in contractual settings using machine learning to identify current trends and research opportunities. The results of this study identify trends in the use of machine learning algorithms in different business areas and in the adoption of machine learning algorithms, including which metrics are being adopted and what features are being applied. Finally, some research opportunities and gaps that could be explored in future research are presented.info:eu-repo/semantics/publishedVersio

    A SLR on Customer Dropout Prediction

    Get PDF
    Dropout prediction is a problem that is being addressed with machine learning algorithms; thus, appropriate approaches to address the dropout rate are needed. The selection of an algorithm to predict the dropout rate is only one problem to be addressed. Other aspects should also be considered, such as which features should be selected and how to measure accuracy while considering whether the features are appropriate according to the business context in which they are employed. To solve these questions, the goal of this paper is to develop a systematic literature review to evaluate the development of existing studies and to predict the dropout rate in contractual settings using machine learning to identify current trends and research opportunities. The results of this study identify trends in the use of machine learning algorithms in different business areas and in the adoption of machine learning algorithms, including which metrics are being adopted and what features are being applied. Finally, some research opportunities and gaps that could be explored in future research are presented.info:eu-repo/semantics/publishedVersio

    Oversampling technique in student performance classification from engineering course

    Get PDF
    The first year of an engineering student was important to take proper academic planning. All subjects in the first year were essential for an engineering basis. Student performance prediction helped academics improve their performance better. Students checked performance by themselves. If they were aware that their performance are low, then they could make some improvement for their better performance. This research focused on combining the oversampling minority class data with various kinds of classifier models. Oversampling techniques were SMOTE, Borderline-SMOTE, SVMSMOTE, and ADASYN and four classifiers were applied using MLP, gradient boosting, AdaBoost and random forest in this research. The results represented that Borderline-SMOTE gave the best result for minority class prediction with several classifiers

    Optimization of Imminent Labor Prediction Systems in Women with Threatened Preterm Labor Based on Electrohysterography

    Full text link
    [EN] Preterm birth is the leading cause of death in newborns and the survivors are prone to health complications. Threatened preterm labor (TPL) is the most common cause of hospitalization in the second half of pregnancy. The current methods used in clinical practice to diagnose preterm labor, the Bishop score or cervical length, have high negative predictive values but not positive ones. In this work we analyzed the performance of computationally efficient classification algorithms, based on electrohysterographic recordings (EHG), such as random forest (RF), extreme learning machine (ELM) and K-nearest neighbors (KNN) for imminent labor (<7 days) prediction in women with TPL, using the 50th or 10th-90th percentiles of temporal, spectral and nonlinear EHG parameters with and without obstetric data inputs. Two criteria were assessed for the classifier design: F1-score and sensitivity. RFF1_2 and ELMF1_2 provided the highest F1-score values in the validation dataset, (88.17 +/- 8.34% and 90.2 +/- 4.43%) with the 50th percentile of EHG and obstetric inputs. ELMF1_2 outperformed RFF1_2 in sensitivity, being similar to those of ELMSens (sensitivity optimization). The 10th-90th percentiles did not provide a significant improvement over the 50th percentile. KNN performance was highly sensitive to the input dataset, with a high generalization capability.This work was supported by the Spanish Ministry of Economy and Competitiveness, the European Regional Development Fund (MCIU/AEI/FEDER, UE RTI2018-094449-A-I00-AR); by the Generalitat Valenciana (AICO/2019/220).Prats-Boluda, G.; Pastor-Tronch, J.; Garcia-Casado, J.; Monfort-Ortiz, R.; Perales Marín, A.; Diago, V.; Roca Prats, A.... (2021). Optimization of Imminent Labor Prediction Systems in Women with Threatened Preterm Labor Based on Electrohysterography. Sensors. 21(7):1-18. https://doi.org/10.3390/s21072496S11821

    Políticas de Copyright de Publicações Científicas em Repositórios Institucionais: O Caso do INESC TEC

    Get PDF
    A progressiva transformação das práticas científicas, impulsionada pelo desenvolvimento das novas Tecnologias de Informação e Comunicação (TIC), têm possibilitado aumentar o acesso à informação, caminhando gradualmente para uma abertura do ciclo de pesquisa. Isto permitirá resolver a longo prazo uma adversidade que se tem colocado aos investigadores, que passa pela existência de barreiras que limitam as condições de acesso, sejam estas geográficas ou financeiras. Apesar da produção científica ser dominada, maioritariamente, por grandes editoras comerciais, estando sujeita às regras por estas impostas, o Movimento do Acesso Aberto cuja primeira declaração pública, a Declaração de Budapeste (BOAI), é de 2002, vem propor alterações significativas que beneficiam os autores e os leitores. Este Movimento vem a ganhar importância em Portugal desde 2003, com a constituição do primeiro repositório institucional a nível nacional. Os repositórios institucionais surgiram como uma ferramenta de divulgação da produção científica de uma instituição, com o intuito de permitir abrir aos resultados da investigação, quer antes da publicação e do próprio processo de arbitragem (preprint), quer depois (postprint), e, consequentemente, aumentar a visibilidade do trabalho desenvolvido por um investigador e a respetiva instituição. O estudo apresentado, que passou por uma análise das políticas de copyright das publicações científicas mais relevantes do INESC TEC, permitiu não só perceber que as editoras adotam cada vez mais políticas que possibilitam o auto-arquivo das publicações em repositórios institucionais, como também que existe todo um trabalho de sensibilização a percorrer, não só para os investigadores, como para a instituição e toda a sociedade. A produção de um conjunto de recomendações, que passam pela implementação de uma política institucional que incentive o auto-arquivo das publicações desenvolvidas no âmbito institucional no repositório, serve como mote para uma maior valorização da produção científica do INESC TEC.The progressive transformation of scientific practices, driven by the development of new Information and Communication Technologies (ICT), which made it possible to increase access to information, gradually moving towards an opening of the research cycle. This opening makes it possible to resolve, in the long term, the adversity that has been placed on researchers, which involves the existence of barriers that limit access conditions, whether geographical or financial. Although large commercial publishers predominantly dominate scientific production and subject it to the rules imposed by them, the Open Access movement whose first public declaration, the Budapest Declaration (BOAI), was in 2002, proposes significant changes that benefit the authors and the readers. This Movement has gained importance in Portugal since 2003, with the constitution of the first institutional repository at the national level. Institutional repositories have emerged as a tool for disseminating the scientific production of an institution to open the results of the research, both before publication and the preprint process and postprint, increase the visibility of work done by an investigator and his or her institution. The present study, which underwent an analysis of the copyright policies of INESC TEC most relevant scientific publications, allowed not only to realize that publishers are increasingly adopting policies that make it possible to self-archive publications in institutional repositories, all the work of raising awareness, not only for researchers but also for the institution and the whole society. The production of a set of recommendations, which go through the implementation of an institutional policy that encourages the self-archiving of the publications developed in the institutional scope in the repository, serves as a motto for a greater appreciation of the scientific production of INESC TEC
    corecore