14 research outputs found

    Selective oversampling approach for strongly imbalanced data

    Imbalanced data pose challenges in many real-world applications. One approach to improving classifier performance on imbalanced data is oversampling. In this paper, we propose a new selective oversampling approach (SOA) that first isolates the most representative samples from the minority classes using an outlier detection technique and then uses these samples for synthetic oversampling. We show that the proposed approach improves the performance of two state-of-the-art oversampling methods, namely the synthetic minority oversampling technique (SMOTE) and adaptive synthetic sampling (ADASYN). Prediction performance is evaluated on four synthetic and four real-world datasets, and the proposed SOA methods always achieved the same or better performance than the other existing oversampling methods considered.
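
    The two stages described in this abstract can be sketched roughly as follows. This is not the authors' implementation: the outlier detection step is replaced here by a simple distance-to-centroid filter, the oversampling step is a simplified SMOTE-style interpolation toward the single nearest neighbor, and the names `select_representative`, `interpolate_oversample`, and `keep_fraction` are hypothetical.

```python
import random
from math import dist


def select_representative(samples, keep_fraction=0.8):
    """Keep the minority samples closest to the class centroid.

    Assumption: this centroid filter is only a stand-in for the outlier
    detection technique mentioned in the abstract.
    """
    dim = len(samples[0])
    centroid = tuple(sum(s[d] for s in samples) / len(samples) for d in range(dim))
    ranked = sorted(samples, key=lambda s: dist(s, centroid))
    keep = max(2, round(keep_fraction * len(samples)))
    return ranked[:keep]


def interpolate_oversample(samples, n_new, rng):
    """Create synthetic points on the segment between a sample and its nearest neighbor."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(samples)
        neighbor = min((s for s in samples if s is not base), key=lambda s: dist(s, base))
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


# Toy minority class with one obvious outlier at (5.0, 5.0).
minority = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.0), (5.0, 5.0)]
kept = select_representative(minority, keep_fraction=0.8)
synthetic = interpolate_oversample(kept, n_new=6, rng=random.Random(0))
```

    Because the outlier is removed before interpolation, none of the synthetic points is pulled toward it, which is the intuition behind filtering first and oversampling second.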

    Comparison of Filter Techniques for Two-Step Feature Selection

    In the last decade, processing high-dimensional data has become an inevitable task in many areas of research and daily life. Feature selection (FS), as part of the data processing methodology, is an important step in knowledge discovery. This paper proposes nine variations of a two-step feature selection approach, with filter FS employed in the first step and exhaustive search in the second. The performance of the proposed methods is compared in terms of stability and predictive performance. The results indicate that the choice of the filter FS in the first stage has a strong influence on the resulting stability. Here, the univariate FS method based on the Pearson correlation coefficient appears to provide the most stable results.
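
    A minimal sketch of the two-step idea, assuming a Pearson-correlation filter in the first step and an exhaustive subset search in the second. The function names and the scoring function `toy_score` are hypothetical placeholders; the paper's actual subset criterion is not given in this abstract, and in practice the score would typically be a cross-validated classifier metric.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def pearson_filter(columns, target, keep):
    """Step 1: rank features by |r| with the target and keep the top `keep`."""
    ranked = sorted(columns, key=lambda name: abs(pearson(columns[name], target)), reverse=True)
    return ranked[:keep]


def exhaustive_search(candidates, score):
    """Step 2: evaluate every non-empty subset of the filtered features."""
    best_subset, best_score = None, float("-inf")
    for size in range(1, len(candidates) + 1):
        for subset in combinations(candidates, size):
            value = score(subset)
            if value > best_score:
                best_subset, best_score = subset, value
    return best_subset, best_score


# Toy data: three features and a binary target.
columns = {
    "a": [1, 2, 3, 4, 5, 6],
    "b": [2, 1, 4, 3, 6, 5],
    "noise": [3, 1, 2, 3, 1, 2],
}
target = [0, 0, 0, 1, 1, 1]


def toy_score(subset):
    # Placeholder criterion: correlation of the summed features with the
    # target, minus a small penalty per selected feature.
    summed = [sum(columns[name][i] for name in subset) for i in range(len(target))]
    return abs(pearson(summed, target)) - 0.01 * len(subset)


filtered = pearson_filter(columns, target, keep=2)
best_subset, best_score = exhaustive_search(filtered, toy_score)
```

    The filter step keeps the exhaustive search tractable: with k retained features the second step evaluates 2^k - 1 subsets, so k has to stay small.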

    Predicting the Bankruptcy of Limited Liability Companies Using Outlier Detection Methods

    Companies operating in commercial and industrial sectors may, as a result of an adverse financial situation or inappropriate trading, run into financial difficulties that later lead to the bankruptcy of the company. We analyzed data containing thousands of records of limited liability companies (s.r.o.) operating in Slovakia in various sectors of the economy between 2013 and 2016. We treated the task as an outlier detection problem and used the support vector method for outlier detection (OneClassSVM). The data consisted of 20 standard economic indicators. In the initial analysis, we focused on predicting the bankruptcy of limited liability companies from the accounting data of a single year and from a combination of two consecutive years. The prediction accuracy achieved ranged from 60.56% to 77.91%, depending on the year in which the final state of the company was considered and the year from which the economic indicators were taken.

    Robustness of Interval Monge Matrices in Fuzzy Algebra

    Max–min algebra (also called fuzzy algebra) is an extremal algebra with the operations maximum and minimum. In this paper, we study the robustness of Monge matrices with inexact data over max–min algebra. A matrix with inexact data (also called an interval matrix) is a set of matrices given by a lower bound matrix and an upper bound matrix. An interval Monge matrix is the set of all Monge matrices from an interval matrix with Monge lower and upper bound matrices. There are two ways to define the robustness of an interval matrix: possible robustness, if the set contains at least one robust matrix, and universal robustness, if all matrices in the set are robust. We found necessary and sufficient conditions for universal robustness in cases where the lower bound matrix is trivial. Moreover, we proved necessary conditions for possible robustness and equivalent conditions for universal robustness in cases where the lower bound matrix is non-trivial.
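
    To make the two basic notions in this abstract concrete, the following sketch shows the max–min matrix product and a membership test for an interval matrix. It illustrates the definitions only, not the robustness conditions from the paper; the function names and the example matrices are assumptions chosen for illustration.

```python
def maxmin_product(a, b):
    """Max-min product: entry (i, j) is the maximum over k of min(a[i][k], b[k][j])."""
    return [
        [max(min(a[i][k], b[k][j]) for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def in_interval(matrix, lower, upper):
    """True if every entry lies between the matching lower and upper bound entries."""
    return all(
        lo <= x <= hi
        for row, lrow, urow in zip(matrix, lower, upper)
        for x, lo, hi in zip(row, lrow, urow)
    )


A = [[0.2, 0.8], [0.5, 0.1]]
A2 = maxmin_product(A, A)  # second max-min power of A

lower = [[0.1, 0.7], [0.4, 0.0]]
upper = [[0.3, 0.9], [0.6, 0.2]]
a_is_member = in_interval(A, lower, upper)
a2_is_member = in_interval(A2, lower, upper)
```

    Here `A` belongs to the interval matrix bounded by `lower` and `upper`, while its max–min square does not, since max–min powers only rearrange the entries already present in the matrix.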

    Bankruptcy prediction using ensemble of autoencoders optimized by genetic algorithm

    The prediction of imminent bankruptcy for a company is important to banks, government agencies, business owners, and other business stakeholders. Bankruptcy is influenced by many global and local factors, so it can hardly be anticipated without deeper analysis and knowledge of economic modeling. To make the problem even more challenging, the available bankruptcy datasets are usually imbalanced, since even in times of financial crisis bankrupt companies constitute only a fraction of all operating businesses. In this article, we propose a novel bankruptcy prediction approach based on a shallow autoencoder ensemble that is optimized by a genetic algorithm. The goal of the autoencoders is to learn the distribution of the majority class: going-concern businesses. Bankrupt companies are then represented by higher autoencoder reconstruction errors. The choice of the optimal threshold value for the reconstruction error, which is used to differentiate between bankrupt and non-bankrupt companies, is crucial and determines the final classification decision. In our approach, the threshold for each autoencoder is determined by a genetic algorithm. We evaluate the proposed method on four different datasets containing small and medium-sized enterprises. The results show that the autoencoder ensemble is able to identify bankrupt companies with geometric mean scores ranging from 71% to 93.7%, depending on the industry and evaluation year.
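
    The role of the threshold and the geometric mean score can be illustrated with a small sketch. It assumes the reconstruction errors have already been computed by some autoencoder, and it replaces the genetic algorithm with a plain scan over candidate thresholds; the function names and the toy numbers are hypothetical and are not results from the paper.

```python
from math import sqrt


def classify_by_error(errors, threshold):
    """Label a company as bankrupt (1) when its reconstruction error exceeds the threshold."""
    return [1 if e > threshold else 0 for e in errors]


def geometric_mean_score(y_true, y_pred):
    """Geometric mean of sensitivity and specificity."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sqrt(sensitivity * specificity)


def best_threshold(errors, y_true, candidates):
    """Stand-in for the genetic algorithm: pick the candidate with the highest G-mean."""
    return max(
        candidates,
        key=lambda t: geometric_mean_score(y_true, classify_by_error(errors, t)),
    )


# Toy reconstruction errors; label 1 marks a bankrupt company.
errors = [0.10, 0.20, 0.90, 0.80, 0.30, 0.15]
labels = [0, 0, 1, 1, 1, 0]
chosen = best_threshold(errors, labels, candidates=[0.05, 0.25, 0.50, 0.85])
score_at_half = geometric_mean_score(labels, classify_by_error(errors, 0.50))
```

    The geometric mean is a common choice for imbalanced data because it drops to zero whenever either class is entirely misclassified, unlike plain accuracy.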

    Machine Learning Approach to Dysphonia Detection

    This paper addresses the processing of speech data and their use in a decision support system. The main aim of this work is to use machine learning methods to recognize pathological speech, particularly dysphonia. We extracted 1560 speech features and used them to train the classification model. Three state-of-the-art classifiers were used: K-nearest neighbors, random forests, and support vector machine. We analyzed the performance of the classifiers with and without gender taken into account. The experimental results showed that it is possible to recognize pathological speech with a classification accuracy as high as 91.3%.

    Response To a Global Economic Crisis - Impacts on Private Sector

    This paper analyzes the impacts of governmental initiatives, taken in response to the global economic crisis, on the private sector in different countries. The macroeconomic and microeconomic background of the crisis is discussed at the beginning in order to understand its roots and consequences. This background is subsequently connected and analyzed with a focus on selected countries: the Czech Republic, Slovakia, and Australia. The impacts of the governmental initiatives on the private sector, together with key findings and recommendations, are summarised in the last chapter of the paper. Keywords: private sector, initiative, economy, Czech Republic, crisis