31 research outputs found

    On the usage of the probability integral transform to reduce the complexity of multi-way fuzzy decision trees in Big Data classification problems

    We present a new distributed fuzzy partitioning method to reduce the complexity of multi-way fuzzy decision trees in Big Data classification problems. The proposed algorithm builds a fixed number of fuzzy sets for all variables and adjusts their shape and position to the real distribution of training data. A two-step process is applied: (1) transformation of the original distribution into a standard uniform distribution by means of the probability integral transform. Since the original distribution is generally unknown, the cumulative distribution function is approximated by computing the q-quantiles of the training set; (2) construction of a Ruspini strong fuzzy partition in the transformed attribute space using a fixed number of equally distributed triangular membership functions. Despite the aforementioned transformation, the definition of every fuzzy set in the original space can be recovered by applying the inverse cumulative distribution function (also known as the quantile function). The experimental results reveal that the proposed methodology allows the state-of-the-art multi-way fuzzy decision tree (FMDT) induction algorithm to maintain classification accuracy with up to 6 million fewer leaves. Comment: Appeared in 2018 IEEE International Congress on Big Data (BigData Congress). arXiv admin note: text overlap with arXiv:1902.0935
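
    The two-step process described in this abstract can be sketched roughly as follows. This is a minimal single-machine illustration, not the authors' distributed implementation: the function names, the default of 100 quantiles and the choice of 5 fuzzy sets are all assumptions made for the example, and the approximate CDF is taken to be a piecewise-linear interpolation through the training quantiles.

```python
import numpy as np

def fit_quantiles(train_values, q=100):
    """Approximate the unknown CDF by the q-quantiles of the training values."""
    probs = np.linspace(0.0, 1.0, q + 1)
    return probs, np.quantile(train_values, probs)

def to_uniform(x, probs, quantiles):
    """Probability integral transform: map x to (approximately) U(0, 1).

    Assumes the quantiles are increasing, which holds for continuous data.
    """
    return np.interp(x, quantiles, probs)

def triangular_memberships(u, n_sets=5):
    """Ruspini strong partition of [0, 1] with equally spaced triangles.

    Returns an array of shape (len(u), n_sets); each row sums to 1.
    """
    centers = np.linspace(0.0, 1.0, n_sets)
    width = centers[1] - centers[0]
    u = np.atleast_1d(u)[:, None]
    return np.clip(1.0 - np.abs(u - centers) / width, 0.0, 1.0)

def centers_in_original_space(probs, quantiles, n_sets=5):
    """Recover the fuzzy-set centers with the approximate quantile function."""
    centers = np.linspace(0.0, 1.0, n_sets)
    return np.interp(centers, probs, quantiles)

# Skewed synthetic attribute, used only to exercise the sketch.
rng = np.random.default_rng(0)
data = rng.exponential(scale=2.0, size=10_000)

probs, quantiles = fit_quantiles(data)
u = to_uniform(data, probs, quantiles)
memberships = triangular_memberships(u)
original_centers = centers_in_original_space(probs, quantiles)
```

    Because the partition is built in the transformed space, the centers mapped back to the original space are unevenly spaced and follow the density of the training data.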

    Financial Distress Prediction with Stacking Ensemble Learning

    Previous studies have used financial ratios extensively to build predictive models of financial distress. The Altman ratios are the most commonly used predictors, especially in academic studies. However, the Altman ratios depend heavily on the validity of the data in financial statements, so other variables are needed to assess the possibility that the statements have been manipulated. None of the previous studies combined the five Altman ratios with the Beneish M-Score. We use Stacking Ensemble Learning to classify companies in crisis and perform a comprehensive analysis. This insight helps the investing public make lending decisions by combining all the financial indicator information and assessing it carefully based on long-term and short-term conditions and possible manipulation of financial statements.

    Assessing Manual Dataset Creation For Xauusd Market Prediction : A Comparative Study Logistic Regression And Decision Tree Model

    This study aims to develop a simplified dataset for more effective market prediction, focusing on the Forex trading of XAUUSD (Gold/USD). The dataset was gathered from the TradingView platform, covering the period from March 4, 2023, to December 21, 2023. The data collection method involved intensive observation of daily and weekly charts, utilizing Daily and Weekly Moving Average (MA) indicators and the concept of breakout. The analysis focused on measuring the distance between the Daily MA at the beginning and end of the period (start and stop), and utilizing this data for entry strategy in the following three time periods. The trading strategy adopted involves the simultaneous use of Buy and Sell orders, with a Stop Loss (SL) to Take Profit (TP) ratio of 1:2. TP was adjusted to accommodate aggressive price movements, while SL remained constant. The collected data was meticulously recorded and stored in Excel format for further analysis. With the prepared dataset, this research applies two AI models, Logistic Regression and Decision Tree, to predict the best trading decision: Buy or Sell. The study aims not only to create a useful dataset for market prediction but also to compare the effectiveness of two different AI methods in the context of Forex trading of XAUUSD. The results are expected to provide insights into which model is more accurate and efficient in analyzing and predicting market trends, with practical implications for traders and market analysts.

    Financial-distress prediction of Islamic banks using tree-based stochastic techniques

    Purpose: Financial distress is a socially and economically important problem that affects companies the world over. Being able to better understand financial distress, and hence help keep businesses from failing, has the potential not only to save the company but also to prevent economies from sustained downturn. Although Islamic banks constitute a fraction of total banking assets, their importance has been increasing substantially, as their asset growth rate has surpassed that of conventional banks in recent years. The paper aims to discuss these issues. Design/methodology/approach: This paper uses a data set comprising 101 international publicly listed Islamic banks to work on advancing financial distress prediction (FDP) by utilising cutting-edge stochastic models, namely decision trees, stochastic gradient boosting and random forests. The most important variables pertaining to forecasting corporate failure are determined from an initial set of 18 variables. Findings: The results indicate that the “Working Capital/Total Assets” ratio is the most crucial variable relating to forecasting financial distress using both the traditional “Altman Z-Score” and the “Altman Z-Score for Service Firms” methods. However, using the “Standardised Profits” method, the “Return on Revenue” ratio was found to be the most important variable. This provides empirical evidence to support the recommendations made by Basel Accords for assessing a bank’s capital risks, specifically in relation to the application to Islamic banking. Originality/value: These findings provide a valuable addition to the limited literature surrounding Islamic banking in general, and FDP pertaining to Islamic banking in particular, by showcasing the most pertinent variables in forecasting financial distress so that appropriate proactive actions can be taken.

    Prediction business failure with logit and discriminant analysis: evidence from portuguese hospitality sector

    In the hospitality and tourism sector, the financial stability of companies has been an issue of major concern, both for economic actors and policymakers. The literature offers a wide and rich diversity of studies on the issue. Regarding Portugal, there are studies applied to different economic sectors. However, only a few have investigated the hospitality industry. The main goal of this paper is to develop econometric and multivariate models for forecasting business failure in the hospitality industry, using logit and discriminant analysis. The results show a high degree of fit of the model. The practical utility of these models is recognized by different users of accounting documents, particularly investors and creditors. These models also make an important contribution to the definition of macroeconomic policies and public funding programs for investment in the tourism sector.

    PREDVIĐANJE BANKROTA POMOĆU POLU-PARAMETARSKOG MODELA JEDINSTVENOG INDEKSA

    Semi-parametric methods are virtually neglected in the bankruptcy prediction literature. This paper compares the logit model, as the standard parametric model for bankruptcy prediction, to the semi-parametric model developed by Klein and Spady (1993). Special care is devoted to the effect of choice-based sampling on prediction accuracy. The choices of sampling method and estimation method lead to similar trade-offs. Using choice-based sampling and the logit model minimizes risk exposure. Samples unbalanced across groups and the semi-parametric method allow for better overall prediction accuracy and thus profit maximization.

    Selección de tutores académicos en la educación superior usando árboles de decisión

    In this paper, we present a method for improving the academic tutoring process in higher education. The method includes identifying the main skills of tutors in an automated manner using decision trees, one of the most used algorithms in the machine learning community for solving several real-world problems with high accuracy. In our study, the decision tree algorithm was able to identify those skills and the personal affinities between students and tutors. Experiments were carried out using a data set of 277 students and 19 tutors, who were selected by simple random sampling and voluntary participation, respectively. Preliminary results show that the most important attributes for tutors are communication, self-direction and digital skills. At the same time, we introduce a tutoring process where the tutor assignment is based on these attributes, assuming that it can help to strengthen the student skills demanded by today's society. In the same way, the decision tree obtained can be used to create clusters of tutors and clusters of students based on their personal abilities and affinities using other machine learning algorithms. The application of the suggested tutoring process could set the tone for viewing the tutoring process individually, without linking it to processes of academic performance or school dropout.

    Default prediction of Spanish companies. A logistic analysis

    License: Creative Commons Attribution-NonCommercial 3.0 Unported (CC BY-NC 3.0). In the field of credit risk management, the calculation of the probability of default of companies plays a key role. For that reason, bankruptcy prediction of companies has generated extensive research in the past decades. This paper applies one of the most popular techniques, logistic regression. This technique is extensively used both by professionals and academics and is employed in many studies as a benchmark. Here we apply it to a vast database of Spanish companies and undertake a statistical analysis of the robustness of the model, with very satisfactory results. Citation: Bartual Sanfeliu, C.; García García, F.; Guijarro Martínez, F.; Moya Clemente, I. (2013). Default prediction of Spanish companies. A logistic analysis. Intellectual economics. 7(3):333-343. doi:10.13165/IE-13-7-3-05

    A previsão do fracasso empresarial utilizando a análise discriminante e o logit no setor hoteleiro português

    Tourism is a rapidly expanding sector in Portugal and makes an important contribution to the Portuguese economy. The financial balance of companies in this sector is therefore of the utmost importance, both for economic agents and for policymakers. Despite this, there is no research in Portugal analysing business failure in the hotel sector. The main objective of this study is to propose a model for anticipating business failure, developed specifically for the Portuguese hotel sector. To that end, we use a sample of companies from the sector, to which we apply discriminant analysis and the logit technique. The results show a high degree of fit of the model to the data and indicate that the model makes an important contribution to the definition of macroeconomic policies and tourism development support programmes, and is equally relevant to the decisions of investors and creditors.