54 research outputs found

    Multivariate Adaptive Regression Splines (MARS) for Modeling Student Status at Universitas Terbuka

    Multivariate Adaptive Regression Splines (MARS) is used to model the status of active students in the Department of Statistics at Universitas Terbuka and to determine the factors that influence the response variable. The study considers 9 variables, namely gender, age, education, marital status, job, initial registration year, number of registrations, credits, and GPA. After modeling with the MARS method, the explanatory variables found to affect the response variable are the initial registration year, number of registrations, GPA, and credits. Based on the R output and a 95% confidence level, each of basis functions 1 to 10 is partially significant, with a p-value smaller than 0.05, and the basis functions are also simultaneously significant, with a p-value smaller than 0.05, so the model has a significant effect on the response variable both partially and simultaneously. From these results, it is concluded that the MARS model is suitable for determining the factors that affect the active status of students.
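As a rough illustration of what a MARS "basis function" is, the sketch below builds a prediction from hinge functions of the form max(0, x - knot) and max(0, knot - x). The knots, coefficients, intercept, and feature values are hypothetical and are not taken from the study, whose fitted model is not given in the abstract.

```python
def hinge(x, knot, direction=+1):
    """MARS hinge basis function.

    direction=+1 gives max(0, x - knot); direction=-1 gives max(0, knot - x).
    """
    return max(0.0, direction * (x - knot))


def mars_predict(x, intercept, terms):
    """Evaluate an additive MARS-style model.

    x     : dict mapping feature name -> numeric value
    terms : list of (coefficient, feature_name, knot, direction)
    """
    return intercept + sum(
        coef * hinge(x[name], knot, direction)
        for coef, name, knot, direction in terms
    )


# Hypothetical model with two basis functions (values invented for illustration).
terms = [
    (0.5, "gpa", 2.0, +1),         # 0.5  * max(0, gpa - 2.0)
    (-0.01, "credits", 100, -1),   # -0.01 * max(0, 100 - credits)
]
student = {"gpa": 3.0, "credits": 60}
score = mars_predict(student, intercept=0.2, terms=terms)
print(round(score, 3))
```

Each term is active only on one side of its knot, which is how MARS captures changes in slope; a real fit would choose the knots and coefficients from the data.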

    Model Penilaian Kredit Menggunakan Analisis Diskriminan dengan Variabel Bebas Campuran Biner dan Kontinu

    Credit scoring models are important tools in the credit-granting process. These models measure the credit risk of a prospective client. This study aims to apply a discriminant model with mixed predictor variables (binary and continuous) to credit assessment. The model is implemented using debtor characteristics data from a bank in Lampung Province, in which the binary variables are sex and marital status, while the continuous variables considered appropriate for the model are age, net income, and length of employment. On the training data, the misclassification rate of the model is 0.1970, and on the testing data it reaches 0.3753.

    CREDIT SCORING MENGGUNAKAN ALGORITMA CLASSIFICATION AND REGRESSION TREE (CART) DAN ARTIFICIAL BEE COLONY

    Credit scoring is a credit assessment process frequently carried out by financial institutions. Through this process, it is determined whether a prospective debtor applying for credit is classified as eligible for a loan or not. Errors in the credit scoring process ultimately result in losses for the financial institution. The most common error is an incorrect result from the credit scoring procedure itself. Classification and Regression Tree (CART) is one of the ten best algorithms for use in data mining. An advantage of this algorithm is that it can cope with noisy data, which occurs very often in financial data. The researchers apply the CART algorithm to credit scoring and then attempt to improve its accuracy through attribute (feature) selection using the Artificial Bee Colony (ABC) algorithm on a public dataset. A comparison is made to determine how much the accuracy percentage of the CART algorithm increases before and after attribute selection by ABC.

    The credit risk evaluation models: an application of data mining techniques

    In the banking sector, credit risk assessment is an important operation for ensuring that loans are paid on time and that banks maintain their credit performance effectively; despite the considerable business effort devoted to credit scoring every year, a high percentage of loan defaults remains a major issue. With the availability of large volumes of banking data and advanced analytics tools, classification algorithms from data mining can be applied to develop a credit scoring platform and to address the loan default problem. Using a dataset of 5,960 observations describing the characteristics of underlying-collateral loans, the paper sets out a data mining process to compare the performance of four classification algorithms: logistic regression, decision tree, neural network, and XGBoost. On confusion-matrix and Monte Carlo simulation benchmarks, XGBoost emerges as the most accurate and profitable model, and it shows high consistency regarding the major factors that could be attributed to default probability in credit scoring.

    Application of the CART Model to Classify the Perception of Young Canadian Teenagers on the Effect of Marijuana on Health

    The use of the illicit drug marijuana has increased over the years among young teenagers in different parts of the world, and its harm to health is generally well known. This paper studies the perception of young adolescents aged 13-15 residing in Canada towards the danger of marijuana to health. To do so, a classification and regression tree (CART) is applied to data from the 2012 National Anti-Drug Strategy (NADS) Youth Advertising Recall and Tracking Survey. The decision tree is fitted and pruned on a training data set (70%) and evaluated on a testing data set (30%). The results show that the main indicators affecting a teenager's perception of the harm marijuana does to health are the teenager's perception of psilocybin (another illicit drug), the province in which the teenager lives, and whether he or she has ever been offered drugs. The overall error rate on the testing data set, based on the confusion matrix, is less than 20%, and the area under the ROC curve is relatively high, showing that the model is accurate in classifying the perception of young teenagers of the harm marijuana does to health.
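The core CART step, choosing the split that minimises weighted Gini impurity on a training portion and then checking the error rate on a held-out portion, can be sketched as follows. This is a minimal single-split illustration on invented toy data: the feature names, values, and labels are hypothetical, the 70/30 split here simply takes the first 70% of rows, and there is no pruning, so it is not the model fitted in the paper.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_binary_split(rows, labels):
    """Return (feature_index, value, weighted_gini) for the best
    'feature == value' versus 'feature != value' split."""
    best = None
    n = len(rows)
    for j in range(len(rows[0])):
        for v in sorted({r[j] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[j] == v]
            right = [y for r, y in zip(rows, labels) if r[j] != v]
            if not left or not right:
                continue  # a split must send rows to both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (j, v, score)
    return best


def majority(labels):
    """Most common label in a list."""
    return Counter(labels).most_common(1)[0][0]


# Toy data (invented): columns are (ever_offered_drugs, province).
rows = [("no", "A"), ("yes", "A"), ("no", "B"), ("yes", "B"), ("no", "A"),
        ("yes", "B"), ("no", "B"), ("yes", "A"), ("no", "A"), ("yes", "B")]
labels = ["harmful" if r[0] == "no" else "not_harmful" for r in rows]

# Deterministic 70/30 split: first 7 rows for training, last 3 for testing.
cut = int(0.7 * len(rows))
train_rows, test_rows = rows[:cut], rows[cut:]
train_labels, test_labels = labels[:cut], labels[cut:]

feature, value, impurity = best_binary_split(train_rows, train_labels)

# One-split tree ("stump"): predict the majority training class on each side.
left_class = majority([y for r, y in zip(train_rows, train_labels) if r[feature] == value])
right_class = majority([y for r, y in zip(train_rows, train_labels) if r[feature] != value])
predictions = [left_class if r[feature] == value else right_class for r in test_rows]

test_error = sum(p != y for p, y in zip(predictions, test_labels)) / len(test_labels)
print(feature, value, impurity, test_error)
```

A full CART implementation would apply this split search recursively to each side and then prune the resulting tree; the test error plays the same role as the confusion-matrix error rate quoted in the abstract.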

    The Impacts of Machine Learning in Financial Crisis Prediction

    Financial crisis prediction (FCP) is among the most complicated and anticipated issues that corporate firms, small-scale businesses, investors, and even governments have to handle. It was therefore of interest to us to investigate the current impact of a newly employed technique, machine learning (ML), in handling this problem across all spheres of business, both private and public. The study uses a systematic literature review to examine the impact of ML on financial crisis prediction. From the selected literature, we establish the important role played by this method in predicting bankruptcy and creditworthiness, which was not handled appropriately by other methods. Machine learning also helps with data handling, data privacy, and confidentiality. This study presents a leading approach to achieving financial growth and plasticity in corporate organizations. We therefore recommend a real-time study to investigate the impact of ML on FCP.

    A Decision Tree Scoring Model Based on Genetic Algorithm and K-means Algorithm

    Credit scoring has been regarded as a critical topic and studied extensively in the finance field. Many artificial intelligence techniques have been applied to credit scoring. This paper builds a classification model based on a decision tree learned from historical data. A clustering algorithm and a genetic algorithm are combined to further improve the accuracy of this credit scoring model. The clustering algorithm aims to remove noisy data, while the genetic algorithm is used to reduce redundant attributes in the data. The computational results on two real-world benchmark data sets show that the presented hybrid model is efficient.

    The Application of Target Analysis in Electricity Demand-Side Management

    Recently, target analysis combined with database technology and data mining has been widely used in industries such as marketing, finance, insurance, telecommunications, advertising, and e-commerce. Because of the unique complexities of user behavior in electricity demand, examples of target analysis applications have yet to be seen. Considering the industry’s urgent need to enhance the efficiency of electricity demand-side management, this study aims to build a mining analysis model for potential target users of interruptible load that both fully reflects consumer behavior characteristics and serves as a rule for static comparisons. The results of a data mining analysis of the Taiwan Power Company (Taipower)’s interruptible loads 1 to 6 show that the number of potential target users is 1669, which is 21% of the original mining population. Additionally, the target users classified as having “the most potential” for all categories of interruptible load account for only 0.76% of the total mining population (= 59/7814), verifying the mining effects.
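The two percentages quoted in this abstract can be checked with a few lines of arithmetic. The sketch assumes that the "original mining population" behind the 21% figure is the same 7814 users given for the 0.76% figure; the abstract states the denominator explicitly only for the latter.

```python
total_population = 7814     # total mining population, as stated in the abstract
potential_users = 1669      # potential target users, as stated in the abstract
most_potential_users = 59   # users rated "most potential" in all categories


def share_percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Assumption: both shares use the same denominator of 7814.
potential_share = share_percent(potential_users, total_population)
most_potential_share = share_percent(most_potential_users, total_population)

print(f"{potential_share:.1f}%")       # share of potential target users
print(f"{most_potential_share:.2f}%")  # share of "most potential" users
```

Under that assumption the first share comes to roughly 21%, consistent with the figure in the abstract.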