9 research outputs found

    Predictive Data Mining: Promising Future and Applications

    Predictive analytics is the branch of data mining concerned with the prediction of future probabilities and trends. The central element of predictive analytics is the predictor, a variable that can be measured for an individual or other entity to predict future behavior. For example, an insurance company is likely to take into account potential driving safety predictors such as age, gender, and driving record when issuing car insurance policies. Multiple predictors are combined into a predictive model, which, when subjected to analysis, can be used to forecast future probabilities with an acceptable level of reliability. In predictive modeling, data is collected, a statistical model is formulated, predictions are made, and the model is validated (or revised) as additional data becomes available. Predictive analytics is applied to many research areas, including meteorology, security, genetics, economics, and marketing. In this paper, we present an extensive study of various predictive techniques, and their future directions and applications in various areas are explained

    Comparative analysis of predictive modeling across key Domains: Insights and applications

    Prediction is widely used for various purposes and in many fields of human activity. The techniques employed for making predictions are a subject of great scientific interest within the research community due to their diversity, level of accuracy, and adaptability to data. The challenge is to determine the factors that affect the choice of an optimal technique suited to each prediction objective. In this article, we conduct a review of models used in the literature to make predictions in different domains to understand the factors influencing the selection of a specific predictive model in relation to their areas of study. A comparative analysis of prediction techniques such as statistical algorithms, Data Mining, and Machine Learning has been performed. It follows that the selection of an adequate prediction technique for the best decision-making should take into account the projection horizon, uncertainty around the prediction, data availability and reliability, and the associated cost of prediction

    Application of Machine Learning in Predicting Performance for Computer Engineering Students: A Case Study

    The present work proposes the application of machine learning techniques to predict the final grades (FGs) of students based on their historical performance of grades. The proposal was applied to the historical academic information available for students enrolled in the computer engineering degree at an Ecuadorian university. One of the aims of the university’s strategic plan is the development of a quality education that is intimately linked with sustainable development goals (SDGs). The application of technology in teaching–learning processes (Technology-enhanced learning) must become a key element to achieve the objective of academic quality and, as a consequence, enhance or benefit the common good. Today, both virtual and face-to-face educational models promote the application of information and communication technologies (ICT) in both teaching–learning processes and academic management processes. This implementation has generated an overload of data that needs to be processed properly in order to transform it into valuable information useful for all those involved in the field of education. Predicting a student’s performance from their historical grades is one of the most popular applications of educational data mining and, therefore, it has become a valuable source of information that has been used for different purposes. Nevertheless, several studies related to the prediction of academic grades have been developed exclusively for the benefit of teachers and educational administrators. Little or nothing has been done to show the results of the prediction of the grades to the students. Consequently, there is very little research related to solutions that help students make decisions based on their own historical grades. This paper proposes a methodology in which the process of data collection and pre-processing is initially carried out, and then in a second stage, the grouping of students with similar patterns of academic performance was carried out. 
In the next phase, based on the identified patterns, the most appropriate supervised learning algorithm was selected, and then the experimental process was carried out. Finally, the results were presented and analyzed. The results showed the effectiveness of machine learning techniques to predict the performance of students. This work was supported in part by the Spanish Ministry of Science, Innovation and Universities through the Project ECLIPSE-UA under Grant RTI2018-094283-B-C32

    The Effect of Green Software: A Study of Impact Factors on the Correctness of Software

    Unfortunately, sustainability is an issue very poorly used when developing software and hardware systems. Lately, and in order to contribute to the earth sustainability, a new concept emerged named Green software which is computer software that can be developed and used efficiently and effectively with minimal or no impact to the environment. Currently, new teaching methods based on students’ learning process are being developed in the European Higher Education Area. Most of them are oriented to promote students’ interest in the course’s contents and offer personalized feedback. Online judging is a promising method for encouraging students’ participation in the e-learning process, although it still has to be researched and developed to be widely used and in a more efficient way. The great amount of data available in an online judging tool provides the possibility of exploring some of the most indicative attributes (e.g., running time, memory) for learning programming concepts, techniques and languages. So far, the most applied methods for automatically gathering information from the judging systems are based on statistical methods and, although providing reasonable correlations, these methods have not been proven to provide enough information for predicting grades when dealing with a huge amount of data. Therefore, the great novelty of this paper is to develop a data mining approach to predict program correctness as well as the grades of the students’ practices. For this purpose, powerful data mining technologies taken from the artificial intelligence domain have been used. In particular, in this study, we have used logistic regression, decision trees, artificial neural network and support vector machines; which have been properly identified as the most suitable ones for predicting activities in the e-learning domains. 
The results have achieved an accuracy of around 74%, both in the prediction of the program correctness as well as in the practice grades’ prediction. Another relevant issue provided in this paper is a comparison among these four techniques to obtain the best accuracy in predicting grades based on the availability of data as well as their taxonomy. The Decision Trees classifier has obtained the best confusion matrix, and time and memory efficiency were identified as the most important predictor variables. In view of these results, we can conclude that the development of green software leads programmers to implement correct software. This work has been funded by the Spanish Ministry of Economy and Competitiveness (MINECO/FEDER) under the granted project SEQUOIA-UA (TIN2015-63502-C3-3-R), project GINSENG-UMU (TIN2015-70259-C2-2-R) supported by the Spanish Ministry of Economy, Industry and Competitiveness and European FEDER funds. This work has also been partially funded by University of Alicante, under project GRE14-10 and by the Generalitat Valenciana, Spain, under project GV/2016/087

    Data Mining para modelo predictivo de ventas y servicios de mantenimiento en un concesionario automotriz ligero

    Lately, the level of competition between companies in the light automotive industry has been very high, due to the various strategies developed by competitors. Our study seeks to strengthen the evaluation of forecasts to improve the organization's capability to anticipate future events in important business processes, such as sales and maintenance services. To achieve this objective, research related to Data Mining techniques, which analyze information with a predictive approach, was consulted. Our research involves designing different models applying methods such as regressions, neural networks and decision trees to a historical database of an automotive organization, after first selecting data using techniques such as the correlation matrix and PCA (Principal Component Analysis). Finally, an evaluation is carried out on the results obtained after comparing the proposed models, where we find that, for sales forecasts, the neural network model implemented with PCA obtains better results; whereas, for maintenance services forecasts, the predominant model is the one implemented with Random Forest

    Extracting information from manufacturing data using data mining methods

    EThOS - Electronic Theses Online Service, GB, United Kingdom

    Variable precision rough set theory decision support system: With an application to bank rating prediction

    This dissertation considers the Variable Precision Rough Sets (VPRS) model and its development within a comprehensive software package (decision support system), incorporating methods of resampling and classifier aggregation. The concept of β-reduct aggregation is introduced as a novel approach to classifier aggregation within the VPRS framework. The software is applied to the credit rating prediction problem; in particular, a full exposition of the prediction and classification of Fitch's Individual Bank Strength Ratings (FIBRs) for a number of banks from around the world is presented. The ethos of the developed software was to rely heavily on a simple 'point and click' interface, designed to make a VPRS analysis accessible to an analyst who is not necessarily an expert in the field of VPRS or decision rule based systems. The development of the software has also benefited from consultations with managers from one of Europe's leading hedge funds, who gave valuable insight, advice and recommendations on what they considered pertinent issues with regard to data mining, and what they would like to see from a modern data mining system. The elements within the developed software reflect each stage of the knowledge discovery process, namely pre-processing, feature selection, data mining, interpretation and evaluation. The developed software encompasses three software packages: a pre-processing package incorporating some of the latest pre-processing and feature selection methods; a VPRS data mining package, based on a novel "vein graph" interface, which presents the analyst with selectable β-reducts over the domain of β; and a third, more advanced VPRS data mining package, which essentially automates the vein graph interface for incorporation into a resampling environment, and also implements the introduced aggregated β-reduct, developed to optimise and stabilise the predictive accuracy of a set of decision rules induced from the aggregated β-reduct