    Student Attrition Prediction Using Machine Learning Techniques

    In educational systems, student course enrollment is a fundamental performance metric for academic and financial sustainability. In many higher institutions today, student attrition is driven by a variety of circumstances, including demographic and personal factors such as age, gender, academic background, financial ability, and choice of academic degree. In this study, machine learning approaches were used to develop models that predict the attrition rate of students pursuing a computer science degree and identify students at high risk of dropping out before graduation. This can help higher education institutions develop proper intervention plans to reduce attrition rates and increase the probability of student academic success. Student data were collected from the Federal University Lokoja (FUL), Nigeria. The data were preprocessed using existing Weka machine learning libraries: the data were converted into Attribute-Relation File Format (ARFF), and resampling techniques were used to partition the data into a training set and a testing set. Correlation-based feature selection was applied, and the selected features were used to develop the student attrition model and to identify students’ risk of dropping out. Random forest and random tree machine learning algorithms were used to predict student attrition. The results showed that the random forest had an accuracy of 79.45%, while the random tree's accuracy was 78.09%. This is an improvement over previous results, where accuracies of 66.14% and 57.48% were recorded for random forest and random tree respectively. The improvement resulted from the techniques used. It is therefore recommended that these techniques be applied to classification models to improve their performance

    Measuring Students’ Performance with Data Mining

    Understanding the true reasons behind students’ failure, and introducing preventive measures at an early stage, is invaluable in the educational learning process. Preventing problems such as language deficiency or the placement of students in the wrong academic level is essential for any educational institution. Many factors influence students’ learning, such as demographic characteristics, educational background, and language barriers. This work highlights the most significant factors affecting students’ advancement in the learning process and provides support to academic administrators. It applies several state-of-the-art classification and regression algorithms to the problem of predicting students’ progress. Datasets were filtered and used to train predictive algorithms. It is shown that science learning and English language skills are highly correlated. Datasets are not always suitable for data mining unless they are preprocessed and well adapted to the context being studied. A tool has been developed to preprocess the provided data and feed it into the Weka data mining software to profile students’ performance

    Using Admissions Data to Create a First-Semester Academic Success Model

    Higher education is attracting more students from diverse backgrounds, especially at public community colleges. These institutions can help these students attain a quality education at a reasonable price. Unfortunately, community colleges have lower graduation rates than 4-year institutions, in part due to the diverse needs and varying academic preparedness of their populations. It can be difficult to identify the students most at risk of performing poorly until it is too late. There are multiple ways to predict students’ performance. In this study, three common data mining techniques are compared for their accuracy in predicting academic success using only data collected at the point of admission. Accurate early prediction can allow academic support professionals to intervene and provide intrusive assistance. A neural network model was found to be more accurate than logistic regression and decision tree models. Moreover, high school GPA, age, and sex were the most important factors in the neural network model

    Applying Academic Analytics: Developing a Process for Utilizing Bayesian Networks to Predict Stopping Out Among Community College Students

    Many methodological approaches have been utilized to predict student retention and persistence over the years, yet few have utilized a Bayesian framework. It is believed this is due in part to the absence of an established process for guiding educational researchers reared in a frequentist perspective into the realms of Bayesian analysis and educational data mining. The current study aimed to address this by providing a model-building process for developing a Bayesian network (BN) that leveraged educational data mining, Bayesian analysis, and traditional iterative model-building techniques in order to predict whether community college students will stop out at the completion of each of their first six terms. The study utilized exploratory and confirmatory techniques to reduce an initial pool of more than 50 potential predictor variables to a parsimonious final BN with only four predictor variables. The average in-sample classification accuracy rate for the model was 80% (Cohen's κ = 53%). The model was shown to be generalizable across samples with an average out-of-sample classification accuracy rate of 78% (Cohen's κ = 49%). The classification rates for the BN were also found to be superior to the classification rates produced by an analog frequentist discrete-time survival analysis model. Dissertation/Thesis, Doctoral Dissertation, Educational Psychology 201
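
    The abstract above reports Cohen's κ alongside classification accuracy. As a rough illustration of how the two relate, the sketch below computes both from a confusion matrix. The counts are invented for the example and are not the study's data; the function name `cohens_kappa` is likewise only illustrative.

```python
def cohens_kappa(confusion):
    """Cohen's kappa for a square confusion matrix.

    Rows are actual classes, columns are predicted classes.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: share of cases on the diagonal (plain accuracy).
    observed = sum(confusion[i][i] for i in range(k)) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical counts (assumption): rows = actual [persisted, stopped out],
# columns = predicted [persisted, stopped out].
confusion = [[50, 10],
             [10, 30]]

accuracy = (confusion[0][0] + confusion[1][1]) / 100
kappa = cohens_kappa(confusion)
print(f"accuracy = {accuracy:.2f}, kappa = {kappa:.2f}")
```

    Because κ discounts the agreement a classifier would reach by chance, it is generally lower than raw accuracy when the classes are unbalanced, which is consistent with the pattern of the figures quoted in the abstract.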

    Investigating prediction modelling of academic performance for students in rural schools in Kenya

    Academic performance prediction modelling provides an opportunity for learners' probable outcomes to be known early, before they sit for final examinations. This would be particularly useful for education stakeholders to initiate intervention measures to help students who require high intervention to pass final examinations. However, limitations of infrastructure in rural areas of developing countries, such as lack of or unstable electricity and Internet, impede the use of PCs. This study proposed that an academic performance prediction model could include a mobile phone interface specifically designed based on users' needs. The proposed mobile academic performance prediction system (MAPPS) could tackle the problem of underperformance and spur development in the rural areas. A six-step Cross-Industry Standard Process for Data Mining (CRISP-DM) theoretical framework was used to support the design of MAPPS. Experiments were conducted using two datasets collected in Kenya. One dataset had 2426 records of student data having 22 features, collected from 54 rural primary schools. The second dataset had 1105 student records with 19 features, collected from 11 peri-urban primary schools. Evaluation was conducted to investigate: (i) which is the best classifier model among the six common classifiers selected for the type of data used in this study; (ii) what is the optimal subset of features from the total number of features for both rural and peri-urban datasets; and (iii) what is the predictive performance of the Mobile Academic Performance Prediction System in classifying the high intervention class. It was found that the system achieved an F-Measure rate of nearly 80% in determining the students who need high intervention two years before the final examination. It was also found that the system was useful and usable in rural environments; the accuracy of prediction was good enough to motivate stakeholders to initiate strategic intervention measures. 
    This study provides experimental evidence that Educational Data Mining (EDM) techniques can be used in the developing world by exploiting the ubiquitous mobile technology for student academic performance prediction
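
    The F-Measure quoted above for the high intervention class combines precision and recall into one score. The sketch below shows the standard calculation, assuming the balanced F1 variant (the abstract does not say which weighting was used) and using invented counts rather than the study's results.

```python
def f_measure(tp, fp, fn, beta=1.0):
    """F-measure for one class from true positives, false positives
    and false negatives. beta=1.0 gives the balanced F1 score."""
    if tp == 0:
        return 0.0  # no correct positive predictions, so the score is zero
    precision = tp / (tp + fp)  # share of flagged students who truly need help
    recall = tp / (tp + fn)     # share of students needing help who were flagged
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts (assumption) for the "high intervention" class.
tp, fp, fn = 80, 20, 20
f1 = f_measure(tp, fp, fn)
print(f"F1 = {f1:.2f}")
```

    A class-specific score like this is useful when the class of interest is a minority, since overall accuracy can stay high even if most students who need intervention are missed.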

    Supervised Learning Algorithms in Educational Data Mining: A Systematic Review

    Academic institutions are always looking for tools that improve their performance and enhance individuals' outcomes. Because data mining is well suited to uncovering hidden patterns and trends in data, many researchers have turned their attention to Educational Data Mining (EDM) in the last decade. This field explores different types of data using different algorithms to extract knowledge that supports decision-making and the development of the academic sector. Researchers in the field of EDM have proposed and adopted different algorithms in various directions. In this review, we explored papers published between 2010 and 2020 in four libraries (IEEE, ACM, Science Direct, and Springer) in the field of EDM in order to answer our review questions. We aimed to find the supervised machine learning algorithm most used by researchers over the period 2010-2020. Additionally, we explored the most common research directions in EDM and the interests of the researchers. During our research and analysis, many limitations were examined, and in addition to answering the review questions, some future work is presented

    Data analytics 2016: proceedings of the fifth international conference on data analytics

    BUILDING DSS USING KNOWLEDGE DISCOVERY IN DATABASE APPLIED TO ADMISSION & REGISTRATION FUNCTIONS

    This research investigates the practical issues surrounding the development and implementation of Decision Support Systems (DSS). The research describes the traditional development approaches analyzing their drawbacks and introduces a new DSS development methodology. The proposed DSS methodology is based upon four modules; needs' analysis, data warehouse (DW), knowledge discovery in database (KDD), and a DSS module. The proposed DSS methodology is applied to and evaluated using the admission and registration functions in Egyptian Universities. The research investigates the organizational requirements that are required to underpin these functions in Egyptian Universities. These requirements have been identified following an in-depth survey of the recruitment process in the Egyptian Universities. This survey employed a multi-part admission and registration DSS questionnaire (ARDSSQ) to identify the required data sources together with the likely users and their information needs. The questionnaire was sent to senior managers within the Egyptian Universities (both private and government) with responsibility for student recruitment, in particular admission and registration. Further, access to a large database has allowed the evaluation of the practical suitability of using a data warehouse structure and knowledge management tools within the decision making framework. 1600 students' records have been analyzed to explore the KDD process, and another 2000 records have been used to build and test the data mining techniques within the KDD process. Moreover, the research has analyzed the key characteristics of data warehouses and explored the advantages and disadvantages of such data structures. This evaluation has been used to build a data warehouse for the Egyptian Universities that handle their admission and registration related archival data. The decision makers' potential benefits of the data warehouse within the student recruitment process will be explored. 
    The design of the proposed admission and registration DSS (ARDSS) will be developed and tested using COOL:Gen (5.0) CASE tools by Computer Associates (CA), connected to a MSSQL Server (6.5), in a Windows NT (4.0) environment. Crystal Reports (4.6) by Seagate will be used as a report generation tool. CLUSTAN Graphics (5.0) by CLUSTAN software will also be used as a clustering package. Finally, the contribution of this research is found in the following areas: a new DSS development methodology; the development and validation of a new research questionnaire (i.e. ARDSSQ); the development of the admission and registration data warehouse; the evaluation and use of cluster analysis proximities and techniques in the KDD process to find knowledge in the students' records; and the development of the ARDSS software, which encompasses the advantages of the KDD and DW and delivers these advantages to the senior admission and registration managers in the Egyptian Universities. The ARDSS software could be adjusted for usage in different countries for the same purpose; it is also scalable to handle new decision situations and can be integrated with other systems

    Assessing and classification of academic efficiency in engineering teaching programs

    This research uses a three-phase method to evaluate and forecast the academic efficiency of engineering programs. In the first phase, university profiles are created through cluster analysis. In the second phase, the academic efficiency of these profiles is evaluated through Data Envelopment Analysis. Finally, a machine learning model is trained and validated to forecast the categories of academic efficiency. The study population corresponds to 256 university engineering programs in Colombia and the data correspond to the national examination of the quality of education in Colombia in 2018. In the results, two university profiles were identified with efficiency levels of 92.3% and 97.3%, respectively. The Random Forest model presents an Area under ROC value of 95.8% in the prediction of the efficiency profiles. The proposed structure evaluates and predicts university programs’ academic efficiency, evaluating the efficiency between institutions with similar characteristics, avoiding a negative bias toward those institutions that host students with low educational levels

    A methodology to predict community college STEM student retention and completion

    Numerous government reports point to the multifaceted issues facing the country's capacity to increase the number of STEM majors, while also diversifying the workforce. Community colleges are uniquely positioned as integral partners in the higher education ecosystem. These institutions serve as an access point to opportunity for many students, especially underrepresented minorities and women. Community colleges should serve as a major pathway to students pursuing STEM degrees; however, student retention and completion rates are dismally low. Therefore, there is a need to predict STEM student success and provide interventions when factors indicate potential failure. This enables educational institutions to better advise and support students in a more intentional and efficient manner. The objective of this research was to develop a model for predicting success. The methodology uses the Mahalanobis Taguchi System as a novel approach to pattern recognition and gives insight into the ability of MTS to predict outcomes based on student demographic data and academic performance. The method accurately predicts institution-specific risk factors that can be used to better retain STEM students. The research indicates the importance of using community college student data to target this distinctive student population that has demonstrated risk factors outside of the previously reported factors in prior research. This methodology shows promise as a mechanism to close the achievement gap and maximize the power of open-access community college pathways for STEM majors --Abstract, page iv
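
    The Mahalanobis Taguchi System mentioned above rests on a scaled Mahalanobis distance measured against a reference ("normal") group. The sketch below illustrates only that distance step for two variables. The feature choice (GPA and credits completed), the data, and the function name `scaled_mahalanobis` are assumptions made for illustration; the feature-screening stage of MTS (orthogonal arrays and signal-to-noise ratios) is omitted, and nothing here reproduces the thesis's actual model.

```python
from math import sqrt
from statistics import mean, pstdev


def scaled_mahalanobis(normal_group, point):
    """Scaled Mahalanobis distance of a 2-variable point from a reference group.

    Variables are standardized with the reference group's mean and
    (population) standard deviation, and the squared distance is divided
    by the number of variables, so the reference group averages 1.
    """
    xs = [p[0] for p in normal_group]
    ys = [p[1] for p in normal_group]
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    zx = [(x - mx) / sx for x in xs]
    zy = [(y - my) / sy for y in ys]
    # Correlation between the two standardized variables.
    r = mean([a * b for a, b in zip(zx, zy)])
    z1 = (point[0] - mx) / sx
    z2 = (point[1] - my) / sy
    k = 2  # number of variables
    # z^T C^-1 z for the 2x2 correlation matrix C = [[1, r], [r, 1]].
    squared = (z1 * z1 - 2 * r * z1 * z2 + z2 * z2) / (1 - r * r)
    return squared / k


# Hypothetical reference group (assumption): (GPA, credits completed)
# for students who were retained.
retained = [(3.0, 30), (3.4, 32), (2.8, 27), (3.6, 33), (3.2, 28)]

reference_mds = [scaled_mahalanobis(retained, s) for s in retained]
candidate_md = scaled_mahalanobis(retained, (2.0, 10))

print(f"mean MD of reference group = {mean(reference_mds):.2f}")
print(f"MD of candidate student    = {candidate_md:.1f}")
```

    In this sketch a student whose distance is far above 1 looks unlike the reference group. Where to set the cut-off is a separate decision that this example does not address.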