
    Bayes model for assessing the reading difficulty of English text for English education in Jordan

    Predicting the reading difficulty level of English texts is a critical task for second-language education and assessment. Readability assessment addresses the problem of matching a reader's proficiency with an appropriately difficult text: it predicts the reading grade level required to understand an input text or document. Students in Jordan often struggle to find reading material appropriate to their academic level for a given subject. This paper introduces a model that predicts the reading difficulty level of a given text in terms of the ability of a non-native English speaker in Jordanian schools to read and understand it. Jordanian students were classified into four categories according to their knowledge of English. The reading difficulty level is predicted using a statistical model based on the Bayes classifier, which compares the given text against predefined standard texts that reflect the ability to read and understand English. The accuracy of the proposed model was tested using the hold-out method; the overall prediction accuracy was 75.9%.
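As a rough illustration of the kind of classifier this abstract describes (not the paper's actual model), the sketch below trains a multinomial naive Bayes on a few toy texts labelled with two of four hypothetical proficiency levels, then predicts the level of a new sentence. All texts, level names, and function names are invented for illustration.

```python
import math
from collections import Counter

def train(docs):
    """docs: list of (text, level) pairs -> per-level word counts and priors."""
    counts, totals, priors = {}, {}, Counter()
    for text, level in docs:
        priors[level] += 1
        counts.setdefault(level, Counter()).update(text.lower().split())
    for level, c in counts.items():
        totals[level] = sum(c.values())
    return counts, totals, priors, len(docs)

def predict(model, text):
    counts, totals, priors, n = model
    vocab = {w for c in counts.values() for w in c}
    best, best_lp = None, -math.inf
    for level in counts:
        # log prior plus Laplace-smoothed log likelihood of each word
        lp = math.log(priors[level] / n)
        for w in text.lower().split():
            lp += math.log((counts[level][w] + 1) / (totals[level] + len(vocab)))
        if lp > best_lp:
            best, best_lp = level, lp
    return best

# Toy training texts for two of the four (hypothetical) proficiency levels.
docs = [
    ("the cat sat on the mat", "level1"),
    ("the dog ran in the park", "level1"),
    ("photosynthesis converts light energy into chemical energy", "level4"),
    ("mitochondria regulate cellular energy metabolism", "level4"),
]
model = train(docs)
print(predict(model, "the cat ran on the mat"))  # -> level1
```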

    Applying adaptive learning by integrating semantic and machine learning in proposing student assessment model

    Adaptive learning is one of the most widely used data-driven approaches to teaching and has received increasing attention over the last decade. It aims to meet each student's characteristics by tailoring course materials and assessment methods. To determine those characteristics, we need to detect students' learning styles according to the visual, auditory, or kinaesthetic (VAK) model. In this research, an integrated model that combines semantic and machine-learning clustering methods is developed to cluster students, detect their learning styles, and recommend suitable assessment method(s) accordingly. To measure the effectiveness of the proposed model, a set of experiments was conducted on a real dataset (the Open University Learning Analytics Dataset). The experiments showed that the proposed model can cluster students according to their different learning activities with an accuracy exceeding 95% and predict their suitable assessment method(s) with an average accuracy of 93%.
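A hedged sketch of the clustering step only: plain k-means stands in here for the paper's combined semantic and machine-learning clustering. Each student is a hypothetical activity-count vector (e.g. video views, forum posts, quiz attempts), loosely mirroring visual/auditory/kinaesthetic signals; the data and the fixed initialization are invented so the example stays deterministic.

```python
def kmeans(points, init, iters=5):
    centroids = list(init)
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            # assign each point to its nearest centroid (squared distance)
            j = min(range(len(centroids)),
                    key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            clusters[j].append(p)
        # recompute centroids (assumes no cluster goes empty on this toy data)
        centroids = [tuple(sum(v) / len(cl) for v in zip(*cl)) for cl in clusters]
    return clusters

students = [(9, 1, 1), (8, 2, 0),   # visual-heavy profiles
            (1, 9, 2), (0, 8, 1),   # auditory-heavy profiles
            (1, 2, 9), (2, 1, 8)]   # kinaesthetic-heavy profiles
clusters = kmeans(students, init=[students[0], students[2], students[4]])
print([len(cl) for cl in clusters])  # -> [2, 2, 2]
```

Once students are clustered, each cluster can be mapped to a recommended assessment method; that mapping is the part the paper derives from the semantic layer.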

    A hybrid naïve Bayes based on similarity measure to optimize the mixed-data classification

    In this paper, a hybrid method is introduced to improve the classification performance of naïve Bayes (NB) on mixed datasets and multi-class problems. The proposed method relies on a similarity measure applied to the portions of the data that NB does not classify correctly. Since the data contain multi-valued short texts with rare words that limit NB performance, we employ an adapted selective classifier based on similarities (CSBS) to overcome the NB limitations and include the rare words in the computation. This is achieved by transforming the formula from a product of the categorical variable's probabilities to a sum weighted by the numerical variable. The proposed algorithm was evaluated on card-payment transaction data containing the transaction label (a multi-valued short text) and the transaction amount. Based on K-fold cross-validation, the evaluation results confirm that the proposed method achieves better precision, recall, and F-score than the NB and CSBS classifiers taken separately. Moreover, converting the product form to a sum gives rare words more influence on the text classification, which is a further advantage of the proposed method.
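The product-to-sum transformation can be illustrated with a toy example (all words, probabilities, and the amount below are invented, not taken from the paper). In the classic NB product, one tiny probability for a rare word dominates the whole score; in a sum weighted by the numerical variable, the rare word still contributes proportionally.

```python
import math

# Hypothetical per-class word probabilities for a transaction label.
probs = {"fraud": {"wire": 0.4, "transfer": 0.5, "rareword": 0.001},
         "legit": {"wire": 0.2, "transfer": 0.3, "rareword": 0.0005}}
words = ["wire", "transfer", "rareword"]
amount = 250.0  # the numerical variable (transaction amount)

def product_score(c):
    # classic NB: the rare word's tiny factor collapses the product
    return math.prod(probs[c].get(w, 1e-9) for w in words)

def sum_score(c):
    # hybrid form: a sum weighted by the numerical variable
    return sum(amount * probs[c].get(w, 0.0) for w in words)

print(product_score("fraud"), sum_score("fraud"))
```

Both scores still rank "fraud" above "legit" here, but the sum form keeps the scores on a workable scale even as rare words accumulate.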

    A new feature extraction approach based on nonlinear source separation

    A new feature extraction approach is proposed in this paper to improve classification performance on remotely sensed data. The proposed method is based on a primary sources subset (PSS) obtained by a nonlinear transform that provides a lower-dimensional space for land pattern recognition. First, the underlying sources are approximated using multilayer neural networks. Bayesian inference then updates the knowledge of the unknown sources and the model parameters from the observed data. Next, a source-dimension minimization technique is adopted to provide a more efficient land cover description. A support vector machine (SVM) scheme is built on the extracted features. The experimental results on real multispectral imagery demonstrate that the proposed approach ensures efficient feature extraction, using several descriptors for texture identification and multiscale analysis. In a pixel-based approach, the reduced PSS space improves the overall classification accuracy by 13%, reaching 82%. Using texture and multi-resolution descriptors, the overall accuracy is 75.87% for the original observations, while in the reduced source space it reaches 81.67% when jointly using wavelet and Gabor transforms and 86.67% when using the Gabor transform alone. Thus, the source space enhances the feature extraction process and allows more land use discrimination than the raw multispectral observations.
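A hedged sketch of the pipeline's shape only: a fixed linear projection stands in for the paper's neural-network source approximation, and a nearest-centroid rule stands in for the SVM. The band values, class names, and projection are all invented.

```python
# Hypothetical 4-band multispectral pixels per land-cover class.
pixels = {
    "water":      [(0.1, 0.2, 0.1, 0.0), (0.2, 0.2, 0.1, 0.1)],
    "vegetation": [(0.2, 0.5, 0.1, 0.8), (0.3, 0.6, 0.2, 0.9)],
}

def reduce(p):
    # toy stand-in for source separation: 4 bands -> 2 "sources"
    return (p[0] + p[2], p[1] + p[3])

# one centroid per class in the reduced source space
centroids = {c: tuple(sum(v) / len(ps) for v in zip(*map(reduce, ps)))
             for c, ps in pixels.items()}

def classify(p):
    # nearest-centroid stand-in for the SVM on the reduced features
    r = reduce(p)
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(r, centroids[c])))

print(classify((0.15, 0.2, 0.1, 0.05)))  # -> water
```

The point of the sketch is the ordering: reduce first, then classify in the smaller space, which is where the paper reports its accuracy gains.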

    Activity Prediction of Business Process Instances using Deep Learning Techniques

    The ability to predict the next activity of an ongoing case is becoming increasingly important in today's businesses. Processes need to be monitored in real time in order to predict the remaining time of an open case, and to detect and prevent anomalies before they have a chance to impact performance. Moreover, financial regulations and laws are changing, requiring companies' processes to be increasingly transparent. Process mining, supported by deep learning techniques, can improve the results of internal audit activities. In this context, predicting the next activity can be used to point out at-risk traces that need to be monitored; in this way, the business is aware of the situation and, if possible, can take corrective action in time. In recent years, this problem has been tackled using deep learning techniques such as Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN) and Long Short-Term Memory (LSTM) neural networks, achieving consistent results. The first contribution of this thesis is the generation of a realistic process mining dataset based on the Purchase-to-Pay (P2P) process. The SAP table structure is taken into account, since SAP is the most popular management software in today's companies. We exploit the simulated dataset to explore modeling techniques and to define the type and quantity of anomalies. The second contribution of the thesis is an investigation of LSTM neural network architectures that exploit information from both temporal data and static features, applied to the previously generated dataset. The neural networks are then used to predict the characteristics of future events in running traces. Finally, real-life applications of the results are discussed and future work proposals are presented.
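To make the next-activity task concrete without a deep learning stack, the sketch below uses a first-order frequency baseline (most frequent successor) in place of the thesis's LSTM; the P2P-style activity names and traces are invented.

```python
from collections import Counter, defaultdict

# Toy Purchase-to-Pay traces (hypothetical activity names).
traces = [
    ["create_po", "approve_po", "receive_goods", "receive_invoice", "pay"],
    ["create_po", "approve_po", "receive_invoice", "receive_goods", "pay"],
    ["create_po", "approve_po", "receive_goods", "receive_invoice", "pay"],
]

# Count, for each activity, which activity directly follows it.
follows = defaultdict(Counter)
for trace in traces:
    for a, b in zip(trace, trace[1:]):
        follows[a][b] += 1

def predict_next(activity):
    # predict the most frequent direct successor seen in the log
    return follows[activity].most_common(1)[0][0]

print(predict_next("receive_goods"))  # -> receive_invoice
```

An LSTM replaces this single-step lookup with a model conditioned on the whole prefix of the trace plus static case features, which is what lets it flag anomalous continuations rather than just frequent ones.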