2,210 research outputs found

    Deep learning based Arabic short answer grading in serious games

    Get PDF
    Automatic short answer grading (ASAG) has become part of natural language processing problems. Modern ASAG systems start with natural language preprocessing and end with grading. Researchers started experimenting with machine learning in the preprocessing stage and deep learning techniques in automatic grading for English. However, little research is available on automatic grading for Arabic. Datasets are important to ASAG, and limited datasets are available in Arabic. In this research, we have collected a set of questions, answers, and associated grades in Arabic. We have made this dataset publicly available. We have extended to Arabic the solutions used for English ASAG. We have tested how automatic grading works on answers in Arabic provided by schoolchildren in 6th grade in the context of serious games. We found out those schoolchildren providing answers that are 5.6 words long on average. On such answers, deep learning-based grading has achieved high accuracy even with limited training data. We have tested three different recurrent neural networks for grading. With a transformer, we have achieved an accuracy of 95.67%. ASAG for school children will help detect children with learning problems early. When detected early, teachers can solve learning problems easily. This is the main purpose of this research

    Development of an Automated Scoring Model Using SentenceTransformers for Discussion Forums in Online Learning Environments

    Get PDF
    Due to the limitations of public datasets, research on automatic essay scoring in Indonesian has been restrained and resulted in suboptimal accuracy. In general, the main goal of the essay scoring system is to improve execution time, which is usually done manually with human judgment. This study uses a discussion forum in online learning to generate an assessment between the responses and the lecturer\u27s rubric in the automated essay scoring. A SentenceTransformers pre-trained model that can construct the highest vector embedding was proposed to identify the semantic meaning between the responses and the lecturer\u27s rubric. The effectiveness of monolingual and multilingual models was compared. This research aims to determine the model\u27s effectiveness and the appropriate model for the Automated Essay Scoring (AES) used in paired sentence Natural Language Processing tasks. The distiluse-base-multilingual-cased-v1 model, which employed the Pearson correlation method, obtained the highest performance. Specifically, it obtained a correlation value of 0.63 and a mean absolute error (MAE) score of 0.70. It indicates that the overall prediction result is enhanced when compared to the earlier regression task research

    Crowdsourcing in Computer Vision

    Full text link
    Computer vision systems require large amounts of manually annotated data to properly learn challenging visual concepts. Crowdsourcing platforms offer an inexpensive method to capture human knowledge and understanding, for a vast number of visual perception tasks. In this survey, we describe the types of annotations computer vision researchers have collected using crowdsourcing, and how they have ensured that this data is of high quality while annotation effort is minimized. We begin by discussing data collection on both classic (e.g., object recognition) and recent (e.g., visual story-telling) vision tasks. We then summarize key design decisions for creating effective data collection interfaces and workflows, and present strategies for intelligently selecting the most important data instances to annotate. Finally, we conclude with some thoughts on the future of crowdsourcing in computer vision.Comment: A 69-page meta review of the field, Foundations and Trends in Computer Graphics and Vision, 201

    Exploration of aunnotation strategies for entailment-based Automatic Short Answer Grading

    Get PDF
    [EN] Recent work has shown that Automatic Short Answer Grading can effectively be reformulated as a Textual Entailment problem. In this work we show that this reformulation is also effective in zero-shot and few-shot settings, where we report competent results close to state-of-the-art performance with the few-shot setting. More importantly, we show that the annotation strategy can have significant impact on performance. When annotating few examples, empirical results show that increasing the variability on the question side, at cost of decreasing the amount of annotated answers per question, is preferable than having the same number of annotated examples with less questions and more answers. With this annotation strategy, using only the 10% of the full training set our model levels with state-of-the-art systems in the SciEntsBank dataset. Finally, experiments over SciEntsBank and Beetle domains show that the use of out-of-domain annotated question-answer examples can be harmful, concluding that task-aware fine-tuned models obtain significantly lower results compared to task-agnostic general purpose inference models, at least with the domains employed for this work.[EU] Erantzun labur automatikoen sailkapenaren inguruan azken urteetan egindako ikerketek atazaren birformulazio eraginkorra eraikitzea posible dela erakutsi dute, inferentzia testualaren atazarako birformulazioa, bereziki. Gure lan honetan, birformulazioaren eraginkortasuna erakusten da adibide gutxitako eszenarioetan (few-shot) eta adibide gabeko eszenarioetan (zero-shot) ere bai. Are eta garrantzitsuago, atazarako adibideak anotatzeko estrategiak modeloaren erredimenduan eragin nabarmena duela erakusten da. Adibide gutxi batzuk idaztean, emaitza enpirikoek erakusten dute hobe dela galderaren aldeko aldagarritasuna handitzea, galdera bakoitzeko idatzitako erantzun-kopurua murriztearen kostuari dagokionez, galdera gutxiagorekin eta erantzun gehiagorekin idatzitako adibide-kopuru bera izatea baino. Idazteko estrategia honi jarraituz, entrenamendu osoko datu-basearen %10a erabiliz artearen egoerako sistemen errendimenduaren parekoa da, SciEntsBank domeinuko datu-basean. Azkenik, Beetle eta SciEntsBank domeinuen gainean aurrera eramandako esperimentuek domeinuz kanpoko galdera-erantzun adibide bikoteek errendimendurako mingarriak izan daitezkeela erakutsi dute, beste domeinu batetik ataza ezagutzen duten sistemek ataza ezagutzen ez dutenak baino emaitza apalagoak emateko joera dutela ondorioztatuz, aztertutako domeinuetan behintzat

    A Machine Learning Prediction of Automatic Text Based Assessment for Open and Distance Learning: A Review

    Get PDF
    In this systematic literature review, automatic text-based and easy type assessment grading system using Machine Learning and Natural Language Processing (NLP) techniques was investigated. The major focus is on text-based and essay type assessment in ODL courses. Text-based and essay type questions is an important tool for performing quality examination and assessment to help the students gain mastery over the task and widen their horizon of knowledge and increase the learner’s development and learning than, for instance subjective question type, single choice question (SCQ), multiple choice question (MCQ) and true/false question type. Automatic text-based and essay type assessment grading system can be used as an important tool in ODL institutions, where assessment and examination can be quickly and easily evaluated for the purpose of efficient feedback. We carried out this study using quality, exclusion and inclusion criteria by selecting only studies that focuses on NLP and Machine Learning techniques for automatic text-based and essay type assessment grading task. Searches in ACM Digital Library, Semantic Scholar, Scopus, IEEE Xplore, Google Scholar, Microsoft Academic, Learn Tech Library and Springer is performed in order to retrieve important and relevant literature in this research domain. Conference papers, journals and articles between the year 2011 and 2019 were considered in this study. This study found 34 published articles describing automatic text-based and essay type assessment and examination grading task out of a total of 1260 articles that met our search criteria

    GuavaNet: A deep neural network architecture for automatic sensory evaluation to predict degree of acceptability for Guava by a consumer

    Get PDF
    This thesis is divided into two parts:Part I: Analysis of Fruits, Vegetables, Cheese and Fish based on Image Processing using Computer Vision and Deep Learning: A Review. It consists of a comprehensive review of image processing, computer vision and deep learning techniques applied to carry out analysis of fruits, vegetables, cheese and fish.This part also serves as a literature review for Part II.Part II: GuavaNet: A deep neural network architecture for automatic sensory evaluation to predict degree of acceptability for Guava by a consumer. This part introduces to an end-to-end deep neural network architecture that can predict the degree of acceptability by the consumer for a guava based on sensory evaluation

    Feature Space Augmentation: Improving Prediction Accuracy of Classical Problems in Cognitive Science and Computer Vison

    Get PDF
    The prediction accuracy in many classical problems across multiple domains has seen a rise since computational tools such as multi-layer neural nets and complex machine learning algorithms have become widely accessible to the research community. In this research, we take a step back and examine the feature space in two problems from very different domains. We show that novel augmentation to the feature space yields higher performance. Emotion Recognition in Adults from a Control Group: The objective is to quantify the emotional state of an individual at any time using data collected by wearable sensors. We define emotional state as a mixture of amusement, anger, disgust, fear, sadness, anxiety and neutral and their respective levels at any time. The generated model predicts an individual’s dominant state and generates an emotional spectrum, 1x7 vector indicating levels of each emotional state and anxiety. We present an iterative learning framework that alters the feature space uniquely to an individual’s emotion perception, and predicts the emotional state using the individual specific feature space. Hybrid Feature Space for Image Classification: The objective is to improve the accuracy of existing image recognition by leveraging text features from the images. As humans, we perceive objects using colors, dimensions, geometry and any textual information we can gather. Current image recognition algorithms rely exclusively on the first 3 and do not use the textual information. This study develops and tests an approach that trains a classifier on a hybrid text based feature space that has comparable accuracy to the state of the art CNN’s while being significantly inexpensive computationally. Moreover, when combined with CNN’S the approach yields a statistically significant boost in accuracy. Both models are validated using cross validation and holdout validation, and are evaluated against the state of the art
    corecore