538 research outputs found

    Predicting Paid Certification in Massive Open Online Courses

    Get PDF
    Massive open online courses (MOOCs) have proliferated because of their free or low-cost offering of content, attracting the attention of learners and of many stakeholders across the entire educational landscape. Since 2012, dubbed “the Year of the MOOCs”, several platforms have gathered millions of learners in just a decade. Nevertheless, certification rates for both free and paid courses have been low: only about 4.5–13% of enrolled learners obtain a certificate in free courses, and 1–3% in paid ones. Still, most research concentrates on completion, ignoring the certification problem and especially its financial aspects. The research described in the present thesis therefore aimed to investigate paid certification in MOOCs, for the first time, in a comprehensive way and as early as the first week of the course, by exploring its various levels. First, the latent correlation between learner activities and paid certification decisions was examined by (1) statistically comparing the activities of non-paying learners with those of course purchasers and (2) predicting paid certification using different machine learning (ML) techniques. Our temporal (weekly) analysis showed statistical significance at various levels when comparing the activities of non-paying learners with those of certificate purchasers across the five courses analysed. Furthermore, we used learners’ activities (number of step accesses, attempts, correct and wrong answers, and time spent on learning steps) to build our paid certification predictor, which achieved promising balanced accuracies (BAs) ranging from 0.77 to 0.95. Having employed simple predictions based on a few clickstream variables, we then analysed in more depth what other information could be extracted from MOOC interactions (namely, discussion forums) for paid certification prediction. To better explore the learners’ discussion forums, we built, as an original contribution, MOOCSent, a cross-platform review-based sentiment classifier trained on over 1.2 million sentiment-labelled MOOC reviews. MOOCSent addresses various limitations of current sentiment classifiers, including (1) reliance on a single source of data (previous literature on sentiment classification in MOOCs was based on single platforms only, and hence less generalisable, with relatively few instances compared to our dataset); (2) limited output granularity, as most current models are binary classifiers (positive or negative only); (3) disregard of important sentiment indicators, such as emojis and emoticons, during text embedding; and (4) reporting of average performance metrics only, preventing the evaluation of model performance at the level of each class (sentiment). Finally, we used the learners’ discussion forums to predict paid certification, after annotating learners’ comments and replies with sentiment using MOOCSent. This multi-input model combines raw data (learner textual inputs), the sentiment classification generated by MOOCSent, computed features (number of likes received for each textual input), and several features extracted from the texts (character counts, word counts, and part-of-speech (POS) tags for each textual instance).
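As a rough illustration of the clickstream experiment described above (this is not the thesis's own code; the choice of classifier and the tabular data layout are assumptions), the sketch below trains a predictor on the five weekly activity features named in the abstract and reports balanced accuracy, the metric used in the thesis:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split

# Weekly activity features named in the abstract; one row per learner
# is an assumed layout for illustration only.
FEATURES = ["step_accesses", "attempts", "correct_answers",
            "wrong_answers", "time_spent_on_steps"]

def train_week1_predictor(X, y):
    """Fit a paid-certification predictor on first-week activity.

    X: shape (n_learners, len(FEATURES)); y: 1 = purchased a certificate.
    Returns the fitted model and its balanced accuracy on a held-out split.
    """
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=0)
    clf = RandomForestClassifier(class_weight="balanced", random_state=0)
    clf.fit(X_tr, y_tr)
    return clf, balanced_accuracy_score(y_te, clf.predict(X_te))
```

Class weighting matters here because certificate purchasers are a small minority of learners, and balanced accuracy (the BA of 0.77–0.95 reported above) is robust to that imbalance.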
This experiment adopted various deep predictive approaches, specifically ones that allow multi-input architectures, to investigate early (i.e., weekly) whether data obtained from MOOC learners’ interactions in discussion forums can predict learners’ purchase decisions (certification). Considering the staggeringly low rate of paid certification in MOOCs, the present thesis contributes to the field of MOOC learner analytics by predicting paid certification, for the first time, at a scale that is comprehensive (with data from over 200 thousand learners across 5 courses from different disciplines), actionable (analysing learner decisions from the first week of the course) and longitudinal (covering 23 runs from 2013 to 2017). Its contributions are: (1) investigating various conventional and deep ML approaches for predicting paid certification in MOOCs using learner clickstreams (Chapter 5) and course discussion forums (Chapter 7); (2) building the largest MOOC sentiment classifier (MOOCSent), based on learners’ reviews of courses from the leading MOOC platforms, namely Coursera, FutureLearn and Udemy, which handles emojis and emoticons using dedicated lexicons containing over three thousand corresponding explanatory words/phrases; and (3) proposing and developing, for the first time, a multi-input model for predicting certification from discussion forum data, which synchronously processes the textual (comments and replies) and numerical (numbers of likes posted and received, sentiments) data from the forums, adopting a suitable classifier for each type of data, as explained in detail in Chapter 7.
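A minimal sketch of such a multi-input architecture is given below, assuming illustrative dimensions and layer choices rather than the exact model of Chapter 7: one branch embeds the raw forum text, a second processes the numeric features (MOOCSent sentiment, like counts, text statistics), and the two are fused for the binary purchase prediction:

```python
import tensorflow as tf
from tensorflow.keras import Model, layers

VOCAB_SIZE, SEQ_LEN, N_NUMERIC = 20_000, 200, 8  # assumed sizes

# Branch 1: tokenised comments/replies (raw textual input).
text_in = layers.Input(shape=(SEQ_LEN,), name="tokens")
x = layers.Embedding(VOCAB_SIZE, 64)(text_in)
x = layers.Bidirectional(layers.LSTM(32))(x)

# Branch 2: numeric features (MOOCSent sentiment, likes received,
# character/word counts, POS-tag statistics).
num_in = layers.Input(shape=(N_NUMERIC,), name="numeric")
y = layers.Dense(16, activation="relu")(num_in)

# Fuse both branches and predict certificate purchase (binary).
z = layers.Dense(32, activation="relu")(layers.concatenate([x, y]))
out = layers.Dense(1, activation="sigmoid", name="purchases")(z)

model = Model(inputs=[text_in, num_in], outputs=out)
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC(name="auc")])
```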

    Chatbots for Modelling, Modelling of Chatbots

    Full text link
    Unpublished doctoral thesis, defended at the Universidad Autónoma de Madrid, Escuela Politécnica Superior, Departamento de Ingeniería Informática. Date of defence: 28-03-202

    Digital agriculture: research, development and innovation in production chains.

    Get PDF
    Digital transformation in the field towards sustainable and smart agriculture. Digital agriculture: definitions and technologies. Agroenvironmental modeling and the digital transformation of agriculture. Geotechnologies in digital agriculture. Scientific computing in agriculture. Computer vision applied to agriculture. Technologies developed in precision agriculture. Information engineering: contributions to digital agriculture. DIPN: a dictionary of the internal proteins nanoenvironments and their potential for transformation into agricultural assets. Applications of bioinformatics in agriculture. Genomics applied to climate change: biotechnology for digital agriculture. Innovation ecosystem in agriculture: Embrapa's evolution and contributions. The law related to the digitization of agriculture. Innovating communication in the age of digital agriculture. Driving forces for Brazilian agriculture in the next decade: implications for digital agriculture. Challenges, trends and opportunities in digital agriculture in Brazil.

    The text classification pipeline: Starting shallow, going deeper

    Get PDF
    Text Classification (TC), tackled in this PhD thesis from a computer science and engineering perspective, is an increasingly relevant and crucial subfield of Natural Language Processing (NLP). In this field too, the exceptional success of deep learning has sparked a boom over the past ten years. Text retrieval and categorization, information extraction and summarization all rely heavily on TC. The literature has presented numerous datasets, models, and evaluation criteria. Although languages such as Arabic, Chinese and Hindi are employed in several works, from a computer science perspective the language most used and referred to in the TC literature is English, and it is also the language mainly referenced in the rest of this PhD thesis. Even though numerous machine learning techniques have shown outstanding results, a classifier's effectiveness depends on its capability to comprehend intricate relations and non-linear correlations in texts. To achieve this level of understanding, it is necessary to pay attention not only to the architecture of a model but also to the other stages of the TC pipeline. In an NLP framework, a range of text representation techniques and model designs have emerged, including large language models, which are capable of turning massive amounts of text into useful vector representations that effectively capture semantically significant information. Crucially, this field has been investigated by numerous communities, including data mining, linguistics, and information retrieval; these communities frequently overlap but are mostly separate and conduct their research on their own. Bringing researchers from these groups together to improve the multidisciplinary comprehension of the field is one of the objectives of this dissertation. Additionally, this dissertation examines text mining from both a traditional and a modern perspective. The thesis covers the whole TC pipeline in detail; its main contribution, however, is to investigate how every element of the TC pipeline affects the final performance of a TC model. The pipeline discussed covers both traditional and the most recent deep learning-based models, and consists of the State-Of-The-Art (SOTA) datasets used as benchmarks in the literature, text preprocessing, text representation, machine learning models for TC, evaluation metrics and current SOTA results. In each chapter of this dissertation, I go over each of these steps, covering both the technical advancements and my most significant and recent findings from experiments and novel models. The advantages and disadvantages of the various options are also listed, along with a thorough comparison of the approaches. At the end of each chapter are my contributions, with experimental evaluations and discussions of the results obtained during my three-year PhD course. The experiments and analysis related to each chapter (i.e., each element of the TC pipeline) are my main contributions, extending the basic knowledge of a regular survey on TC.
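As a concrete anchor for the pipeline stages just listed, the following shallow baseline (an illustration in scikit-learn, not one of the dissertation's experiments; the benchmark dataset is an assumed stand-in) walks through a benchmark dataset, light preprocessing, TF-IDF representation, a linear model, and per-class evaluation:

```python
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.pipeline import Pipeline

# Dataset stage: 20 Newsgroups as a stand-in benchmark.
train = fetch_20newsgroups(subset="train", remove=("headers", "footers", "quotes"))
test = fetch_20newsgroups(subset="test", remove=("headers", "footers", "quotes"))

pipeline = Pipeline([
    # Preprocessing + representation: lowercase, drop stop words,
    # map raw text to sparse TF-IDF vectors.
    ("tfidf", TfidfVectorizer(lowercase=True, stop_words="english")),
    # Model: a linear classifier, the "shallow" end of the pipeline.
    ("clf", LogisticRegression(max_iter=1000)),
])
pipeline.fit(train.data, train.target)

# Evaluation: per-class precision/recall/F1 rather than one averaged score.
print(classification_report(test.target, pipeline.predict(test.data),
                            target_names=test.target_names))
```

Swapping the TF-IDF/linear pair for a pretrained language model changes only the representation and model stages; the surrounding pipeline, and hence the evaluation, stays the same, which is why the impact of each stage can be studied in isolation.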


    "A voice cries out in the wilderness”: The religio-political afterlife of ‘scapegoat’ and ‘messiah’ metaphors, from the Hebrew Bible to contemporary Australian political rhetoric

    Get PDF
    Religion intercedes in the emotive underpinnings of politics in Australia far more than we realize. It is argued in colloquial settings that religion is only as relevant to Australian politics as it is to Australian culture, which, according to census data, is increasingly irreligious. In contrast, scholars and commentators argue that religion is evident in politics through our colonial history, the ‘religious right’ and so-called ‘dog-whistle’ politics (Maddox 2001; 2005; Hindess 2014b; Lake 2018; Sheridan 2018). However, what is not known is why Judeo-Christian religious motifs and undertones are so persistent in political speechmaking, and how they maintain their persuasiveness when Christianity has declined so significantly. Accordingly, this thesis examines the Biblical Hebrew origins and translation of two important Biblical metaphors and their continued use in the so-called ‘Afterlife of the Text’ (Benjamin 1968; Sawyer 1995; Kugel 2007; Sawyer 2018). This ‘afterlife’ includes use in contemporary Australian prime ministerial speeches. Using Conceptual Metaphor Theory (Lakoff and Johnson 1980) as an analytical tool, I discuss the origins of the Biblical Hebrew metaphors of ‘scapegoat’ in Leviticus 16 and ‘messiah’ in Isaiah’s ‘Servant Songs’ (Isaiah 42:1–9; 49:1–13; 50:4–11; 52:13–53:12). In Leviticus, the conceptual metaphor SCAPEGOAT IS ATONEMENT is evident in the Day of Atonement ritual, while in Isaiah’s Servant Songs the conceptual metaphors MESSIAH IS GUIDE and MESSIAH IS SACRIFICE are evident. The afterlife of the metaphors evolves beyond their Hebrew Bible origins, including use in Australian prime ministerial speeches from 2000 to the present. This qualitative analysis focuses on significant prime ministerial speeches: the ‘scapegoat’ in John Howard’s ministerial statement about entering the war in Iraq (Howard 2003) and in Kevin Rudd’s Apology to Australia’s Indigenous Peoples (Rudd 2008), and the ‘messiah’ in Julia Gillard’s so-called ‘misogyny speech’ (Gillard 2012) and in Scott Morrison’s press conference on floods, Parliament House culture and women’s safety (Morrison 2021b). This research shows how these two metaphors are part of the Australian political speechmaking landscape, though far removed from their Ancient Near Eastern origins. If we uncover the use of Judeo-Christian metaphors within the political space, we will be better able to understand the use of biblical metaphors in contemporary Australian political speeches. Thesis (Ph.D.) -- University of Adelaide, School of Humanities, 202

    Sustainable Value Co-Creation in Welfare Service Ecosystems : Transforming temporary collaboration projects into permanent resource integration

    Get PDF
    The aim of this paper is to discuss the unexploited forces of user-orientation and shared responsibility in promoting sustainable value co-creation during service innovation projects in welfare service ecosystems. The framework is based on the theoretical field of public service logic (PSL), and our thesis is that service innovation requires a genuinely user-oriented approach, since such an approach enables resource integration based on the service user’s needs and lifeworld. In our findings, we identify prerequisites and opportunities of collaborative service innovation projects in order to transform these projects into sustainable resource integration once they have ended.

    LASSO – an observatorium for the dynamic selection, analysis and comparison of software

    Full text link
    Mining software repositories at the scale of 'big code' (i.e., big data) is a challenging activity. As well as finding a suitable software corpus and making it programmatically accessible through an index or database, researchers and practitioners have to establish an efficient analysis infrastructure and precisely define the metrics and data extraction approaches to be applied. Moreover, for analysis results to be generalisable, these tasks have to be applied at a large enough scale to have statistical significance, and if they are to be repeatable, the artefacts need to be carefully maintained and curated over time. Today, however, a lot of this work is still performed by human beings on a case-by-case basis, with the level of effort involved often having a significant negative impact on the generalisability and repeatability of studies, and thus on their overall scientific value. The general-purpose 'code mining' repositories and infrastructures that have emerged in recent years represent a significant step forward because they automate many software mining tasks at an ultra-large scale and allow researchers and practitioners to focus on defining the questions they would like to explore at an abstract level. However, they are currently limited to static analysis and data extraction techniques, and thus cannot support (i.e., help automate) any studies which involve the execution of software systems. This includes experimental validations of techniques and tools that hypothesise about the behaviour (i.e., semantics) of software, or data analysis and extraction techniques that aim to measure dynamic properties of software. In this thesis a platform called LASSO (Large-Scale Software Observatorium) is introduced that overcomes this limitation by automating the collection of dynamic (i.e., execution-based) information about software alongside static information. It features a single, ultra-large-scale corpus of executable software systems, created by amalgamating existing Open Source software repositories, and a dedicated DSL for defining abstract selection and analysis pipelines. Its key innovations are integrated capabilities for searching for and selecting software systems based on their exhibited behaviour, and an 'arena' that allows their responses to software tests to be compared in a purely data-driven way. We call the platform a 'software observatorium' since it is a place where the behaviour of large numbers of software systems can be observed, analysed and compared.
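The 'arena' concept admits a small illustration. The sketch below is a conceptual Python analogue, not LASSO's actual DSL or API: it records how several candidate implementations respond to the same test inputs, so that behaviourally equivalent candidates can be identified purely from the observed data:

```python
from typing import Any, Callable, Dict, List

def arena(candidates: Dict[str, Callable[[Any], Any]],
          inputs: List[Any]) -> Dict[str, List[Any]]:
    """Record each candidate's response to every test input."""
    observations = {}
    for name, impl in candidates.items():
        row = []
        for x in inputs:
            try:
                row.append(impl(x))
            except Exception as exc:  # a failure is an observation too
                row.append(f"<error: {type(exc).__name__}>")
        observations[name] = row
    return observations

# Hypothetical example: three 'encode this string' candidates mined
# from a corpus; only two behave like a base64 encoder.
import base64
candidates = {
    "impl_a": lambda s: base64.b64encode(s.encode()).decode(),
    "impl_b": lambda s: s.encode().hex(),  # behaviourally different
    "impl_c": lambda s: base64.b64encode(bytes(s, "utf-8")).decode(),
}
table = arena(candidates, ["foo", "bar"])
# impl_a and impl_c produce identical rows, so a behaviour-driven
# search for 'base64 encoder' would select them and reject impl_b.
```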

    Data Rescue : defining a comprehensive workflow that includes the roles and responsibilities of the research library.

    Get PDF
    Thesis (PhD (Research)) -- University of Pretoria, 2023. This study, comprising a case study at a selected South African research institute, focused on the creation of a workflow model for data rescue indicating the roles and responsibilities of the research library. Additional outcomes of the study include a series of recommendations addressing the troublesome findings, which revealed data at risk to be a prevalent reality at the selected institute, showed the presence of a multitude of factors putting data at risk, disclosed the profusion of data rescue obstacles faced by researchers, and uncovered that data rescue at the institute is rarely implemented. The study consists of four main parts: (i) a literature review, (ii) content analysis of the literature, resulting in the creation of a data rescue workflow model, (iii) empirical data collection, and (iv) the adaptation and revision of the initial data rescue model to present a recommended version of the model. The literature review addressed data-at-risk and data rescue terminology, factors putting data at risk, the nature, diversity and prevalence of data rescue projects, and the rationale for data rescue. The second part of the study entailed the application of content analysis to selected documented data rescue workflows, guidelines and models. Findings of the analysis led to the identification of crucial components of data rescue and brought about the creation of an initial Data Rescue Workflow Model. As a first draft of the model, it was crucial that it be reviewed by institutional research experts during the next main stage of the study. The methodology section culminates in the implementation of four different empirical data collection methods. Data collected via a web-based questionnaire distributed to a sample of research group leaders (RGLs), one-on-one virtual interviews with a sample of the aforementioned RGLs, feedback supplied by RGLs after reviewing the initial Data Rescue Workflow Model, and a focus group session held with institutional research library experts produced insight into the institute’s data at risk and the state of data rescue. Feedback supplied by RGLs after examining the initial Data Rescue Workflow Model produced a list of concerns linked to the model and contained suggestions for changes. RGL feedback was at times unrelated to the model or to data, which necessitated a mini focus group session involving institutional research library experts, comprising discussions around the requirements for a data rescue workflow model. The consolidation of RGL feedback with feedback supplied by the research library experts enabled the creation of a recommended Data Rescue Workflow Model, which also indicates the various roles and responsibilities of the research library. The contribution of this research lies primarily in the increase in theoretical knowledge regarding data at risk and data rescue, culminating in the presentation of a recommended Data Rescue Workflow Model. The model not only portrays crucial data rescue activities and outputs, but also indicates the roles and responsibilities of a sector that can enhance and influence the prevalence and execution of data rescue projects.
In addition, participation in data rescue and an understanding of the activities and steps portrayed in the model can contribute to an increase in the skills base of the library and information services sector and enhance collaboration with relevant research sectors. It is also anticipated that the study recommendations and exposure to the model may influence how researchers view and handle data, and the accompanying research procedures.

    Operational Research: methods and applications

    Get PDF
    This is the final version, available on open access from Taylor & Francis via the DOI in this record. Throughout its history, Operational Research has evolved to include methods, models and algorithms that have been applied to a wide range of contexts. This encyclopedic article consists of two main sections: methods and applications. The first summarises up-to-date knowledge and provides an overview of the state-of-the-art methods and key developments in the various subdomains of the field. The second offers a wide-ranging list of areas where Operational Research has been applied. The article is meant to be read in a nonlinear fashion and used as a point of reference by a diverse pool of readers: academics, researchers, students, and practitioners. The entries within the methods and applications sections are presented in alphabetical order. The authors dedicate this paper to the 2023 Turkey/Syria earthquake victims. We sincerely hope that advances in OR will play a role towards minimising the pain and suffering caused by this and future catastrophes.