1,333 research outputs found

    Predicting Paid Certification in Massive Open Online Courses

    Massive open online courses (MOOCs) have been proliferating because of the free or low-cost offering of content for learners, attracting the attention of many stakeholders across the entire educational landscape. Since 2012, coined as “the Year of the MOOCs”, several platforms have gathered millions of learners in just a decade. Nevertheless, the certification rate of both free and paid courses has been low, and only about 4.5–13% and 1–3%, respectively, of the total number of enrolled learners obtain a certificate at the end of their courses. Still, most research concentrates on completion, ignoring the certification problem, and especially its financial aspects. Thus, the research described in the present thesis aimed to investigate paid certification in MOOCs, for the first time, in a comprehensive way, and as early as the first week of the course, by exploring its various levels. First, the latent correlation between learner activities and their paid certification decisions was examined by (1) statistically comparing the activities of non-paying learners with course purchasers and (2) predicting paid certification using different machine learning (ML) techniques. Our temporal (weekly) analysis showed statistical significance at various levels when comparing the activities of non-paying learners with those of the certificate purchasers across the five courses analysed. Furthermore, we used the learner’s activities (number of step accesses, attempts, correct and wrong answers, and time spent on learning steps) to build our paid certification predictor, which achieved promising balanced accuracies (BAs), ranging from 0.77 to 0.95. Having employed simple predictions based on a few clickstream variables, we then analysed more in-depth what other information can be extracted from MOOC interaction (namely discussion forums) for paid certification prediction. 
To better explore the learners’ discussion forums, we first built, as an original contribution, MOOCSent, a cross-platform review-based sentiment classifier, using over 1.2 million MOOC sentiment-labelled reviews. MOOCSent addresses various limitations of current sentiment classifiers, including (1) using a single source of data (previous literature on sentiment classification in MOOCs was based on single platforms only, and hence less generalisable, with a relatively low number of instances compared to our dataset); (2) limited model outputs, where most current models are 2-polar classifiers (positive or negative only); (3) disregarding important sentiment indicators, such as emojis and emoticons, during text embedding; and (4) reporting average performance metrics only, preventing the evaluation of model performance at the level of class (sentiment). Finally, with the help of MOOCSent, we used the learners’ discussion forums to predict paid certification after annotating learners’ comments and replies with their sentiment. This multi-input model takes raw data (learner textual inputs), the sentiment classification generated by MOOCSent, computed features (number of likes received for each textual input), and several features extracted from the texts (character counts, word counts, and part-of-speech (POS) tags for each textual instance). This experiment adopted various deep predictive approaches, specifically those that allow a multi-input architecture, to investigate early (i.e., weekly) whether data obtained from MOOC learners’ interaction in discussion forums can predict learners’ purchase decisions (certification).
Considering the staggeringly low rate of paid certification in MOOCs, the present thesis contributes to the field of MOOC learner analytics by predicting paid certification, for the first time, at such a comprehensive (with data from over 200 thousand learners from 5 courses in different disciplines), actionable (analysing learners’ decisions from the first week of the course) and longitudinal (with 23 runs from 2013 to 2017) scale. The thesis contributes by (1) investigating various conventional and deep ML approaches for predicting paid certification in MOOCs using learner clickstreams (Chapter 5) and course discussion forums (Chapter 7); (2) building the largest MOOC sentiment classifier (MOOCSent), which is based on learners’ reviews of courses from the leading MOOC platforms, namely Coursera, FutureLearn and Udemy, and handles emojis and emoticons using dedicated lexicons that contain over three thousand corresponding explanatory words/phrases; and (3) proposing and developing, for the first time, a multi-input model for predicting certification based on data from discussion forums, which simultaneously processes the textual (comments and replies) and numerical (number of likes posted and received, sentiments) data from the forums, adapting a suitable classifier for each type of data, as explained in detail in Chapter 7
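For readers unfamiliar with the balanced accuracy (BA) metric quoted in this abstract: it is the mean of per-class recall, which is useful when one class (here, certificate purchasers at roughly 1–3% of learners) is rare. The sketch below is a minimal illustration only; the function name and the toy labels are assumptions and are not taken from the thesis.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        # Indices of all examples whose true label is c
        idx = [i for i, t in enumerate(y_true) if t == c]
        hits = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical data: 2 purchasers (label 1) among 100 learners
y_true = [1] * 2 + [0] * 98
always_no = [0] * 100  # a predictor that never predicts a purchase

print(balanced_accuracy(y_true, always_no))  # 0.5
```

A predictor that never predicts a purchase is right for 98 of these 100 toy learners, yet its balanced accuracy is only 0.5, which is why plain accuracy is a poor guide at such low certification rates.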

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    Mapping the Focal Points of WordPress: A Software and Critical Code Analysis

    Programming languages or code can be examined through numerous analytical lenses. This project is a critical analysis of WordPress, a prevalent web content management system, applying four modes of inquiry. The project draws on theoretical perspectives and areas of study in media, software, platforms, code, language, and power structures. The applied research is based on Critical Code Studies, an interdisciplinary field of study that holds the potential as a theoretical lens and methodological toolkit to understand computational code beyond its function. The project begins with a critical code analysis of WordPress, examining its origins and source code and mapping selected vulnerabilities. An examination of the influence of digital and computational thinking follows this. The work also explores the intersection of code patching and vulnerability management and how code shapes our sense of control, trust, and empathy, ultimately arguing that a rhetorical-cultural lens can be used to better understand code’s controlling influence. Recurring themes throughout these analyses and observations are the connections to power and vulnerability in WordPress’ code and how cultural, processual, rhetorical, and ethical implications can be expressed through its code, creating a particular worldview. Code’s emergent properties help illustrate how human values and practices (e.g., empathy, aesthetics, language, and trust) become encoded in software design and how people perceive the software through its worldview. These connected analyses reveal cultural, processual, and vulnerability focal points and the influence these entanglements have concerning WordPress as code, software, and platform. WordPress is a complex sociotechnical platform worthy of further study, as is the interdisciplinary merging of theoretical perspectives and disciplines to critically examine code.
Ultimately, this project helps further enrich the field by introducing focal points in code, examining sociocultural phenomena within the code, and offering techniques to apply critical code methods

    Subgroup discovery for structured target concepts

    The main object of study in this thesis is subgroup discovery, a theoretical framework for finding subgroups in data (i.e., named sub-populations) whose behaviour with respect to a specified target concept is exceptional when compared to the rest of the dataset. This is a powerful tool that conveys crucial information to a human audience, but despite past advances it has been limited to simple target concepts. In this work we propose algorithms that bring this framework to novel application domains. We introduce the concept of representative subgroups, which we use not only to ensure the fairness of a sub-population with regard to a sensitive trait, such as race or gender, but also to go beyond known trends in the data. For entities with additional relational information that can be encoded as a graph, we introduce a novel measure of robust connectedness which improves on established alternative measures of density; we then provide a method that uses this measure to discover which named sub-populations are better connected. Our contributions within subgroup discovery culminate in the introduction of kernelised subgroup discovery: a novel framework that enables the discovery of subgroups on i.i.d. target concepts with virtually any kind of structure. Importantly, our framework additionally provides a concrete and efficient tool that works out-of-the-box without any modification, apart from specifying the Gramian of a positive definite kernel. For use within kernelised subgroup discovery, but also with any other kind of kernel method, we additionally introduce a novel random walk graph kernel. Our kernel allows fine-tuning of the alignment between the vertices of the two compared graphs during the count of the random walks, and we also propose meaningful structure-aware vertex labels to utilise this new capability.
With these contributions we thoroughly extend the applicability of subgroup discovery and ultimately re-define it as a kernel method

    On the Utility of Representation Learning Algorithms for Myoelectric Interfacing

    Electrical activity produced by muscles during voluntary movement is a reflection of the firing patterns of relevant motor neurons and, by extension, the latent motor intent driving the movement. Once transduced via electromyography (EMG) and converted into digital form, this activity can be processed to provide an estimate of the original motor intent and is as such a feasible basis for non-invasive efferent neural interfacing. EMG-based motor intent decoding has so far received the most attention in the field of upper-limb prosthetics, where alternative means of interfacing are scarce and the utility of better control apparent. Whereas myoelectric prostheses have been available since the 1960s, available EMG control interfaces still lag behind the mechanical capabilities of the artificial limbs they are intended to steer, a gap at least partially due to limitations in current methods for translating EMG into appropriate motion commands. As the relationship between EMG signals and concurrent effector kinematics is highly non-linear and apparently stochastic, finding ways to accurately extract and combine relevant information from across electrode sites is still an active area of inquiry. This dissertation comprises an introduction and eight papers that explore issues afflicting the status quo of myoelectric decoding and possible solutions, all related through their use of learning algorithms and deep Artificial Neural Network (ANN) models. Paper I presents a Convolutional Neural Network (CNN) for multi-label movement decoding of high-density surface EMG (HD-sEMG) signals. Inspired by the successful use of CNNs in Paper I and the work of others, Paper II presents a method for automatic design of CNN architectures for use in myocontrol. Paper III introduces an ANN architecture with an appertaining training framework from which simultaneous and proportional control emerges. Paper IV introduces a dataset of HD-sEMG signals for use with learning algorithms.
Paper V applies a Recurrent Neural Network (RNN) model to decode finger forces from intramuscular EMG. Paper VI introduces a Transformer model for myoelectric interfacing that does not need additional training data to function with previously unseen users. Paper VII compares the performance of a Long Short-Term Memory (LSTM) network to that of classical pattern recognition algorithms. Lastly, Paper VIII describes a framework for synthesizing EMG from multi-articulate gestures intended to reduce training burden

    LOL: A Highly Flexible Framework for Designing Stream Ciphers

    In this paper, we propose LOL, a general framework for designing blockwise stream ciphers, to achieve ultrafast software implementations for the ubiquitous virtual networks in 5G/6G environments and high-security level for post-quantum cryptography. The LOL framework is structurally strong, and all its components as well as the LOL framework itself enjoy high flexibility with various extensions. Following the LOL framework, we propose new stream cipher designs named LOL-MINI and LOL-DOUBLE with the support of the AES-NI and SIMD instructions: the former applies the basic LOL single mode while the latter uses the extended parallel-dual mode. Both LOL-MINI and LOL-DOUBLE support 256-bit key length and, according to our thorough evaluations, have 256-bit security margins against all existing cryptanalysis methods including differential, linear, integral, etc. The software performances of LOL-MINI and LOL-DOUBLE can reach 89 Gbps and 135 Gbps. In addition to pure encryptions, the LOL-MINI and LOL-DOUBLE stream ciphers can also be applied in a stream-cipher-then-MAC strategy to make an AEAD scheme

    Implementation and performance of a RLWE-based commitment scheme and ZKPoK for its linear and multiplicative relations

    In this paper we provide the implementation details and performance analysis of the lattice-based post-quantum commitment scheme introduced by Martínez and Morillo in their work titled «RLWE-Based Zero-Knowledge Proofs for Linear and Multiplicative Relations» together with the corresponding Zero-Knowledge Proofs of Knowledge (ZKPoK) of valid openings, linear and multiplicative relations among committed elements. We bridge the gap between the existing theoretical proposals and practical applications, thoroughly revisiting the security proofs of the aforementioned paper to obtain tight conditions that allow us to find the best sets of parameters for actual instantiations of the commitment scheme and its companion ZKPoK. Our implementation is very flexible and its parameters can be adjusted to obtain a trade-off between speed and memory usage, analyzing how suitable for practical use are the underlying lattice-based techniques. Moreover, our implementation further extends the literature of exact Zero-Knowledge proofs, providing ZKPoK of committed elements without any soundness slack

    Data-Driven Evaluation of In-Vehicle Information Systems

    Today’s In-Vehicle Information Systems (IVISs) are feature-rich systems that provide the driver with numerous options for entertainment, information, comfort, and communication. Drivers can stream their favorite songs, read reviews of nearby restaurants, or change the ambient lighting to their liking. To do so, they interact with large center stack touchscreens that have become the main interface between the driver and IVISs. To interact with these systems, drivers must take their eyes off the road, which can impair their driving performance. This makes IVIS evaluation critical not only to meet customer needs but also to ensure road safety. The growing number of features, the distraction caused by large touchscreens, and the impact of driving automation on driver behavior pose significant challenges for the design and evaluation of IVISs. Traditionally, IVISs are evaluated qualitatively or through small-scale user studies using driving simulators. However, these methods are not scalable to the growing number of features and the variety of driving scenarios that influence driver interaction behavior. We argue that data-driven methods can be a viable solution to these challenges and can assist automotive User Experience (UX) experts in evaluating IVISs. Therefore, we need to understand how data-driven methods can facilitate the design and evaluation of IVISs, how large amounts of usage data need to be visualized, and how drivers allocate their visual attention when interacting with center stack touchscreens. In Part I, we present the results of two empirical studies and create a comprehensive understanding of the role that data-driven methods currently play in the automotive UX design process. We found that automotive UX experts face two main conflicts: First, results from qualitative or small-scale empirical studies are often not valued in the decision-making process.
Second, UX experts often do not have access to customer data and lack the means and tools to analyze it appropriately. As a result, design decisions are often not user-centered and are based on subjective judgments rather than evidence-based customer insights. Our results show that automotive UX experts need data-driven methods that leverage large amounts of telematics data collected from customer vehicles. They need tools to help them visualize and analyze customer usage data and computational methods to automatically evaluate IVIS designs. In Part II, we present ICEBOAT, an interactive user behavior analysis tool for automotive user interfaces. ICEBOAT processes interaction data, driving data, and glance data, collected over-the-air from customer vehicles and visualizes it on different levels of granularity. Leveraging our multi-level user behavior analysis framework, it enables UX experts to effectively and efficiently evaluate driver interactions with touchscreen-based IVISs concerning performance and safety-related metrics. In Part III, we investigate drivers’ multitasking behavior and visual attention allocation when interacting with center stack touchscreens while driving. We present the first naturalistic driving study to assess drivers’ tactical and operational self-regulation with center stack touchscreens. Our results show significant differences in drivers’ interaction and glance behavior in response to different levels of driving automation, vehicle speed, and road curvature. During automated driving, drivers perform more interactions per touchscreen sequence and increase the time spent looking at the center stack touchscreen. These results emphasize the importance of context-dependent driver distraction assessment of driver interactions with IVISs. Motivated by this we present a machine learning-based approach to predict and explain the visual demand of in-vehicle touchscreen interactions based on customer data. 
By predicting the visual demand of yet unseen touchscreen interactions, our method lays the foundation for automated data-driven evaluation of early-stage IVIS prototypes. The local and global explanations provide additional insights into how design artifacts and driving context affect drivers’ glance behavior. Overall, this thesis identifies current shortcomings in the evaluation of IVISs and proposes novel solutions based on visual analytics and statistical and computational modeling that generate insights into driver interaction behavior and assist UX experts in making user-centered design decisions

    Small Stretch Problem of the DCT Scheme and How to Fix it

    DCT is a beyond-birthday-bound (BBB) deterministic authenticated encryption (DAE) mode proposed by Forler et al. at ACISP 2016, ensuring integrity by redundancy. The instantiation scheme of DCT employs the BRW polynomial, which is more efficient than the usual polynomial function in GCM because it halves the number of multiplication operations. However, we show that DCT suffers from a small stretch problem similar to GCM. When the stretch length $\tau$ is small, choosing a special $m$-block message, we can reduce the number of queries required for a successful forgery to $\mathcal{O}(2^{\tau}/m)$. We emphasize that this attack efficiently balances space and time complexity, but does not contradict the security bounds of DCT. Finally, we propose an improved scheme named Robust DCT (RDCT) with a minor change to DCT, which improves the security when $\tau$ is small and makes it resist the above attack
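To give a feel for the query bound stated in this abstract, the sketch below evaluates the order of magnitude of 2^τ/m in log2 form. It ignores the constants hidden by the O-notation, and the function name and the example values of τ and m are illustrative assumptions, not parameters from the paper.

```python
import math


def forgery_queries_log2(tau: int, m: int) -> float:
    """log2 of 2**tau / m, i.e. the order of the query count, constants ignored."""
    return tau - math.log2(m)


# Hypothetical example: 32-bit stretch, message of 2**10 blocks
print(forgery_queries_log2(32, 1024))  # 22.0
```

Under these assumed values the forgery needs on the order of 2^22 queries instead of 2^32, which illustrates why a small stretch combined with long messages is the problematic case.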

    Learning-Based Ubiquitous Sensing For Solving Real-World Problems

    Recently, as the Internet of Things (IoT) technology has become smaller and cheaper, ubiquitous sensing ability within these devices has become increasingly accessible. Learning methods have also become more complex in the field of computer science accordingly. However, there remains a gap between these learning approaches and many problems in other disciplinary fields. In this dissertation, I investigate four different learning-based studies via ubiquitous sensing for solving real-world problems, such as in IoT security, athletics, and healthcare. First, I designed an online intrusion detection system for IoT devices via power auditing. To realize the real-time system, I created a lightweight power auditing device. With this device, I developed a distributed Convolutional Neural Network (CNN) for online inference. I demonstrated that the distributed system design is secure, lightweight, accurate, real-time, and scalable. Furthermore, I characterized potential Information-stealer attacks via power auditing. To defend against this potential exfiltration attack, a prototype system was built on top of the botnet detection system. In a testbed environment, I defined and deployed an IoT Information-stealer attack. Then, I designed a detection classifier. Altogether, the proposed system is able to identify malicious behavior on endpoint IoT devices via power auditing. Next, I enhanced athletic performance via ubiquitous sensing and machine learning techniques. I first designed a metric called LAX-Score to quantify a collegiate lacrosse team’s athletic performance. To derive this metric, I utilized feature selection and weighted regression. Then, the proposed metric was statistically validated on over 700 games from the last three seasons of NCAA Division I women’s lacrosse. I also examined the biometric sensing dataset obtained from a collegiate team’s athletes over the course of a season.
I then identified the practice features that are most correlated with high-performance games. Experimental results indicate that LAX-Score provides insight into athletic performance quality beyond wins and losses. Finally, I studied the data of patients with Parkinson’s Disease. I secured the Inertial Measurement Unit (IMU) sensing data of 30 patients while they conducted pre-defined activities. Using this dataset, I measured tremor events during drawing activities for more convenient tremor screening. Our preliminary analysis demonstrates that IMU sensing data can identify potential tremor events in daily drawing or writing activities. For future work, deep learning-based techniques will be used to extract features of the tremor in real-time. Overall, I designed and applied learning-based methods across different fields to solve real-world problems. The results show that combining learning methods with domain knowledge enables the formation of solutions