2,967 research outputs found
Your click decides your fate: Inferring Information Processing and Attrition Behavior from MOOC Video Clickstream Interactions
In this work, we explore video lecture interaction in Massive Open Online
Courses (MOOCs), which is central to student learning experience on these
educational platforms. As a research contribution, we operationalize video
lecture clickstreams of students into cognitively plausible higher level
behaviors, and construct a quantitative information processing index, which can
help instructors better understand MOOC hurdles and reason about
unsatisfactory learning outcomes. Our results illustrate how such a metric
inspired by cognitive psychology can help answer critical questions regarding
students' engagement, their future click interactions and participation
trajectories that lead to in-video & course dropouts. Implications for research
and practice are discussed.
Qd-tree: Learning Data Layouts for Big Data Analytics
Corporations today collect data at an unprecedented and accelerating scale,
making the need to run queries on large datasets increasingly important.
Technologies such as columnar block-based data organization and compression
have become standard practice in most commercial database systems. However, the
problem of best assigning records to data blocks on storage is still open. For
example, today's systems usually partition data by arrival time into row
groups, or range/hash partition the data based on selected fields. For a given
workload, however, such techniques are unable to optimize for the important
metric of the number of blocks accessed by a query. This metric directly
relates to the I/O cost, and therefore performance, of most analytical queries.
Further, they are unable to exploit additional available storage to drive this
metric down further.
In this paper, we propose a new framework called a query-data routing tree,
or qd-tree, to address this problem, and propose two algorithms for its
construction based on greedy and deep reinforcement learning techniques.
Experiments over benchmark and real workloads show that a qd-tree can provide
physical speedups of more than an order of magnitude compared to current
blocking schemes, and can reach within 2X of the lower bound for data skipping
based on selectivity, while providing complete semantic descriptions of created
blocks.
Comment: ACM SIGMOD 202
Approximate Computing Survey, Part I: Terminology and Software & Hardware Approximation Techniques
The rapid growth of demanding applications in domains applying multimedia
processing and machine learning has marked a new era for edge and cloud
computing. These applications involve massive data and compute-intensive tasks,
and thus, typical computing paradigms in embedded systems and data centers are
stressed to meet the worldwide demand for high performance. Concurrently, the
landscape of the semiconductor field in the last 15 years has established power
as a first-class design concern. As a result, the community of computing
systems is forced to find alternative design approaches to facilitate
high-performance and/or power-efficient computing. Among the examined
solutions, Approximate Computing has attracted an ever-increasing interest,
with research works applying approximations across the entire traditional
computing stack, i.e., at the software, hardware, and architectural levels. Over
the last decade, a plethora of approximation techniques has emerged in software
(programs, frameworks, compilers, runtimes, languages), hardware (circuits,
accelerators), and architectures (processors, memories). The current article is
Part I of our comprehensive survey on Approximate Computing, and it reviews its
motivation, terminology, and principles, and it classifies and presents the
technical details of the state-of-the-art software and hardware approximation
techniques.
Comment: Under Review at ACM Computing Survey
Pervasive learning analytics for fostering learners' self-regulation
Today's tertiary STEM (Science, Technology, Engineering and Mathematics) education in Europe poses problems to both teachers and students.
With growing enrolment numbers, and numbers of teaching staff that are outmatched by this growth, student-teacher contact becomes more and more difficult to provide. Therefore, students are required to quickly adopt self-regulated and autonomous learning styles when entering European universities.
Furthermore, teachers are required to divide their attention between large numbers of students. As a consequence, classical teaching formats of STEM education, which often encompass experimentation or active exploration, become harder to implement.
Educational software holds the promise of easing these problems or, if not fully solving them, at least making them less acute: Learning Analytics generated by such software can foster self-regulation by providing students with both formative feedback and assessments. Educational software in the form of collaborative social media makes it easier for teachers to collaborate, allows them to reduce their workload, and enables learning and teaching formats otherwise infeasible in large classes.
The contribution of this thesis is threefold: Firstly, it reports on a social medium for tertiary STEM education called "Backstage2 / Projects" aimed specifically at these points: improving learners' self-regulation by providing pervasive Learning Analytics, fostering teacher collaboration so as to reduce their workload, and providing means to deploy a variety of classical and novel learning and teaching formats in large classes. Secondly, it reports on several case studies conducted with that medium which point to the effectiveness of the medium and its Learning Analytics in increasing learners' self-regulation, reducing teachers' workload, and improving how students learn.
Thirdly, this thesis reports on findings from Learning Analytics which could be used in the future in designing further teaching and learning formats or case studies, yielding a rich perspective for future research and indications for improving tertiary STEM education.
StreamLearner: Distributed Incremental Machine Learning on Event Streams: Grand Challenge
Today, massive amounts of streaming data from smart devices need to be
analyzed automatically to realize the Internet of Things. The Complex Event
Processing (CEP) paradigm promises low-latency pattern detection on event
streams. However, CEP systems need to be extended with Machine Learning (ML)
capabilities such as online training and inference in order to be able to
detect fuzzy patterns (e.g., outliers) and to improve pattern recognition
accuracy during runtime using incremental model training. In this paper, we
propose a distributed CEP system denoted as StreamLearner for ML-enabled
complex event detection. The proposed programming model and data-parallel
system architecture enable a wide range of real-world applications and allow
for dynamically scaling up and out system resources for low-latency,
high-throughput event processing. We show that the DEBS Grand Challenge 2017
case study (i.e., anomaly detection in smart factories) integrates seamlessly
into the StreamLearner API. Our experiments verify scalability and high event
throughput of StreamLearner.
Comment: Christian Mayer, Ruben Mayer, and Majd Abdo. 2017. StreamLearner:
Distributed Incremental Machine Learning on Event Streams: Grand Challenge.
In Proceedings of the 11th ACM International Conference on Distributed and
Event-based Systems (DEBS '17), 298-30