
    The Big Five: Addressing Recurrent Multimodal Learning Data Challenges

    The analysis of multimodal data in learning is a growing field of research, which has led to the development of different analytics solutions. However, there is no standardised approach to handling multimodal data. In this paper, we describe and outline solutions for five recurrent challenges in the analysis of multimodal data: data collection, storage, annotation, processing and exploitation. For each of these challenges, we envision possible solutions. The prototypes for some of the proposed solutions will be discussed during the Multimodal Challenge of the fourth Learning Analytics & Knowledge Hackathon, a two-day hands-on workshop in which the authors will open up the prototypes for trials, validation and feedback.

    Multimodal Challenge: Analytics Beyond User-computer Interaction Data

    This contribution describes one of the challenges explored in the Fourth LAK Hackathon. This challenge aims at shifting the focus from learning situations that can be easily traced through user-computer interaction data towards user-world interaction events, typical of co-located and practice-based learning experiences. This mission, pursued by the multimodal learning analytics (MMLA) community, seeks to bridge the gap between digital and physical learning spaces. The “multimodal” approach consists of combining learners’ motoric actions with physiological responses and data about the learning contexts. These data can be collected through multiple wearable sensors and Internet of Things (IoT) devices. This Hackathon table will confront three main challenges arising from the analysis and valorisation of multimodal datasets: 1) data collection and storage, 2) data annotation, and 3) data processing and exploitation. Some of the research questions that will be considered in this Hackathon challenge are: How can the raw sensor data streams be processed and relevant features extracted? Which data mining and machine learning techniques can be applied? How can two action recordings be compared? How can sensor data be combined with the Experience API (xAPI)? What are meaningful visualisations for these data?
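    One of the questions above asks how raw sensor data can be combined with the Experience API (xAPI). As a minimal sketch of what that combination could look like, the Python snippet below wraps a single (hypothetical) accelerometer sample in an xAPI statement; the verb and extension IRIs, the activity id, and the sensor fields are illustrative assumptions, not part of the challenge description.

```python
import json
from datetime import datetime, timezone

def sensor_sample_to_xapi(user_email, activity_id, sample):
    """Wrap one accelerometer sample in an xAPI statement (illustrative only)."""
    return {
        "actor": {"mbox": f"mailto:{user_email}", "objectType": "Agent"},
        "verb": {
            # Hypothetical verb IRI; a real deployment would pick one from a registry.
            "id": "http://example.org/verbs/performed",
            "display": {"en-US": "performed"},
        },
        "object": {
            "id": activity_id,
            "definition": {"name": {"en-US": "practice-based learning task"}},
            "objectType": "Activity",
        },
        "result": {
            # Raw sensor values travel in extensions, keyed by (hypothetical) IRIs.
            "extensions": {
                "http://example.org/xapi/accelerometer": {
                    "x": sample["x"], "y": sample["y"], "z": sample["z"],
                }
            }
        },
        "timestamp": sample.get("ts", datetime.now(timezone.utc).isoformat()),
    }

statement = sensor_sample_to_xapi(
    "learner@example.org",
    "http://example.org/activities/cpr-training",
    {"x": 0.12, "y": -0.98, "z": 0.05},
)
print(json.dumps(statement, indent=2))
```

    The resulting JSON could then be stored in a Learning Record Store alongside conventional user-computer interaction statements, giving digital and physical traces a common format.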

    The challenges of data in future pandemics

    This work was supported by Engineering and Physical Sciences Research Council (EPSRC) grant no. EP/R014604/1. M.C. was funded by the Engineering and Physical Sciences Research Council (EPSRC) grant no. EP/V054236/1. G.M. is supported by the Scottish Government’s Rural and Environment Science and Analytical Services Division (RESAS). J.P-G’s work is supported by funding from the UK Health Security Agency and the UK Department of Health and Social Care. L.P. is funded by the Wellcome Trust, UK and the Royal Society, UK (grant no. 202562/Z/16/Z). L.P. is supported by the UK Research and Innovation (UKRI) through the JUNIPER modelling consortium (grant no. MR/V038613/1). L.P. is also supported by The Alan Turing Institute for Data Science and Artificial Intelligence, UK. R.R. was supported by the Natural Environment Research Council (NERC) grant nos. NE/T004193/1 and NE/T010355/1, the Engineering and Physical Sciences Research Council (EPSRC) grant no. EP/V054236/1 and the Science and Technology Facilities Council (STFC) grant no. ST/V006126/1.

    The use of data has been essential throughout the unfolding COVID-19 pandemic. We have needed it to populate our models, inform our understanding, and shape our responses to the disease. However, data has not always been easy to find and access; it has varied in quality and coverage, and has been difficult to reuse or repurpose. This paper reviews these and other challenges and recommends steps to develop a data ecosystem better able to deal with future pandemics by better supporting preparedness, prevention, detection and response.

    Spatial and Temporal Sentiment Analysis of Twitter data

    The public worldwide have used Twitter to express opinions. This study focuses on the spatio-temporal variation of georeferenced Tweets’ sentiment polarity, with a view to understanding how opinions evolve on Twitter over space and time and across communities of users. More specifically, the question this study tested is whether sentiment polarity on Twitter exhibits specific time-location patterns. The aim of the study is to investigate the spatial and temporal distribution of georeferenced Twitter sentiment polarity within a 1 km buffer around the Curtin Bentley campus boundary in Perth, Western Australia. Tweets posted on campus were assigned to six spatial zones and four time zones. A sentiment analysis was then conducted for each zone using the sentiment analyser tool in the Starlight Visual Information System software. The Feature Manipulation Engine was employed to convert non-spatial files into spatial and temporal feature classes. The distribution of Twitter sentiment polarity over space and time was mapped using Geographic Information Systems (GIS). Some interesting results were identified. For example, the highest percentage of positive Tweets occurred in the social science area, while the science and engineering and dormitory areas had the highest percentage of negative postings. The number of negative Tweets increases in the library and science and engineering areas as the end of the semester approaches, reaching a peak around the exam period, while the percentage of negative Tweets drops at the end of the semester in the entertainment and sport and dormitory areas. This study provides some insights into understanding students’ and staff’s sentiment variation on Twitter, which could be useful for university teaching and learning management.
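    To make the zoning-and-aggregation step concrete, the sketch below assigns a few toy georeferenced tweets to named campus zones and tabulates sentiment polarity per zone and six-hour window. It is only an illustration of the general approach: the zone bounding boxes, column names, and the crude lexicon-based scorer are assumptions standing in for the campus polygons, the Feature Manipulation Engine conversion, and the Starlight sentiment analyser used in the study.

```python
import pandas as pd

# Illustrative zone bounding boxes (lon_min, lat_min, lon_max, lat_max); the real
# study used six campus zones defined by polygons, not rectangles.
ZONES = {
    "library":        (115.892, -32.007, 115.894, -32.005),
    "social_science": (115.894, -32.007, 115.896, -32.005),
    "engineering":    (115.896, -32.007, 115.898, -32.005),
}

def assign_zone(lon, lat):
    """Return the first zone whose bounding box contains the point, else None."""
    for name, (x0, y0, x1, y1) in ZONES.items():
        if x0 <= lon <= x1 and y0 <= lat <= y1:
            return name
    return None

def toy_polarity(text):
    """Crude lexicon scorer standing in for the Starlight sentiment analyser."""
    pos = {"good", "great", "love", "happy"}
    neg = {"bad", "hate", "stress", "fail"}
    words = set(text.lower().split())
    return (len(words & pos) > len(words & neg)) - (len(words & neg) > len(words & pos))

tweets = pd.DataFrame([
    {"text": "love the new library space", "lon": 115.893, "lat": -32.006,
     "time": "2015-05-20 10:15"},
    {"text": "exam stress is real", "lon": 115.897, "lat": -32.006,
     "time": "2015-06-10 22:40"},
])
tweets["zone"] = [assign_zone(r.lon, r.lat) for r in tweets.itertuples()]
tweets["polarity"] = tweets["text"].map(toy_polarity)
tweets["period"] = pd.to_datetime(tweets["time"]).dt.hour // 6  # four 6-hour windows

# Share of positive / neutral / negative tweets per zone and time window.
summary = tweets.groupby(["zone", "period"])["polarity"].value_counts(normalize=True)
print(summary)
```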

    European Handbook of Crowdsourced Geographic Information

    "This book focuses on the study of the remarkable new source of geographic information that has become available in the form of user-generated content accessible over the Internet through mobile and Web applications. The exploitation, integration and application of these sources, termed volunteered geographic information (VGI) or crowdsourced geographic information (CGI), offer scientists an unprecedented opportunity to conduct research on a variety of topics at multiple scales and for diversified objectives. The Handbook is organized in five parts, addressing the fundamental questions: What motivates citizens to provide such information in the public domain, and what factors govern/predict its validity?What methods might be used to validate such information? Can VGI be framed within the larger domain of sensor networks, in which inert and static sensors are replaced or combined by intelligent and mobile humans equipped with sensing devices? What limitations are imposed on VGI by differential access to broadband Internet, mobile phones, and other communication technologies, and by concerns over privacy? How do VGI and crowdsourcing enable innovation applications to benefit human society? Chapters examine how crowdsourcing techniques and methods, and the VGI phenomenon, have motivated a multidisciplinary research community to identify both fields of applications and quality criteria depending on the use of VGI. Besides harvesting tools and storage of these data, research has paid remarkable attention to these information resources, in an age when information and participation is one of the most important drivers of development. The collection opens questions and points to new research directions in addition to the findings that each of the authors demonstrates. Despite rapid progress in VGI research, this Handbook also shows that there are technical, social, political and methodological challenges that require further studies and research.

    Digital Traces of the Mind: Using Smartphones to Capture Signals of Well-Being in Individuals

    General context and questions. Adolescents and young adults typically use their smartphones several hours a day. Although there are concerns about how such behaviour might affect their well-being, the popularity of these powerful devices also opens novel opportunities for monitoring well-being in daily life. If successful, monitoring well-being in daily life provides novel opportunities to develop future interventions that deliver personalized support to individuals at the moment they require it (just-in-time adaptive interventions). Taking an interdisciplinary approach with insights from communication, computational, and psychological science, this dissertation investigated the relation between smartphone app use and well-being and developed machine learning models to estimate an individual’s well-being based on how they interact with their smartphone. To elucidate the relation between smartphone trace data and well-being and to contribute to the development of technologies for monitoring well-being in future clinical practice, this dissertation addressed two overarching questions. RQ1: Can we find empirical support for theoretically motivated relations between smartphone trace data and well-being in individuals? RQ2: Can we use smartphone trace data to monitor well-being in individuals?

    Aims. The first aim of this dissertation was to quantify the relation between the collected smartphone trace data and momentary well-being, at the sample level but also for each individual, following recent conceptual insights and empirical findings in psychological, communication, and computational science. A strength of this personalized (or idiographic) approach is that it allows us to capture how individuals might differ in how smartphone app use is related to their well-being. Considering such interindividual differences is important to determine whether some individuals might benefit from spending more time on their smartphone apps whereas others do not or even experience adverse effects. The second aim of this dissertation was to develop models for monitoring well-being in daily life. The present work pursued this transdisciplinary aim by taking a machine learning approach and evaluating to what extent an individual’s well-being can be estimated from their smartphone trace data. If such traces help to pinpoint when individuals are unwell, they might be a useful data source for developing future interventions that provide personalized support to individuals at the moment they require it (just-in-time adaptive interventions). With this aim, the dissertation follows current developments in psychoinformatics and psychiatry, where substantial research resources are invested in using smartphone traces and similar data (obtained with smartphone sensors and wearables) to develop technologies for detecting whether an individual is currently unwell or will be in the future.

    Data collection and analysis. This work combined novel data collection techniques (digital phenotyping and experience sampling methodology) for measuring smartphone use and well-being in the daily lives of 247 student participants. For a period of up to four months, a dedicated application installed on participants’ smartphones collected smartphone trace data. In the same period, participants completed a brief smartphone-based well-being survey five times a day (for 30 days in the first month and 30 days in the fourth month; up to 300 assessments in total). At each measurement, this survey comprised questions about the participants’ momentary level of procrastination, stress, and fatigue, while sleep duration was measured in the morning. Taking a time-series and machine learning approach to analysing these data, I provide the following contributions: Chapter 2 investigates the person-specific relation between passively logged usage of different application types and momentary subjective procrastination; Chapter 3 develops machine learning methodology to estimate sleep duration using smartphone trace data; Chapter 4 combines machine learning and explainable artificial intelligence to discover smartphone-tracked digital markers of momentary subjective stress; and Chapter 5 uses a personalized machine learning approach to evaluate whether smartphone trace data contain behavioral signs of fatigue. Collectively, these empirical studies provide preliminary answers to the overarching questions of this dissertation.

    Summary of results. With respect to the theoretically motivated relations between smartphone trace data and well-being (RQ1), we found that different patterns in smartphone trace data, from time spent on social network, messenger, video, and game applications to smartphone-tracked sleep proxies, are related to well-being in individuals. The strength and nature of this relation depend on the individual and the app usage pattern under consideration. The relation between smartphone app use patterns and well-being is limited in most individuals, but relatively strong in a minority. Whereas some individuals might benefit from using specific app types, others might experience decreases in well-being when spending more time on these apps. With respect to the question of whether we might use smartphone trace data to monitor well-being in individuals (RQ2), we found that smartphone trace data might be useful for this purpose in some individuals and to some extent. They appear most relevant in the context of sleep monitoring (Chapter 3) and have the potential to be included as one of several data sources for monitoring momentary procrastination (Chapter 2), stress (Chapter 4), and fatigue (Chapter 5) in daily life.

    Outlook. Future interdisciplinary research is needed to investigate whether the relationship between smartphone use and well-being depends on the nature of the activities performed on these devices, the content they present, and the context in which they are used. Answering these questions is essential to unravel the complex puzzle of developing technologies for monitoring well-being in daily life.
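    As an illustration of the personalized (idiographic) modelling approach described above, the sketch below fits and evaluates a separate model for every participant, predicting momentary stress from app-usage features with a time-ordered split. The column names, the ridge model, and the simulated data are assumptions for illustration; the dissertation’s actual feature sets, models, and evaluation protocols are richer.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Ridge
from sklearn.model_selection import TimeSeriesSplit

# Illustrative long-format data: one row per experience-sampling prompt.
# Column names and values are simulated assumptions.
rng = np.random.default_rng(42)
df = pd.DataFrame({
    "participant": np.repeat(["p01", "p02"], 60),
    "social_min":  rng.random(120) * 30,   # minutes on social apps before the prompt
    "game_min":    rng.random(120) * 20,
    "stress":      rng.random(120) * 100,  # momentary self-reported stress
})

features = ["social_min", "game_min"]

# Idiographic approach: fit and evaluate a separate model for every participant,
# respecting the temporal order of prompts with a rolling time-series split.
for pid, d in df.groupby("participant"):
    X, y = d[features].to_numpy(), d["stress"].to_numpy()
    errs = []
    for train_idx, test_idx in TimeSeriesSplit(n_splits=4).split(X):
        model = Ridge().fit(X[train_idx], y[train_idx])
        pred = model.predict(X[test_idx])
        errs.append(np.mean(np.abs(pred - y[test_idx])))
    print(f"{pid}: mean absolute error = {np.mean(errs):.1f}")
```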

    LIPIcs, Volume 277, GIScience 2023, Complete Volume

    LIPIcs, Volume 277, GIScience 2023, Complete Volume

    Democratizing machine learning

    Machine learning artifacts are increasingly embedded in society, often in the form of automated decision-making processes. One major reason for this, along with methodological improvements, is the increasing accessibility of data, but also of machine learning toolkits that open up machine learning methodology to non-experts. The core focus of this thesis is exactly this: democratizing access to machine learning in order to enable a wider audience to benefit from its potential.
    Contributions in this manuscript stem from several areas within this broad field. A major section is dedicated to automated machine learning (AutoML) and hyperparameter optimization, with the goal of abstracting away the tedious task of obtaining an optimal predictive model for a given dataset. This process mostly consists of finding said optimal model, often through hyperparameter optimization, while the user in turn only selects the appropriate performance metric(s) and validates the resulting models. This process can be improved or sped up by learning from previous experiments. Three such methods are presented in this thesis: one aims to obtain a fixed set of hyperparameter configurations that likely contains good solutions for any new dataset, and two use dataset characteristics to propose new configurations. The thesis furthermore presents a collection of the required experiment metadata and shows how such metadata can be used for the development of, and as a test bed for, new hyperparameter optimization methods. The pervasiveness of ML-derived models in many aspects of society simultaneously calls for increased scrutiny of how such models shape society and of the biases they may exhibit. This thesis therefore presents an AutoML tool that allows fairness considerations to be incorporated into the search for an optimal model. This requirement for fairness simultaneously poses the question of whether a model’s fairness can be reliably estimated, which is studied in a further contribution in this thesis. Since access to machine learning methods also heavily depends on access to software and toolboxes, several contributions in the form of software are part of this thesis. The mlr3pipelines R package allows models to be embedded in so-called machine learning pipelines, which include the pre- and postprocessing steps often required in machine learning and AutoML. The mlr3fairness R package, in turn, enables users to audit models for potential biases and to reduce those biases through different debiasing techniques. One such technique, multi-calibration, is published as a separate software package, mcboost.
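    As a generic illustration of the first of the three methods mentioned above, obtaining a fixed set of hyperparameter configurations that is likely to contain good solutions for new datasets, the sketch below greedily builds such a portfolio from a matrix of validation errors collected in previous experiments. The toy error matrix and the greedy selection criterion are assumptions for illustration, not the thesis's actual algorithm or meta-data.

```python
import numpy as np

# Toy meta-data: rows = previously seen datasets, columns = candidate hyperparameter
# configurations, entries = validation error of each configuration on each dataset.
# In practice this matrix would come from a large collection of prior experiments.
rng = np.random.default_rng(0)
errors = rng.random((20, 50))   # 20 datasets x 50 candidate configurations

def greedy_portfolio(errors, size):
    """Greedily pick configurations that minimise the average per-dataset error
    achieved by the best configuration in the portfolio so far."""
    n_datasets, n_configs = errors.shape
    portfolio = []
    best_so_far = np.full(n_datasets, np.inf)
    for _ in range(size):
        # For every remaining candidate, compute the portfolio error if it were added.
        candidate_scores = [
            np.minimum(best_so_far, errors[:, j]).mean() if j not in portfolio else np.inf
            for j in range(n_configs)
        ]
        choice = int(np.argmin(candidate_scores))
        portfolio.append(choice)
        best_so_far = np.minimum(best_so_far, errors[:, choice])
    return portfolio

# A portfolio of 5 configurations to try, in order, on any new dataset.
print(greedy_portfolio(errors, size=5))
```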