61 research outputs found

    Audio-Visual Egocentric Action Recognition

    Multimodal sentiment analysis in real-life videos

    This thesis extends the emerging field of multimodal sentiment analysis of real-life videos, taking two components into consideration: the emotion and the emotion's target. The emotion component of media is traditionally represented as a segment-based intensity model of emotion classes. This representation is replaced here by a value- and time-continuous view. Adjacent research fields, such as affective computing, have largely neglected the linguistic information available from automatic transcripts of audio-video material. As is demonstrated here, this text modality is well-suited for time- and value-continuous prediction. Moreover, source-specific problems, such as trustworthiness, have been largely unexplored so far. This work examines perceived trustworthiness of the source, and its quantification, in user-generated video data and presents a possible modelling path. Furthermore, the transfer between the continuous and discrete emotion representations is explored in order to summarise the emotional context at a segment level. The other component deals with the target of the emotion, for example, the topic the speaker is addressing. Emotion targets in a video dataset can, as is shown here, be coherently extracted based on automatic transcripts without limiting a priori parameters, such as the expected number of targets. Furthermore, alternatives to purely linguistic investigation in predicting targets, such as knowledge-bases and multimodal systems, are investigated. A new dataset is designed for this investigation, and, in conjunction with proposed novel deep neural networks, extensive experiments are conducted to explore the components described above. The developed systems show robust prediction results and demonstrate strengths of the respective modalities, feature sets, and modelling techniques. 
Finally, foundations are laid for cross-modal information prediction systems with applications to the correction of corrupted in-the-wild signals from real-life videos.

    Dutkat: A Privacy-Preserving System for Automatic Catch Documentation and Illegal Activity Detection in the Fishing Industry

    The United Nations' Sustainable Development Goal 14 aims to conserve and sustainably use the oceans and their resources for the benefit of people and the planet. This includes protecting marine ecosystems, preventing pollution and overfishing, and increasing scientific understanding of the oceans. Achieving this goal will help ensure the health and well-being of marine life and the millions of people who rely on the oceans for their livelihoods. In order to ensure sustainable fishing practices, it is important to have a system in place for automatic catch documentation. This thesis presents our research on the design and development of Dutkat, a privacy-preserving, edge-based system for catch documentation and detection of illegal activities in the fishing industry. Utilising machine learning techniques, Dutkat can analyse large amounts of data and identify patterns that may indicate illegal activities such as overfishing or illegal discard of catch. Additionally, the system can assist in catch documentation by automating the process of identifying and counting fish species, thus reducing potential human error and increasing efficiency. Specifically, our research has consisted of the development of various components of the Dutkat system, evaluation through experimentation, exploration of existing data, and organization of machine learning competitions. We have also implemented it from a compliance-by-design perspective to ensure that the system complies with data protection laws and regulations such as GDPR. Our goal with Dutkat is to promote sustainable fishing practices, in line with Sustainable Development Goal 14, while simultaneously protecting the privacy and rights of fishing crews.

    Alzheimer’s Dementia Recognition Through Spontaneous Speech

    Natural language processing methods for detecting and measuring the impact of scientific work beyond academia

    Scientific research has a profoundly important impact on our society and the environment. However, the multifaceted nature of this impact makes it particularly difficult to measure and, as shown in this thesis, it cannot be measured using traditional academic impact metrics that focus on counting citations and publications. Furthermore, existing societal and environmental impact metrics are only applicable to one scientific discipline or geography or are expensive processes run irregularly by government agencies. This thesis investigates natural language processing methods for identifying and measuring societal and environmental scientific impact and how such impact is reported in the news. A novel regression task and model are presented for identifying and quantifying this impact based on text extracted from scientific papers and news articles that discuss them. This is enabled by developing methods for linking and comparing news articles with academic papers that they discuss, whilst accounting for the structural and linguistic differences between the two types of document. Text encoding strategies for representation and comparison of long documents are also a focus of the thesis. A new cross-domain, co-reference resolution task between news articles and scientific papers is introduced so that co-referring entities may be used as anchors for aligning the two types of documents. Through comparisons of news article excerpts and sentences from corresponding scientific papers, it is shown that scientific discourse structure and argumentation in scientific papers are a likely predictor of which information will be presented prominently in news articles. This work introduces several novel natural language task settings for which no pre-existing datasets exist. 
This has necessitated the production of new human-annotated datasets, which were built using bespoke annotation tools that use semi-supervised learning to accelerate the labelling process and minimise the cognitive load of the task on the annotator. The thesis also makes use of low-resource approaches, including few-shot and multi-task learning, to facilitate the development of accurate models with small datasets. The resulting annotated datasets, annotation tools and guidelines, along with state-of-the-art machine learning models, are all made available as open assets. This thesis contributes new ways to measure the societal and environmental impact of scientific work and helps scientists and funding bodies understand how work is being used by others, justify the spending of public funding and inform better public engagement.

    LIPIcs, Volume 277, GIScience 2023, Complete Volume

    LIPIcs, Volume 277, GIScience 2023, Complete Volume

    Machine learning at the nanoscale

    Although scanning probe microscopy (SPM) techniques have allowed researchers to interact with the nanoscale for decades now, little improvement has been made to the highly manual, time-consuming process of setting up, running, and analysing the results of these experiments, a burden that often arises from the constantly varying shape of the probe apex. Unlike traditional computing methods, machine learning methods (neural networks in particular) are considerably more capable of automating subjective tasks such as these, and we are only just beginning to explore the potential applications of this technology in SPM. In this thesis we explore a number of areas where machine learning could substantially change the way we go about SPM experimentation. We begin by discussing the history, theory, and experimental concepts of scanning tunnelling microscopy (STM), atomic force microscopy (AFM), and normal-incidence X-ray standing wave (NIXSW). We then explore the makeup of a neural network and demonstrate how such networks can be applied to a variety of use-cases in SPM, including classification and policy prediction. Moving to the experimental chapters, we first discuss how we can successfully distinguish between STM tip states of the H:Si(100), Au(111) and Cu(111) surfaces. We also show that by adapting this network to work in real time, we improve performance while requiring on the order of 100x less data. We next discuss our attempts to combine these networks with expert examples to intelligently maintain tip apex sharpness during experimentation, envisioning an end-to-end automatic experiment. 
Because one of the main difficulties in applying machine learning is the frequent need to manually label data, we then show how we can use Monte Carlo simulations of self-organised AFM nanostructures to automatically label training data for a network, and then combine it with classical statistics and preprocessing to find specific structures in a mixed, messy dataset of real, experimental AFM images. As part of this, we also build a network to denoise experimental images. Finally, we present NIXSW results from an investigation into the temperature dependence of H2O@C60, discussing the potential to use unsupervised clustering techniques to distinguish between noisy, human-indistinguishable spectra to overcome limitations in data collection.

    Mobile Robots Navigation

    Mobile robot navigation includes different interrelated activities: (i) perception, as obtaining and interpreting sensory information; (ii) exploration, as the strategy that guides the robot to select the next direction to go; (iii) mapping, involving the construction of a spatial representation by using the sensory information perceived; (iv) localization, as the strategy to estimate the robot position within the spatial map; (v) path planning, as the strategy to find a path towards a goal location, whether optimal or not; and (vi) path execution, where motor actions are determined and adapted to environmental changes. The book addresses those activities by integrating results from the research work of several authors all over the world. Research cases are documented in 32 chapters organized within 7 categories, described next.