45 research outputs found

    Intelligent Data Analytics using Deep Learning for Data Science

    Get PDF
    Nowadays, data science stimulates the interest of academics and practitioners because it can assist in the extraction of significant insights from massive amounts of data. According to the International Data Corporation, the Global Datasphere is expected to grow from 33 Zettabytes in 2018 to 175 Zettabytes by 2025. This dissertation proposes an intelligent data analytics framework that uses deep learning to tackle several difficulties that arise when implementing a data science application. These difficulties include dealing with high inter-class similarity, the availability and quality of hand-labeled data, and designing a feasible approach for modeling significant correlations among features gathered from various data sources. The proposed intelligent data analytics framework employs a novel strategy for improving data representation learning by incorporating supplemental data from various sources and structures. First, the research presents a multi-source fusion approach that utilizes confident learning techniques to improve data quality from many noisy sources. Meta-learning methods based on advanced techniques such as the mixture of experts and differential evolution combine the predictive capacity of individual learners with a gating mechanism, ensuring that only the most trustworthy features or predictions are integrated to train the model. Then, a Multi-Level Convolutional Fusion is presented to train a model on the correspondence between local-global deep feature interactions to identify easily confused samples of different classes. The convolutional fusion is further enhanced with the power of Graph Transformers, aggregating the relevant neighboring features in graph-based input data structures and achieving state-of-the-art performance on a large-scale building damage dataset. Finally, weakly-supervised strategies, noise regularization, and label propagation are proposed to train a model on sparsely labeled input data, ensuring the model's robustness to errors and supporting automatic expansion of the training set. The suggested approaches outperformed competing strategies in effectively training a model on a large-scale dataset of 500k photos, with only about 7% of the images annotated by a human. The proposed framework's capabilities have benefited various data science applications, including fluid dynamics, geometric morphometrics, building damage classification from satellite imagery, disaster scene description, and storm-surge visualization.
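
    A minimal sketch of the confident-learning step described above, assuming out-of-sample predicted probabilities are available for each noisily labelled sample; the thresholding rule and names below are illustrative, not the dissertation's implementation.

        import numpy as np

        def flag_label_issues(pred_probs, noisy_labels):
            """Flag likely mislabelled samples using per-class self-confidence thresholds.

            pred_probs:   (n_samples, n_classes) out-of-sample predicted probabilities
            noisy_labels: (n_samples,) integer labels from a noisy source
            Returns a boolean mask marking samples whose given label looks unreliable.
            """
            n_classes = pred_probs.shape[1]
            # Per-class threshold: mean predicted probability of class j over
            # the samples that are labelled j (their average self-confidence).
            thresholds = np.array([
                pred_probs[noisy_labels == j, j].mean() if np.any(noisy_labels == j) else 1.0
                for j in range(n_classes)
            ])
            self_conf = pred_probs[np.arange(len(noisy_labels)), noisy_labels]
            # Suspect: under-confident in the given label while confidently
            # predicted as some class above that class's threshold.
            confident_elsewhere = (pred_probs >= thresholds).any(axis=1)
            return (self_conf < thresholds[noisy_labels]) & confident_elsewhere

        # Example: drop untrustworthy samples from one source before fusion.
        # mask = flag_label_issues(probs_from_source, labels_from_source)
        # keep_idx = np.where(~mask)[0]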

    Multi-source Multimodal Data and Deep Learning for Disaster Response: A Systematic Review.

    Get PDF
    Mechanisms for sharing information in a disaster situation have drastically changed due to new technological innovations throughout the world. The use of social media applications and collaborative technologies for information sharing has become increasingly popular. With these advancements, the amount of data collected increases daily in different modalities, such as text, audio, video, and images. However, to date, practical Disaster Response (DR) activities have mostly depended on textual information, such as situation reports and email content, and the benefit of other media is often not realised. Deep Learning (DL) algorithms have recently demonstrated promising results in extracting knowledge from multiple modalities of data, but the use of DL approaches for DR tasks has thus far mostly been pursued in an academic context. This paper conducts a systematic review of 83 articles to identify the successes, current and future challenges, and opportunities in using DL for DR tasks. Our analysis is centred around the components of learning, a set of aspects that govern the application of Machine Learning (ML) to a given problem domain. A flowchart and guidance for future research are developed as an outcome of the analysis to ensure the benefits of DL for DR activities are utilized.

    Weakly Supervised Learning for Multi-Image Synthesis

    Get PDF
    Machine learning-based approaches have been achieving state-of-the-art results on many computer vision tasks. While deep learning and convolutional networks have become incredibly popular, these approaches come at the expense of the huge amounts of labeled data required for training. Manually annotating large amounts of data, often millions of images in a single dataset, is costly and time-consuming. To deal with the problem of data annotation, the research community has been exploring approaches that require smaller amounts of labeled data. The central problem that we consider in this research is image synthesis without any manual labeling. Image synthesis is a classic computer vision task that requires understanding of image contents and their semantic and geometric properties. We propose that we can train image synthesis models by relying on video sequences and using weakly supervised learning. Large amounts of unlabeled data are freely available on the internet. We propose to set up the training in a multi-image setting so that we can use one of the images as the target; this allows us to rely only on images for training and removes the need for manual annotations. We demonstrate three main contributions in this work. First, we present a method of fusing multiple noisy overhead images to make a single, artifact-free image. We present a weakly supervised method that relies on crowd-sourced labels from online maps and a completely unsupervised variant that only requires a series of satellite images as inputs. Second, we propose a single-image novel view synthesis method for complex, outdoor scenes. We propose a learning-based method that uses pairs of nearby images captured on urban roads and their respective GPS coordinates as supervision. We show that a model trained with this automatically captured data can render a new view of a scene that can be as far as 10 meters from the input image. Third, we consider the problem of synthesizing new images of a scene under different conditions, such as time of day and season, based on a single input image. As opposed to existing methods, we do not need manual annotations of transient attributes, such as fog or snow, for training. We train our model by using streams of images captured from outdoor webcams and time-lapse videos. Through these applications, we show several settings where we can train state-of-the-art deep learning methods without manual annotations. This work focuses on three image synthesis tasks. We propose weakly supervised learning and remove the requirement for manual annotations by relying on sequences of images. Our approach is in line with research efforts that aim to minimize the labels required for training machine learning methods.
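
    A minimal sketch of the multi-image training setup described above, in PyTorch: one image of an unlabelled sequence is held out as the reconstruction target, so no manual annotation is needed. The tiny model and L1 loss are placeholder assumptions, not the thesis's architectures.

        import torch
        import torch.nn as nn

        class FusionNet(nn.Module):
            """Placeholder model: fuses a stack of k noisy views into one image."""
            def __init__(self, k_views=4):
                super().__init__()
                self.net = nn.Sequential(
                    nn.Conv2d(3 * k_views, 64, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(64, 3, 3, padding=1),
                )

            def forward(self, views):                     # views: (B, k, 3, H, W)
                b, k, c, h, w = views.shape
                return self.net(views.reshape(b, k * c, h, w))

        def training_step(model, sequence, optimizer):
            """sequence: (B, k+1, 3, H, W) images of the same scene, no labels.
            The last image serves as the target; the others are the inputs."""
            inputs, target = sequence[:, :-1], sequence[:, -1]
            pred = model(inputs)
            loss = nn.functional.l1_loss(pred, target)    # photometric reconstruction loss
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            return loss.item()

        # model = FusionNet(k_views=4)
        # opt = torch.optim.Adam(model.parameters(), lr=1e-4)
        # training_step(model, torch.rand(2, 5, 3, 64, 64), opt)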

    Pathways and challenges of the application of artificial intelligence to geohazards modelling

    Full text link
    The application of artificial intelligence (AI) and machine learning in geohazard modelling has been growing rapidly in recent years, a trend observed across several research and application areas thanks to recent advances in AI. As a result, the increasing reliance on data-driven studies has made their practical application to geohazards (landslides, debris flows, earthquakes, droughts, floods, glacier studies) an interesting prospect. These geohazards were responsible for roughly 80% of the economic loss caused by all natural hazards in the past two decades. The present study analyses the various domains of geohazards that have benefited from classical machine learning approaches and highlights future directions in this field. The emergence of deep learning has filled several gaps in: i) classification; ii) seasonal forecasting as well as forecasting at longer lead times; and iii) temporal change detection. Apart from the usual challenges of dataset availability, climate change, and anthropogenic activities, this review emphasizes that future studies should focus on consecutive events along with the integration of physical models. The recent catastrophes in Japan and Australia make a compelling argument for focusing on consecutive events. The availability of higher-temporal-resolution and multi-hazard datasets will prove essential, but the key will be to integrate them with physical models, which would improve our understanding of the mechanisms involved in both single and consecutive hazard scenarios. Geohazard modelling, like the geosciences more broadly, will eventually become a data problem, so it is essential to develop models capable of handling large volumes of data. Future work should also move towards interpretable models, with the hope of providing a reasonable explanation of the results and thereby achieving the ultimate goal of Explainable AI.

    Monument Monitor: using citizen science to preserve heritage

    Get PDF
    This research demonstrates how data collected by citizen scientists can act as a valuable resource for heritage managers. It establishes to what extent visitors’ photographs can be used to assist in aspects of condition monitoring, focusing on biological and plant growth, erosion, stone/mortar movement, water ingress/pooling, and antisocial behaviour. This thesis describes the methodology and outcomes of Monument Monitor (MM), a project set up in collaboration with Historic Environment Scotland (HES) that asked visitors at selected Scottish heritage sites to submit photographs of their visit. Across twenty case study sites, participants were asked to record evidence of a variety of conservation issues. Patterns of contributions to the project are presented alongside key stakeholder feedback, which show how MM was received and where data collection excelled. Alongside this, the software built to manage and sort submissions is presented as a scalable methodology for the collection of citizen-generated data on heritage sites. To demonstrate the applicability of citizen-generated data for in-depth monitoring and analysis, an environmental model is created using the submissions from one case study, which predicts the effect of the changing climate at the site between 1980 and 2080. Machine Learning (ML) is used to analyse submitted data in both classification and segmentation tasks. This application demonstrates the validity of using ML tools to assist in analysing and categorising volunteer-submitted photographs. The outcome of this PhD is a scalable methodology with which conservation staff can use visitor-submitted images as an evidence base to support them in the management of heritage sites.
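
    A minimal, hypothetical sketch of the kind of classification step mentioned above: fine-tuning a pretrained CNN to sort visitor photographs into condition categories. The category list and the torchvision backbone are illustrative assumptions, not the thesis's pipeline.

        import torch.nn as nn
        from torchvision import models

        # Hypothetical condition categories drawn from the issues monitored above.
        CLASSES = ["plant_growth", "erosion", "stone_movement",
                   "water_pooling", "antisocial_behaviour", "no_issue"]

        def build_classifier(num_classes=len(CLASSES)):
            """ImageNet-pretrained ResNet-18 with its head replaced for condition labels."""
            model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
            model.fc = nn.Linear(model.fc.in_features, num_classes)
            return model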

    PERICLES Deliverable 4.3:Content Semantics and Use Context Analysis Techniques

    Get PDF
    The current deliverable summarises the work conducted within task T4.3 of WP4, focusing on the extraction and subsequent analysis of semantic information from digital content, which is imperative for its preservability. More specifically, the deliverable defines content semantic information from a visual and textual perspective, explains how this information can be exploited in long-term digital preservation, and proposes novel approaches for extracting this information in a scalable manner. Additionally, the deliverable discusses novel techniques for retrieving and analysing the context of use of digital objects. Although this topic has not been extensively studied in the existing literature, we believe use context is vital in augmenting the semantic information and maintaining the usability and preservability of digital objects, as well as their ability to be accurately interpreted as initially intended.

    2019 EC3 July 10-12, 2019 Chania, Crete, Greece

    Get PDF

    Location Mention Prediction from Disaster Tweets

    Get PDF
    While utilizing Twitter data for crisis management is of interest to different response authorities, a critical challenge that hinders the utilization of such data is the scarcity of automated tools that extract and resolve geolocation information. This dissertation focuses on the Location Mention Prediction (LMP) problem, which consists of the Location Mention Recognition (LMR) and Location Mention Disambiguation (LMD) tasks. Our work contributes to studying two main factors that influence the robustness of LMP systems: (i) the dataset used to train the model, and (ii) the learning model. As for the training dataset, we study the best training and evaluation strategies to exploit existing datasets and tools at the onset of disaster events. We emphasize that the size of the training data matters and recommend considering the data domain, the disaster domain, and geographical proximity when training LMR models. We further construct the public IDRISI datasets, the largest English and first Arabic datasets to date for the LMP tasks. Rigorous analysis and experiments show that the IDRISI datasets are diverse and generalizable across domains and geographies compared to existing datasets. As for the learning models, the LMP tasks are understudied in the disaster management domain. To address this, we reformulate LMR and LMD modeling and evaluation to better suit the requirements of response authorities. Moreover, we introduce competitive and state-of-the-art LMR and LMD models that are compared against a representative set of baselines for both the Arabic and English languages.
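
    A minimal, self-contained sketch of the two LMP sub-tasks as framed above: LMR as collecting BIO-tagged location spans and LMD as resolving each span against a gazetteer. The toy gazetteer and exact-match lookup are illustrative assumptions, not the dissertation's models.

        from typing import List, Optional, Tuple

        # Toy gazetteer: place name -> (latitude, longitude). Illustrative entries only.
        GAZETTEER = {
            "doha": (25.29, 51.53),
            "new orleans": (29.95, -90.07),
        }

        def recognize_locations(tokens: List[str], bio_tags: List[str]) -> List[str]:
            """LMR post-processing: collect contiguous B-LOC/I-LOC spans.
            In practice the tags would come from a trained sequence tagger."""
            spans, current = [], []
            for tok, tag in zip(tokens, bio_tags):
                if tag == "B-LOC":
                    if current:
                        spans.append(" ".join(current))
                    current = [tok]
                elif tag == "I-LOC" and current:
                    current.append(tok)
                else:
                    if current:
                        spans.append(" ".join(current))
                    current = []
            if current:
                spans.append(" ".join(current))
            return spans

        def disambiguate(span: str) -> Optional[Tuple[float, float]]:
            """LMD: resolve a recognized mention to coordinates via gazetteer lookup."""
            return GAZETTEER.get(span.lower())

        tokens = ["Flooding", "reported", "in", "New", "Orleans", "tonight"]
        tags   = ["O", "O", "O", "B-LOC", "I-LOC", "O"]
        for mention in recognize_locations(tokens, tags):
            print(mention, "->", disambiguate(mention))   # New Orleans -> (29.95, -90.07)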

    Computational socioeconomics

    Get PDF
    Uncovering the structure of socioeconomic systems and timely estimation of socioeconomic status are significant for economic development. The understanding of socioeconomic processes provides foundations to quantify global economic development, to map regional industrial structure, and to infer individual socioeconomic status. In this review, we present a brief manifesto for a new interdisciplinary research field named Computational Socioeconomics, followed by a detailed introduction to data resources, computational tools, data-driven methods, theoretical models, and novel applications at multiple resolutions, including the quantification of global economic inequality and complexity, the mapping of regional industrial structure and urban perception, the estimation of individual socioeconomic status and demographics, and the real-time monitoring of emergent events. This review, together with the pioneering works we have highlighted, will draw increasing interdisciplinary attention and induce a methodological shift in future socioeconomic studies.

    Urban Informatics

    Get PDF
    This open access book is the first to systematically introduce the principles of urban informatics and their application to every aspect of the city that involves its functioning, control, management, and future planning. It introduces new models and tools being developed to understand and implement these technologies that enable cities to function more efficiently – to become ‘smart’ and ‘sustainable’. The smart city has quickly emerged as computers have become ever smaller to the point where they can be embedded into the very fabric of the city, as well as being central to new ways in which the population can communicate and act. When cities are wired in this way, they have the potential to become sentient and responsive, generating massive streams of ‘big’ data in real time as well as providing immense opportunities for extracting new forms of urban data through crowdsourcing. This book offers a comprehensive review of the methods that form the core of urban informatics, from various kinds of urban remote sensing to new approaches to machine learning and statistical modelling. It provides a detailed technical introduction to the wide array of tools information scientists need to develop the key urban analytics that are fundamental to learning about the smart city, and it outlines ways in which these tools can be used to inform design and policy so that cities can become more efficient, with a greater concern for the environment and equity.