3,184 research outputs found

    Detecting natural disasters, damage, and incidents in the wild

    Full text link
    Responding to natural disasters, such as earthquakes, floods, and wildfires, is a laborious task performed by on-the-ground emergency responders and analysts. Social media has emerged as a low-latency data source for quickly understanding disaster situations. While most studies of social media are limited to text, images offer more information for understanding disaster and incident scenes. However, no large-scale image dataset for incident detection exists. In this work, we present the Incidents Dataset, which contains 446,684 human-annotated images covering 43 incidents across a variety of scenes. We employ a baseline classification model that mitigates false-positive errors, and we perform image-filtering experiments on millions of social media images from Flickr and Twitter. Through these experiments, we show how the Incidents Dataset can be used to detect images with incidents in the wild. Code, data, and models are available online at http://incidentsdataset.csail.mit.edu. Comment: ECCV 202
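    One simple way such false-positive mitigation might work when filtering images in the wild is per-class confidence thresholding. The sketch below is illustrative only: the class names, scores, and thresholds are hypothetical and not taken from the Incidents Dataset release, whose actual model may differ.

```python
def filter_incidents(class_scores, thresholds, default_threshold=0.5):
    """Return incident labels whose confidence clears the class threshold.

    class_scores: dict mapping incident label -> model confidence in [0, 1]
    thresholds:   dict of per-class cutoffs; classes prone to false positives
                  can be given stricter (higher) thresholds.
    """
    detected = []
    for label, score in class_scores.items():
        if score >= thresholds.get(label, default_threshold):
            detected.append(label)
    return sorted(detected)

# Toy scores for one image; "wildfire" gets a stricter cutoff and is dropped.
scores = {"flood": 0.91, "wildfire": 0.32, "earthquake": 0.77}
print(filter_incidents(scores, {"wildfire": 0.6}))
```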

    Mining Twitter for crisis management: realtime floods detection in the Arabian Peninsula

    Get PDF
    A thesis submitted to the University of Bedfordshire, in partial fulfilment of the requirements for the degree of Doctor of Philosophy. In recent years, large amounts of data have been made available on microblog platforms such as Twitter; however, it is difficult to filter and extract information and knowledge from such data because of its high volume and noise. On Twitter, the general public are able to report real-world events such as floods in real time, acting as social sensors. Consequently, it is beneficial to have a method that can detect flood events automatically in real time to help governmental authorities, such as crisis management authorities, detect the event and make decisions during its early stages. This thesis proposes a real-time flood detection system that mines Arabic tweets using machine learning and data mining techniques. The proposed system comprises six main components: data collection, pre-processing, flooding event extraction, location inference, location named entity linking, and flooding event visualisation. An effective method of flood detection from Arabic tweets is presented and evaluated using supervised learning techniques. Furthermore, this work presents a location named entity inference method based on the Learning to Search method; the results show that the proposed method outperformed existing systems, with significantly higher accuracy in inferring flood locations from tweets written in colloquial Arabic. For location named entity linking, a method has been designed that utilises Google API services as a knowledge base to extract accurate geocode coordinates associated with the location named entities mentioned in tweets. The results show that the proposed linking method locates 56.8% of tweets within a distance of 0–10 km from the actual location. Further analysis has shown that the accuracy of locating tweets in the correct city and region is 78.9% and 84.2%, respectively.
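    The 0–10 km evaluation above presumably compares predicted geocode coordinates against ground truth by great-circle distance. The thesis excerpt does not name the formula; the haversine distance is a common choice and is sketched here:

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude
    points, using the haversine formula and a mean Earth radius of 6371 km."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# A predicted location counts as correct if it falls within 10 km of truth.
def within_10_km(pred, truth):
    return haversine_km(pred[0], pred[1], truth[0], truth[1]) <= 10.0
```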

    ARSTREAM: A Neural Network Model of Auditory Scene Analysis and Source Segregation

    Full text link
    Multiple sound sources often contain harmonics that overlap and may be degraded by environmental noise. The auditory system is capable of teasing apart these sources into distinct mental objects, or streams. Such an "auditory scene analysis" enables the brain to solve the cocktail party problem. A neural network model of auditory scene analysis, called the ARSTREAM model, is presented to propose how the brain accomplishes this feat. The model clarifies how the frequency components that correspond to a given acoustic source may be coherently grouped together into distinct streams based on pitch and spatial cues. The model also clarifies how multiple streams may be distinguished and separated by the brain. Streams are formed as spectral-pitch resonances that emerge through feedback interactions between the frequency-specific spectral representation of a sound source and its pitch. First, the model transforms a sound into a spatial pattern of frequency-specific activation across a spectral stream layer. The sound has multiple parallel representations at this layer. A sound's spectral representation activates a bottom-up filter that is sensitive to harmonics of the sound's pitch. The filter activates a pitch category which, in turn, activates a top-down expectation that allows one voice or instrument to be tracked through a noisy multiple-source environment. Spectral components are suppressed if they do not match harmonics of the top-down expectation that is read out by the selected pitch, thereby allowing another stream to capture these components, as in the "old-plus-new heuristic" of Bregman. Multiple simultaneously occurring spectral-pitch resonances can hereby emerge. These resonance and matching mechanisms are specialized versions of Adaptive Resonance Theory, or ART, which clarifies how pitch representations can self-organize during learning of harmonic bottom-up filters and top-down expectations. 
    The model also clarifies how spatial location cues can help to disambiguate two sources with similar spectral cues. Data are simulated from psychophysical grouping experiments, such as how a tone sweeping upwards in frequency creates a bounce percept by grouping with a downward-sweeping tone due to proximity in frequency, even if noise replaces the tones at their intersection point. Illusory auditory percepts are also simulated, such as the auditory continuity illusion of a tone continuing through a noise burst even if the tone is not present during the noise, and the scale illusion of Deutsch whereby downward and upward scales presented alternately to the two ears are regrouped based on frequency proximity, leading to a bounce percept. Since related sorts of resonances have been used to quantitatively simulate psychophysical data about speech perception, the model strengthens the hypothesis that ART-like mechanisms are used at multiple levels of the auditory system. Proposals for developing the model to explain more complex streaming data are also provided. Air Force Office of Scientific Research (F49620-01-1-0397, F49620-92-J-0225); Office of Naval Research (N00014-01-1-0624); Advanced Research Projects Agency (N00014-92-J-4015); British Petroleum (89A-1204); National Science Foundation (IRI-90-00530); American Society of Engineering Education
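    The bottom-up harmonic filter described above can be caricatured as a harmonic sieve: each candidate pitch is scored by the spectral energy found near its integer harmonics, and the winning pitch selects the components for its stream. This is a hedged toy sketch of the filtering idea only, not the model's actual ART resonance dynamics; the spectrum, tolerances, and candidate pitches are made up.

```python
def pitch_score(spectrum, f0, n_harmonics=5, tol=2.0):
    """Sum the spectral energy lying within tol Hz of the first
    n_harmonics integer multiples of candidate pitch f0 (Hz)."""
    score = 0.0
    for k in range(1, n_harmonics + 1):
        target = k * f0
        for freq, energy in spectrum.items():
            if abs(freq - target) <= tol:
                score += energy
    return score

# Toy spectrum: a 100 Hz source (100/200/300) plus an unrelated 250 Hz component.
spectrum = {100: 1.0, 200: 0.8, 300: 0.6, 250: 0.5}
best = max([100, 125, 250], key=lambda f0: pitch_score(spectrum, f0))
```

The 100 Hz candidate wins because three components fall on its harmonics; the leftover 250 Hz component remains free to be captured by a second stream, loosely mirroring the "old-plus-new" behaviour.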

    A deep action-oriented video image classification system for text detection and recognition

    Get PDF
    For video images with complex actions, achieving accurate text detection and recognition results is very challenging. This paper presents a hybrid model for the classification of action-oriented video images which reduces the complexity of the problem to improve text detection and recognition performance. Here, we consider five categories of genres, namely concert, cooking, craft, teleshopping, and yoga. For classifying action-oriented video images, we explore ResNet50 for learning general pixel-distribution-level information; a VGG16 network is implemented for learning the features of Maximally Stable Extremal Regions; and another VGG16 is used for learning facial components obtained by a multi-task cascaded convolutional network. The approach integrates the outputs of the three above-mentioned models using a fully connected neural network for classification of the five action-oriented image classes. We demonstrate the efficacy of the proposed method by testing on our dataset and two other standard datasets, namely the Scene Text dataset, which contains 10 classes of scene images with text information, and the Stanford 40 Actions dataset, which contains 40 action classes without text information. Our method outperforms related existing work and significantly enhances the class-specific performance of text detection and recognition.
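    The integration step above is a late-fusion design: the three branch networks each emit a feature vector, the vectors are concatenated, and a fully connected layer classifies the result. A minimal dependency-free sketch of that idea follows; the one-dimensional branch features, weights, and class names are toy values, not the paper's trained parameters.

```python
def fuse_features(pixel_feats, mser_feats, face_feats):
    """Late fusion: concatenate the three per-branch feature vectors."""
    return list(pixel_feats) + list(mser_feats) + list(face_feats)

def classify(fused, weight_rows, class_names):
    """A single dense layer: one weight row per class, argmax of dot products."""
    scores = [sum(w * x for w, x in zip(row, fused)) for row in weight_rows]
    return class_names[scores.index(max(scores))]

# Toy example: each branch outputs a single scalar feature.
fused = fuse_features([1.0], [0.0], [1.0])
weights = [[1.0, 0.0, 0.0],   # row for "concert"
           [0.0, 0.0, 2.0]]   # row for "cooking"
label = classify(fused, weights, ["concert", "cooking"])
```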

    Methods for Detecting Floodwater on Roadways from Ground Level Images

    Get PDF
    Recent research and statistics show that the frequency of flooding in the world has been increasing, severely impacting flood-prone communities. This natural disaster causes significant damage to human life and property, inundates roads, overwhelms drainage systems, and disrupts essential services and economic activities. The focus of this dissertation is to use machine learning methods to automatically detect floodwater in ground-level images in support of the frequently impacted communities. The ground-level images can be retrieved from multiple sources, including those taken by mobile-phone cameras as communities record the state of their flooded streets. The model developed in this research processes these images at multiple levels. The first detection model investigates the presence of flood in images by developing and comparing image classifiers with various feature extractors. Local Binary Patterns (LBP), Histogram of Oriented Gradients (HOG), and pretrained convolutional neural networks are used as feature extractors. Then, decision tree, logistic regression, and K-Nearest Neighbors (K-NN) models are trained and tested for predicting floodwater presence in the image. Once the model detects flood in an image, it moves to the second layer to detect the presence of floodwater at the pixel level. This pixel-level identification is achieved by semantic segmentation, using both a super-pixel-based prediction method and Fully Convolutional Neural Networks (FCNs). First, the SLIC super-pixel method is used to create the super-pixels; then the same types of classifiers as in the initial classification method are trained to predict the class of each super-pixel. Later, the FCN is trained end-to-end without any additional classifiers. Once these processes are done, images are segmented into regions of floodwater at the pixel level. 
    In both the classification and semantic segmentation tasks, deep learning-based methods showed the best results. Once the model receives confirmation of flood detection at the image and pixel layers, it moves to the final task of estimating the floodwater depth in images. This third and final layer of the model is critical, as it can help officials deduce the severity of the flood in a given area. In order to detect the depth of the water and the severity of the flooding, the model processes the cars on streets that are in water and calculates the percentage of the tires that are under water. This calculation is achieved with a mixture of deep learning and classical computer vision techniques. There are four main processes in this task: (i) semantic segmentation of the image into pixels belonging to background, floodwater, and wheels of vehicles, done by multiple FCN models trained with various base models; (ii) object detection for the tires, which are identified by a You Only Look Once (YOLO) object detector; (iii) improvement of the initial segmentation results, for which a U-Net-like semantic segmentation network is proposed that takes the tire patches from the object detector together with the corresponding initial segmentation results and learns to fix the errors of the initial segmentation; and (iv) calculation of water depth as the ratio of the tire wheel under the water. This final task uses the improved segmentation results to identify the ellipses that correspond to the wheel parts of vehicles and utilizes two approaches as part of a hybrid method: (i) using the improved segmentation results, which return the pixels belonging to the wheels, from which the wheel boundaries are found; and (ii) finding arcs that belong to elliptical objects by applying a series of image processing methods. 
    This method connects the arcs found to build larger structures such as two-piece (half-ellipse), three-piece, or four-piece (full) ellipses. Once the ellipse boundary is calculated using both methods, the ratio of the ellipse under floodwater can be calculated. This novel multi-model system allows us to attribute potential prediction errors to the different parts of the model, such as the semantic segmentation of the image or the calculation of the elliptical boundary. To verify the applicability of the proposed methods and to train the models, extensive hand-labeled datasets were created as part of this dissertation. The initial images were collected from the web; then the datasets were enriched with images created from virtual environments, simulating neighborhoods under flood using the Unity software. In conclusion, the proposed methods in this dissertation, as validated on the labeled datasets, can successfully classify images as flood scenes, semantically segment the regions of flood, and predict the depth of water to indicate severity.
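    The final depth estimate reduces to geometry: given the recovered wheel boundary and the waterline, compute the fraction of the wheel below the water. As a hedged simplification of the dissertation's ellipse-based method, the sketch below models the wheel as a circle and uses the circular-segment area formula; image coordinates are assumed, with y increasing downward so pixels below the waterline are under water.

```python
import math

def submerged_fraction(cy, r, waterline_y):
    """Fraction of a circular wheel (centre row cy, radius r, in pixels)
    lying below the horizontal waterline at row waterline_y.
    Image convention: y grows downward, so y > waterline_y is under water."""
    d = waterline_y - cy              # signed centre-to-waterline distance
    if d >= r:
        return 0.0                    # waterline below the wheel: dry
    if d <= -r:
        return 1.0                    # waterline above the wheel: submerged
    # Area of the circular segment on the submerged side of the chord.
    area = r * r * math.acos(d / r) - d * math.sqrt(r * r - d * d)
    return area / (math.pi * r * r)
```

A waterline through the wheel centre gives exactly one half, which is a convenient sanity check for the formula.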

    Human Resource Management in Emergency Situations

    Get PDF
    The dissertation examines issues related to human resource management in emergency situations and introduces measures to help solve these issues. The prime aim is a complex analysis of human resource management and the built environment resilience management life cycle and its stages, for the purpose of creating an effective Human Resource Management in Emergency Situations Model and Intelligent System. This would help accelerate resilience at every stage, manage personal stress, and reduce disaster-related losses. The dissertation consists of an Introduction, three Chapters, the Conclusions, References, a List of the Author's Publications, and nine Appendices. The Introduction discusses the research problem and its relevance, outlines the research object, states the research aim and objectives, overviews the research methodology and the original contribution of the research, presents the practical value of the research results, and lists the defended propositions. The Introduction concludes with an overview of the author's publications and conference presentations on the topic of this dissertation. Chapter 1 introduces best practice in the field of disaster and resilience management in the built environment. It also analyses the disaster and resilience management life cycle and its stages, reviews different intelligent decision support systems, and investigates research on the application of physiological parameters and their dependence on stress. The chapter ends with conclusions and the explicit objectives of the dissertation. Chapter 2 introduces the conceptual model of human resource management in emergency situations. To implement multiple criteria analysis of the research object, methods of multiple criteria analysis and mathematics are proposed; they should be integrated with intelligent technologies.
    In Chapter 3, the model developed by the author and the methods of multiple criteria analysis are applied in developing the Intelligent Decision Support System for Human Resource Management in Emergency Situations, consisting of four subsystems: a Physiological Advisory Subsystem to Analyse a User's Post-Disaster Stress Management; a Text Analytics Subsystem; a Recommender Thermometer for Measuring the Preparedness for Resilience; and a Subsystem of Integrated Virtual and Intelligent Technologies. The main statements of the thesis were published in eleven scientific articles: two in journals listed in the Thomson Reuters ISI Web of Science, one in a peer-reviewed scientific journal, four in peer-reviewed conference proceedings referenced in the Thomson Reuters ISI database, and three in peer-reviewed conference proceedings in Lithuania. Five presentations were given on the topic of the dissertation at conferences in Lithuania and other countries.
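    The multiple criteria analysis mentioned above can be illustrated with simple additive weighting (SAW), one of the most basic MCA methods; the excerpt does not name the dissertation's specific methods, so this is a generic sketch with hypothetical alternatives and weights.

```python
def saw_rank(alternatives, weights):
    """Simple additive weighting over benefit criteria.

    alternatives: dict mapping name -> list of criterion values
    weights:      one weight per criterion
    Each criterion column is normalised by its maximum, then the weighted
    sums are ranked from best to worst."""
    n = len(weights)
    maxima = [max(vals[i] for vals in alternatives.values()) or 1.0
              for i in range(n)]
    scores = {
        name: sum(w * v / m for w, v, m in zip(weights, vals, maxima))
        for name, vals in alternatives.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

# Toy example: two response plans scored on two equally weighted criteria.
ranking = saw_rank({"plan A": [1.0, 0.0], "plan B": [0.5, 1.0]}, [0.5, 0.5])
```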

    TRECVID 2004 - an overview

    Get PDF

    Literature review of the remote sensing of natural resources

    Get PDF
    Abstracts of 596 documents related to remote sensors or the remote sensing of natural resources by satellite, aircraft, or ground-based stations are presented. Topics covered include general theory, geology and hydrology, agriculture and forestry, marine sciences, urban land use, and instrumentation. Recent documents not yet cited in any of the seven information sources used for the compilation are summarized. An author/keyword index is provided.