    Automated Classification of Airborne Laser Scanning Point Clouds

    Making sense of the physical world has always been at the core of mapping. Until recently, this has always depended on the human eye. Using airborne lasers, it has become possible to quickly "see" more of the world in many more dimensions. The resulting enormous point clouds serve as data sources for applications far beyond the original mapping purposes, ranging from flood protection and forestry to threat mitigation. In order to process these large quantities of data, novel methods are required. In this contribution, we develop models to automatically classify ground cover and soil types. Using the logic of machine learning, we critically review the advantages of supervised and unsupervised methods. Focusing on decision trees, we improve accuracy by including beam vector components and using a genetic algorithm. We find that our approach delivers consistently high-quality classifications, surpassing classical methods.

    A Diagram Is Worth A Dozen Images

    Diagrams are common tools for representing complex concepts, relationships and events, often when it would be difficult to portray the same information with natural images. Understanding natural images has been extensively studied in computer vision, while diagram understanding has received little attention. In this paper, we study the problem of diagram interpretation and reasoning, the challenging task of identifying the structure of a diagram and the semantics of its constituents and their relationships. We introduce Diagram Parse Graphs (DPG) as our representation to model the structure of diagrams. We define syntactic parsing of diagrams as learning to infer DPGs for diagrams, and we study semantic interpretation and reasoning of diagrams in the context of diagram question answering. We devise an LSTM-based method for syntactic parsing of diagrams and introduce a DPG-based attention model for diagram question answering. We compile a new dataset of diagrams with exhaustive annotations of constituents and relationships for over 5,000 diagrams and 15,000 questions and answers. Our results show the significance of our models for syntactic parsing and question answering in diagrams using DPGs.

    Geoadditive Regression Modeling of Stream Biological Condition

    Indices of biotic integrity (IBI) have become an established tool to quantify the condition of small non-tidal streams and their watersheds. To investigate the effects of watershed characteristics on stream biological condition, we present a new technique for regressing IBIs on watershed-specific explanatory variables. Since IBIs are typically evaluated on an ordinal scale, our method is based on the proportional odds model for ordinal outcomes. To avoid overfitting, we do not use classical maximum likelihood estimation but a component-wise functional gradient boosting approach. Because component-wise gradient boosting has an intrinsic mechanism for variable selection and model choice, determinants of biotic integrity can be identified. In addition, the method offers a relatively simple way to account for spatial correlation in ecological data. An analysis of the Maryland Biological Streams Survey shows that nonlinear effects of predictor variables on stream condition can be quantified while, in addition, accurate predictions of biological condition at unsurveyed locations are obtained.
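
The proportional odds model with an additive predictor and component-wise boosting, as described in the abstract above, is commonly written as shown below. This is a sketch in standard textbook notation, not necessarily the authors' exact formulation: the symbols $K$, $\theta_k$, $f_j$, $f_{\mathrm{spat}}$, $\nu$ and the sign convention are assumptions introduced here for illustration.

```latex
\begin{align}
P(Y \le k \mid x, s) &= \frac{1}{1 + \exp\{-(\theta_k - \eta(x, s))\}},
  \quad k = 1, \dots, K-1, \\
\theta_1 &< \theta_2 < \dots < \theta_{K-1}, \\
\eta(x, s) &= \sum_{j=1}^{p} f_j(x_j) + f_{\mathrm{spat}}(s), \\
\hat{\eta}^{[m]} &= \hat{\eta}^{[m-1]} + \nu \, \hat{f}^{[m]}_{j^*},
  \quad 0 < \nu \le 1 .
\end{align}
```

Here $Y$ is the ordinal IBI class with $K$ levels, $x$ collects the watershed covariates and $s$ is the sampling location. In boosting iteration $m$ only the best-fitting component $j^*$ is updated, which is the source of the variable selection mentioned above, and stopping the iterations early limits overfitting.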

    Analyzing Twitter Feeds to Facilitate Crises Informatics and Disaster Response During Mass Emergencies

    It is common practice these days for the general public to use various micro-blogging platforms, predominantly Twitter, to share ideas, opinions and information about things and life. Twitter is also increasingly used as a source of information sharing during natural disasters and mass emergencies to update and communicate the extent of the geographic phenomena, report the affected population and casualties, request or provide volunteering services, and share the status of the disaster recovery process initiated by humanitarian-aid and disaster-management organizations. Recent research in this area has affirmed the potential use of such social media data for various disaster response tasks. Even though social media data is massive, open and free, there is a significant limitation in making sense of this data because of its high volume, variety, velocity, value, variability and veracity. The current work provides a comprehensive framework of text processing and analysis performed on several thousand tweets shared on Twitter during natural disaster events. Specifically, this work employs state-of-the-art machine learning techniques from natural language processing on tweet content to process the very large volume of data generated at the time of disasters. This study shall serve as a basis for providing useful, actionable information to crisis management and mitigation teams in planning and preparing an effective disaster response, and for facilitating the development of future automated systems for handling crisis situations.

    Analysing spatial patterns of coastal cultural ecosystem services using Flickr and Wikipedia data

    The world’s coasts provide many ecosystem services that benefit human well-being. Coastal ecosystems are particularly important, as a third of humanity lives within 100 kilometres of the coast. They supply provisioning, regulating, supporting and cultural services that sustain human livelihoods. Cultural Ecosystem Services (CES) are immaterial services related to recreational, inspirational and social values that humans can benefit from. CES are very abundant at the coast and have increasingly been subject to research in the past years. In order to properly manage and protect ecosystems, it is important to be aware of which types of CES are provided and where they are located. To study the geographies of these services at the coast and inland, laborious PPGIS field studies have been conducted, which are usually limited in spatial and temporal scale. In recent years it has become common to research CES through the analysis of Volunteered Geographic Data. The most popular source for these studies is the photo-sharing platform Flickr, as it offers easy and free access, contains abundant information about CES and allows users to geographically reference their content. The information conveyed by the image and its metadata (titles, tags, descriptions) has thus been used to differentiate and spatially analyse various types of CES. Dependence on a single source of data introduces different kinds of bias that may misrepresent the perspectives of certain demographics and their posting behaviour. Including a secondary data source for reference could help to complement the perspectives of Flickr data and visualise where these sources agree or disagree. Wikipedia offers a large repository of geolocated text data that could provide valuable information about CES across the landscape. This source has rarely been considered in research so far, and thus this study aims to find out whether Wikipedia is a suitable resource for spatial research on CES by comparing it to Flickr data.
    The initial objective was to automatically classify Flickr posts by the CES class they represent, on a large scale along the entire East Coast of Britain. This was achieved using a Random Forest machine learning algorithm that uses the user-assigned tags to allot one of three CES classes selected for this study (Landscape Appreciation, Historical Monuments and Nature Appreciation). Using a limited set of variables, it was demonstrated that a fairly accurate prediction can be achieved with this method. The main objective of this study is to compare the information content relating to CES and the spatial patterns between the Flickr classification and the Wikipedia article data set. The results were mixed and contained some uncertainties regarding data quality. Matching terms and concepts could be found in Wikipedia and Flickr data, and there was also some spatial overlap between the two data sources. There was also a correlation between Wikipedia articles containing relevant information for a CES and the number of related Flickr posts in the vicinity. However, the large differences in sample size and the ambiguous spatial representation of Wikipedia articles introduce significant uncertainty to the results. As a secondary objective, the Flickr classification was visually assessed for significant spatial patterns and the observations were compared against established literature. The spatial patterns agree with related research, as the data points are often concentrated close to accessible places (cities, close to roads) and touristic places (e.g. castles, old towns). A large part of the posts are also located very close to the coastline, most strikingly the posts referring to landscapes. The co-occurrence of CES, so-called bundles, was studied as well by measuring the spatial correlation between the classes. A significant correlation was found between the posts of the classes Landscape Appreciation and Historical Monuments.
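
One simple way to measure the spatial correlation between two classes of posts, as mentioned in the abstract above, is to count posts per grid cell and correlate the counts. The sketch below is only an illustration under stated assumptions, not the method used in the study: the function names, the square grid, the `cell_size` parameter, the restriction to cells containing at least one post, and the use of Pearson's coefficient are all hypothetical choices.

```python
from collections import Counter
from math import sqrt


def grid_counts(points, cell_size):
    """Count points per square grid cell (assumed projected x/y coordinates)."""
    return Counter(
        (int(x // cell_size), int(y // cell_size)) for x, y in points
    )


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for constant counts")
    return cov / sqrt(var_a * var_b)


def class_correlation(points_a, points_b, cell_size):
    """Correlate per-cell counts of two classes.

    Assumption: only cells containing at least one point of either
    class are compared; empty cells are ignored.
    """
    counts_a = grid_counts(points_a, cell_size)
    counts_b = grid_counts(points_b, cell_size)
    cells = sorted(set(counts_a) | set(counts_b))
    # Counter returns 0 for cells where a class has no points.
    return pearson([counts_a[c] for c in cells], [counts_b[c] for c in cells])


# Hypothetical toy data: two classes with identical per-cell counts.
landscape = [(0.5, 0.5), (0.6, 0.4), (1.5, 0.5)]
monuments = [(0.1, 0.1), (0.2, 0.9), (1.2, 0.3)]
r = class_correlation(landscape, monuments, cell_size=1.0)
```

The result depends strongly on the chosen cell size, so in practice several sizes would need to be compared before drawing conclusions about bundles.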

    A Novel Hybrid Classification Approach for Sentiment Analysis of Text Document

    Sentiment analysis is a popular and highly active area of research in natural language processing. It assigns a negative or positive polarity to one or more entities using different natural language processing tools, and the sentiment classifiers used for this range from high to low performance. Our approach focuses on the analysis of sentiment expressed in product reviews using original text search techniques. These reviews can be classified as having a positive or negative sentiment based on certain aspects in relation to a term-based query. In this paper, we use two machine learning methods for classification, Support Vector Machines (SVM) and Random Forest, and we introduce a novel hybrid approach to classify product reviews from Amazon. This is useful for consumers who want to research the sentiment around products before purchase, or for companies that want to monitor public sentiment towards their brands. The results show that the proposed method outperforms both individual classifiers on this Amazon dataset.