Automated Classification of Airborne Laser Scanning Point Clouds
Making sense of the physical world has always been at the core of mapping. Up
until recently, this has always depended on using the human eye. Using
airborne lasers, it has become possible to quickly "see" more of the world in
many more dimensions. The resulting enormous point clouds serve as data sources
for applications far beyond the original mapping purposes, ranging from flood
protection and forestry to threat mitigation. In order to process these large
quantities of data, novel methods are required. In this contribution, we
develop models to automatically classify ground cover and soil types. Using the
logic of machine learning, we critically review the advantages of supervised
and unsupervised methods. Focusing on decision trees, we improve accuracy by
including beam vector components and using a genetic algorithm. We find that
our approach delivers consistently high quality classifications, surpassing
classical methods.
A Diagram Is Worth A Dozen Images
Diagrams are common tools for representing complex concepts, relationships
and events, often when it would be difficult to portray the same information
with natural images. Understanding natural images has been extensively studied
in computer vision, while diagram understanding has received little attention.
In this paper, we study the problem of diagram interpretation and reasoning,
the challenging task of identifying the structure of a diagram and the
semantics of its constituents and their relationships. We introduce Diagram
Parse Graphs (DPG) as our representation to model the structure of diagrams. We
define syntactic parsing of diagrams as learning to infer DPGs for diagrams and
study semantic interpretation and reasoning of diagrams in the context of
diagram question answering. We devise an LSTM-based method for syntactic
parsing of diagrams and introduce a DPG-based attention model for diagram
question answering. We compile a new dataset of diagrams with exhaustive
annotations of constituents and relationships for over 5,000 diagrams and
15,000 questions and answers. Our results show the significance of our models
for syntactic parsing and question answering in diagrams using DPGs.
Geoadditive Regression Modeling of Stream Biological Condition
Indices of biotic integrity (IBI) have become an established tool to quantify the condition of small non-tidal streams and their watersheds. To investigate the effects of watershed characteristics on stream biological condition, we present a new technique for regressing IBIs on watershed-specific explanatory variables. Since IBIs are typically evaluated on an ordinal scale, our method is based on the proportional odds model for ordinal outcomes. To avoid overfitting, we do not use classical maximum likelihood estimation but a component-wise functional gradient boosting approach. Because component-wise gradient boosting has an intrinsic mechanism for variable selection and model choice, determinants of biotic integrity can be identified. In addition, the method offers a relatively simple way to account for spatial correlation in ecological data. An analysis of the Maryland Biological Streams Survey shows that nonlinear effects of predictor variables on stream condition can be quantified while, in addition, accurate predictions of biological condition at unsurveyed locations are obtained.
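As a rough illustration of the proportional odds model named in this abstract, the sketch below turns a predictor value into ordinal class probabilities. It assumes one common parameterization, P(Y <= k | x) = 1 / (1 + exp(-(theta_k - eta))). The function name `proportional_odds_probs`, the predictor value `eta` and the threshold values are hypothetical and not taken from the paper. The boosting-based estimation described in the abstract is not shown.

```python
import math


def proportional_odds_probs(eta, thresholds):
    """Return class probabilities for an ordinal outcome with K classes.

    Assumed parameterization (illustrative, not taken from the paper):
        P(Y <= k | x) = 1 / (1 + exp(-(theta_k - eta)))
    where `eta` is the predictor value for one observation and
    `thresholds` holds the K - 1 non-decreasing cut points theta_k.
    """
    if list(thresholds) != sorted(thresholds):
        raise ValueError("thresholds must be non-decreasing")

    # Cumulative probabilities P(Y <= k) for k = 1..K-1, then P(Y <= K) = 1.
    cumulative = [1.0 / (1.0 + math.exp(-(theta - eta))) for theta in thresholds]
    cumulative.append(1.0)

    # Class probabilities are differences of consecutive cumulative values.
    probs = []
    previous = 0.0
    for value in cumulative:
        probs.append(value - previous)
        previous = value
    return probs


# Hypothetical example: three ordered condition classes, two cut points.
example = proportional_odds_probs(eta=0.5, thresholds=[-1.0, 1.0])
print([round(p, 3) for p in example])
```

With this sign convention, a larger `eta` moves probability mass toward the higher classes, while the cut points are shared across all observations.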
Analyzing Twitter Feeds to Facilitate Crises Informatics and Disaster Response During Mass Emergencies
It is common practice these days for the general public to use various micro-blogging platforms, predominantly Twitter, to share ideas, opinions and information about things and life. Twitter is also increasingly used as a popular source of information sharing during natural disasters and mass emergencies to update and communicate the extent of the geographic phenomena, report the affected population and casualties, request or provide volunteering services and share the status of the disaster recovery process initiated by humanitarian-aid and disaster-management organizations. Recent research in this area has affirmed the potential use of such social media data for various disaster response tasks. Even though social media data is massive, open and free, there is a significant limitation in making sense of this data because of its high volume, variety, velocity, value, variability and veracity. The current work provides a comprehensive framework of text processing and analysis performed on several thousand tweets shared on Twitter during natural disaster events. Specifically, this work employs state-of-the-art machine learning techniques from natural language processing on tweet content to process the enormous volume of data generated at the time of disasters. This study shall serve as a basis to provide useful actionable information to crisis management and mitigation teams in planning and preparing an effective disaster response and to facilitate the development of future automated systems for handling crisis situations.
Analysing spatial patterns of coastal cultural ecosystem services using Flickr and Wikipedia data
The world’s coasts provide many ecosystem services that benefit human well-being. Coastal ecosystems are particularly important as a third of humanity lives within 100 kilometres of the coast. They supply provisioning, regulating, supporting and cultural services to sustain human livelihood. Cultural Ecosystem Services (CES) are immaterial services related to recreational, inspirational and social values that humans can benefit from. CES are very abundant at the coast and have increasingly been subject to research in the past years. In order to properly manage and protect ecosystems, it is important to be aware of which types of CES are provided and where they are located. To study the geographies of these services at the coast and inland, laborious PPGIS field studies have been conducted, which are usually limited in spatial and temporal scale. In recent years it has become common to research CES through the analysis of Volunteered Geographic Data. The most popular source for these studies is the photo-sharing platform Flickr, as it offers easy and free access, contains abundant information about CES and allows users to geographically reference their content. The information conveyed by the image and its metadata (titles, tags, descriptions) has thus been used to differentiate and spatially analyse various types of CES.
Dependence on a single source of data introduces different kinds of bias that may misrepresent perspectives of certain demographics and their posting behaviour. Including a secondary data source for reference could help to complement the perspectives of Flickr data and visualise where these sources agree or disagree. Wikipedia offers a large repository of geolocated text data that could provide valuable information about CES across the landscape. This source has rarely been considered in research so far and thus this study aims to find out if Wikipedia is a suitable resource for spatial research on CES by comparing it to Flickr data.
The initial objective was to automatically classify Flickr posts with the CES class they represent, on a large scale along the entire East Coast of Britain. This was achieved using a Random Forest machine learning algorithm that uses the user-assigned tags to allot one of three CES classes selected for this study (Landscape Appreciation, Historical Monuments and Nature Appreciation). Using a limited set of variables, it was demonstrated that a fairly accurate prediction can be achieved with this method. The main objective of this study is to compare the information content relating to CES and the spatial patterns between the Flickr classification and the Wikipedia article data set. The results were mixed and contained some uncertainties regarding data quality. Matching terms and concepts could be found in Wikipedia and Flickr data and there was also some spatial overlap between the two data sources. There was also a correlation between Wikipedia articles containing relevant information for a CES and the number of related Flickr posts in the vicinity. However, the large differences in sample size and the ambiguous spatial representation of Wikipedia articles introduce significant uncertainty to the results.
As a secondary objective, the Flickr classification was visually assessed for significant spatial patterns and the observations compared against established literature. The spatial patterns agree with related research, as the data points are often concentrated close to accessible (cities, close to roads) and touristic places (e.g. castles, old towns). A large part of the posts are also located very close to the coastline, most strikingly the posts referring to landscapes. The co-occurrence of CES, so-called bundles, was studied as well by measuring the spatial correlation between the classes. A significant correlation was found between the posts of the classes Landscape Appreciation and Historical Monuments.
A Novel Hybrid Classification Approach for Sentiment Analysis of Text Document
Sentiment analysis is a popular and highly active research area in natural language processing. It assigns a negative or positive polarity to one or more entities using different natural language processing tools, and prior work has reported both high and low performance for various sentiment classifiers. Our approach focuses on the analysis of sentiment expressed in product reviews using original text search techniques. These reviews can be classified as having a positive or negative sentiment based on certain aspects in relation to a term-based query. In this paper, we chose to use two machine learning methods for classification: Support Vector Machines (SVM) and Random Forest, and we introduce a novel hybrid approach to classify product reviews from Amazon. This is useful for consumers who want to research the sentiment of products before purchase, or companies that want to monitor the public sentiment of their brands. The results show that the proposed method outperforms the individual classifiers on this Amazon dataset.