Evaluating existing manually constructed natural landscape classification with a machine learning-based approach
Some landscape classifications officially determine financial obligations; thus, they must be objective and precise. We presume it is possible to quantitatively evaluate existing manually constructed classifications and correct them if necessary. One option for achieving this goal is a machine learning method. By (re)modeling the landscape classification and explaining its structure, we can add quantitative evidence to its original (qualitative) description. The main objectives of the paper are to evaluate the consistency of an existing manually constructed natural landscape classification with a machine learning-based approach and to test a newly developed general black-box explanation method that explains variable importance for the differentiation between natural landscape types. The approach consists of training a model of the existing classification and applying a general method for explaining variable importance. As an example, we evaluated the existing natural landscape classification of Slovenia from 1998, which is still officially used in the agricultural taxation process. Our results showed that the modeled classification confirms the original with a high rate of agreement (94%). The complementary map of classification uncertainty (entropy) gave us more information on the areas where the classification should be checked, and the analysis of variable importance provided insight into the differentiation between types. Although the selection of exclusively climatic variables seemed unusual at first, we were able to understand the computer's logic and support the model with geographical explanations. We conclude that the approach can enhance the explanation and evaluation of natural landscape classifications and can be transparently transferred to other areas.
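The two quantities mentioned in this abstract, the rate of agreement between the original and the modeled classification and the per-cell classification uncertainty (entropy), can be illustrated with a minimal sketch. The abstract does not say how either was computed; the sketch below assumes a simple match fraction and the Shannon entropy of the model's predicted class probabilities, and the function names and toy inputs are hypothetical.

```python
import math


def agreement_rate(original, modeled):
    """Fraction of map cells where the modeled type equals the original type."""
    if len(original) != len(modeled):
        raise ValueError("both classifications must cover the same cells")
    if not original:
        raise ValueError("at least one cell is required")
    matches = sum(o == m for o, m in zip(original, modeled))
    return matches / len(original)


def shannon_entropy(probs):
    """Shannon entropy (in bits) of one cell's predicted class probabilities.

    Assumption: uncertainty is measured on the model's class probabilities.
    0 means the model is certain; log2(number of classes) is the maximum.
    """
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Toy data (hypothetical landscape types and probabilities).
original = ["alpine", "karst", "alpine", "plain"]
modeled = ["alpine", "karst", "plain", "plain"]
print(agreement_rate(original, modeled))

# A confident cell versus a cell split evenly between two types.
print(shannon_entropy([1.0, 0.0, 0.0]))
print(shannon_entropy([0.5, 0.5, 0.0]))
```

Cells with high entropy are the ones a map of this kind would flag for manual checking, since the model cannot clearly separate the candidate types there.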
Airbnb Valuation: A Machine Learning Approach
This thesis uses a geospatially enhanced machine learning approach to investigate variations in rental success on the peer-to-peer property sharing website Airbnb.com. Geographic factors, listing attributes and amenities, customer response metrics, and host attributes are included in decision tree modeling to predict the short-term probability of receiving a review. The variables that contribute most to model accuracy are assessed, and variations in the importance of these variables are investigated using Shapley values.
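For readers unfamiliar with Shapley values, the following minimal sketch computes them exactly for a small set of features by averaging each feature's marginal contribution over all subsets of the other features. It is not the thesis's implementation: the feature names, the weights, and the additive `toy_value` function are hypothetical stand-ins for a model's score on a feature subset, and exact enumeration is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, value):
    """Exact Shapley values for a value function defined on feature subsets.

    `value` takes a frozenset of feature names and returns a number,
    e.g. the score of a model that only uses those features.
    """
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):  # size of the subset that excludes f
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of f when added to subset s.
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


# Hypothetical additive value function: each feature adds a fixed amount.
WEIGHTS = {"location": 0.3, "amenities": 0.1, "host": 0.2}


def toy_value(subset):
    return sum(WEIGHTS[f] for f in subset)


phi = shapley_values(list(WEIGHTS), toy_value)
for name, contribution in phi.items():
    print(name, round(contribution, 6))
```

For a purely additive value function like this one, each feature's Shapley value equals its own weight, and the values always sum to the difference between the score with all features and the score with none.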
Data-Driven Solutions to Bottlenecks in Natural Language Generation
Concept-to-text generation suffers from what can be called generation bottlenecks: aspects of the generated text that should change for different subject domains, and that are usually hard to obtain or require manual work. Some examples are domain-specific content, a type system, a dictionary, discourse style and lexical style. These bottlenecks have stifled attempts to create generation systems that are generic, or at least apply to a wide range of domains in non-trivial applications.
This thesis consists of two parts. In the first, we propose data-driven solutions that automate obtaining the information and models required to solve some of these bottlenecks. Specifically, we present an approach to mining domain-specific paraphrasal templates from a simple text corpus; an approach to extracting a domain-specific taxonomic thesaurus from Wikipedia; and a novel document planning model which determines both ordering and discourse relations, and which can be extracted from a domain corpus. We evaluate each solution individually and independently from its ultimate use in generation, and show significant improvements in each.
In the second part of the thesis, we describe a framework for creating generation systems that rely on these solutions, as well as on hybrid concept-to-text and text-to-text generation, and which can be automatically adapted to any domain using only a domain-specific corpus. We illustrate the breadth of applications that this framework supports with three examples: biography generation and company description generation, which we use to evaluate the framework itself and the contribution of our solutions; and justification of machine learning predictions, a novel application which we evaluate in a task-based study to show its importance to users.