
    Air Quality Assessment from Social Media and Structured Data

    This paper describes our work on mining pollutant data to assess air quality in urban areas. Notable aspects of this work are that we mine social media and structured data in a domain-specific context, incorporate commonsense knowledge in mining media opinions and focus on the urban planning domain in a multicity environment. The results of mining are useful for predictive analysis in urbanization. A significant contribution is that we provide useful information on urban health impact

    Mining Social Media and Structured Data in Urban Environmental Management to Develop Smart Cities

    This research deployed data mining on social media and structured data in urban studies. As early work, we analyzed urban relocation, air quality and traffic parameters on multicity data. We applied the data mining techniques of association rules, clustering and classification to urban legislative history. Results showed that data mining could produce meaningful knowledge to support urban management. We treated ordinances (local laws) and the tweets about them as indicators to assess urban policy and public opinion. Hence, we conducted ordinance and tweet mining, including sentiment analysis of tweets. This part of the study focused on NYC, with the goal of assessing how well it is progressing towards becoming a smart city. We built domain-specific knowledge bases according to widely accepted smart city characteristics, incorporating commonsense knowledge sources for ordinance-tweet mapping. We developed decision support tools on multiple platforms using the knowledge discovered to guide urban management. Our research is a concrete step in harnessing the power of data mining in urban studies to enhance smart city development.

    Projecting land use changes using parcel-level data : model development and application to Hunterdon County, New Jersey

    This dissertation develops a parcel-based spatial land use change prediction model by coupling machine learning and interpretation algorithms such as cellular automata (CA) and decision trees (DT). A CA is a collection of cells that evolves through a number of discrete time steps according to a set of transition rules based on the state of each cell and the characteristics of its neighboring cells. A DT is a data mining and machine learning tool that extracts the patterns of a decision process from observed cell behaviors and the factors affecting them. In this dissertation, CA is used to predict the future land use status of cadastral parcels based on a set of transition rules that DT derives from a set of identified land use change driving factors. Although CA and DT have been applied separately in various land use change models in the literature, no studies have attempted to integrate them. The DT-based CA model developed in this dissertation represents the first such integration in land use change modeling. The coupled model can handle a large set of driving factors and avoids subjective bias when deriving the transition rules. The coupled model uses the cadastral parcel as the unit of analysis, which has practical policy implications because the responses of land use to various policies usually take place at the parcel level. Because parcels vary in size and shape, using them as the unit of analysis makes it difficult to apply CA, which was initially designed to handle regular grid cells. This dissertation improves the treatment of irregular cells in CA-based land use change models in the literature by defining a cell's neighborhood as a fixed-distance buffer along the parcel boundary. The DT-based CA model was developed and validated in Hunterdon County, New Jersey.
The data on historical land uses and various land use change driving factors for Hunterdon County were collected and processed using a Geographic Information System (GIS). Specifically, the county land uses in 1986, 1995 and 2002 were overlaid with a parcel map to create parcel-based land use maps. The single land use assigned to each parcel is based on a classification scheme developed through literature review and empirical testing in the study area. The possible land use status considered for each parcel is agriculture, barren land, forest, urban, water or wetlands, following the land use/land cover classification of the New Jersey Department of Environmental Protection. The identified driving factors for the future status of a parcel include the present land use type, the number of soil restrictions to urban development, the size of the parcel, the amount of wetlands within the parcel, the distribution of land uses in the neighborhood of the parcel, and the distances to the nearest streams, urban centers and major roads. A set of transition rules illustrating the land use change processes during the period 1986-1995 was developed using the DT software J48 Classifier. The derived transition rules were applied to the 1995 land use data in a CA model, Agent Analyst/RePast (Recursive Porous Agent Simulation Toolkit), to predict the spatial land use pattern in 2004, which was then validated against the actual land use map of 2002. The DT-based CA model had an overall accuracy of 84.46 percent in terms of the number of parcels and of 80.92 percent in terms of the total acreage in predicting land use changes. The model shows a much higher capacity for predicting the quantitative changes than the locational changes in land use. The validated model was applied to simulate the 2011 land use patterns in Hunterdon County based on its actual land uses in 2002 under both business-as-usual and policy scenarios.
The simulation results show that successfully implementing current land use policies such as down zoning, open space and farmland preservation would prevent a total of 7,053 acres (741 acres of wetlands, 3,034 acres of agricultural lands, 250 acres of barren land, and 3,028 acres of forest) from future urban development in Hunterdon County during the period 2002-2011. The neighborhood of a parcel was defined by a 475-foot buffer along the parcel boundary in the study. The results of sensitivity analyses using two additional neighborhoods (237- and 712-foot buffers) indicate that neighborhood size has an insignificant impact on the model outputs in this application.
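
The coupling described in this abstract, a CA whose cells are irregular parcels and whose transition rule has the form of a decision tree, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the parcel IDs, the precomputed neighbor lists (standing in for the buffer-based neighborhood), and the rule thresholds are hypothetical and are not taken from the dissertation.

```python
from typing import Dict, List

# Hypothetical inputs: land use per parcel, and for each parcel the list of
# parcels inside its buffer neighborhood (assumed precomputed, e.g. in a GIS).
LandUse = Dict[str, str]
Neighbors = Dict[str, List[str]]


def urban_share(parcel: str, state: LandUse, neighbors: Neighbors) -> float:
    """Fraction of a parcel's neighbors that are currently urban."""
    nbrs = neighbors.get(parcel, [])
    if not nbrs:
        return 0.0
    return sum(1 for n in nbrs if state[n] == "urban") / len(nbrs)


def transition(current: str, share: float, soil_restrictions: int) -> str:
    """Decision-tree-shaped rule. The thresholds are made up for illustration."""
    if current in ("urban", "water"):
        return current  # treated as stable in this sketch
    if share >= 0.5 and soil_restrictions <= 1:
        return "urban"
    return current


def step(state: LandUse, neighbors: Neighbors, soil: Dict[str, int]) -> LandUse:
    """One synchronous CA step: every parcel is updated from the old state."""
    return {
        p: transition(use, urban_share(p, state, neighbors), soil.get(p, 0))
        for p, use in state.items()
    }


state = {"A": "urban", "B": "agriculture", "C": "forest", "D": "urban"}
neighbors = {"A": ["B"], "B": ["A", "D"], "C": ["B"], "D": ["B"]}
soil = {"B": 0, "C": 3}
print(step(state, neighbors, soil))
```

In this toy run parcel B converts to urban because both of its neighbors are urban and it has no soil restrictions, while C stays forest because the update reads the old state, in which B is still agriculture.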

    Dynamic land use/cover change modelling

    Land use/cover change is a complex matter caused by numerous biophysical, socio-economic and economic factors. An obvious form of land use change in the suburbs of a metropolis is urban sprawl. A number of techniques exist to model this phenomenon. These models have been developed since the 1960s and are growing in number and popularity. Some of them fail to consider significant variables. The traditional methods (e.g. Cellular Automata, the Markov Chain Model, the CA-Markov Model, and the Logistic Regression Model) have inherent weaknesses in accounting for human activity in the environment. This matters because humans are the main actors in the transformation of the environment and shape the suburbs through their settlement preferences and lifestyle choices. The main aim of this thesis was to examine some of these traditional techniques in order to identify their advantages and disadvantages. These models were compared against each other to challenge their functionality. Although methodology for evaluating agent-based models is lacking, an attempt was made to create a self-calibrated agent-based model for the Tehran metropolitan area. Some of the variables that control urban sprawl in the study area in reality were extracted through expert knowledge and similar studies. Three main agents that deal with urban expansion were defined: developers, residents and government. Each agent affects some variables, i.e. the agents' decisions are influenced by a set of real variables. Agents' behaviours were coded in a GIS environment, and the predefined agents were then combined through a function to create a prototype for simulating land change. This geosimulation prototype can simulate the quantity and location of changes, specifically in the vicinity of the metropolis of Tehran. This customised agent-based model benefits from the strengths of traditional techniques: a Cellular Automata structure for change allocation, a Markov model for change quantity estimation and a weighting system to differentiate between the weights of the driving factors. A detailed discussion of the implementation of each methodology, and of its strengths and weaknesses, is then presented, and results are compared with reality to verify the models. In this research, we used only the functionalities available within GIS environments, and the additional required functions were coded in Python. This investigation will help urban planners and urban decision-makers to simulate cities and their expansion over time.
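
The abstract assigns the Markov model the job of estimating how much land changes class, leaving the question of where to the CA. A minimal sketch of that quantity step follows. It assumes the inputs are per-cell (earlier, later) land use pairs from two map dates. The function names and sample data are hypothetical and do not come from the thesis.

```python
from collections import Counter
from typing import Dict, List, Tuple

Matrix = Dict[str, Dict[str, float]]


def transition_matrix(pairs: List[Tuple[str, str]]) -> Matrix:
    """Estimate P(later class | earlier class) from observed cell pairs."""
    counts = Counter(pairs)
    totals = Counter(src for src, _ in pairs)
    matrix: Matrix = {}
    for (src, dst), n in counts.items():
        matrix.setdefault(src, {})[dst] = n / totals[src]
    return matrix


def project(quantities: Dict[str, float], matrix: Matrix) -> Dict[str, float]:
    """Project class totals one time step ahead.

    Classes missing from the matrix are assumed to stay unchanged.
    """
    out: Dict[str, float] = {}
    for src, amount in quantities.items():
        for dst, p in matrix.get(src, {src: 1.0}).items():
            out[dst] = out.get(dst, 0.0) + amount * p
    return out


# Hypothetical observations: 4 forest cells (one urbanised), 2 urban cells.
pairs = [("forest", "forest")] * 3 + [("forest", "urban")] + [("urban", "urban")] * 2
matrix = transition_matrix(pairs)
print(project({"forest": 100.0, "urban": 50.0}, matrix))
```

With these made-up observations a quarter of the forest converts per step, so 100 units of forest and 50 of urban project to 75 of each. The projection says nothing about which cells change, which is why the abstract pairs it with a CA for allocation.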

    Data mining of cellular automata's transition rules

    This paper presents a new method to discover knowledge for geographical cellular automata (CA) by using a data-mining technique. CA have the ability to simulate complex geographical phenomena. Very few studies have been carried out on how to determine and validate the transition rules of CA from observed data. The transition rules of traditional CA are usually expressed by mathematical equations. This paper demonstrates that the explicit transition rules of CA can be automatically reconstructed through the rule induction procedure of data mining. The explicit transition rules are more intuitive to decision-makers. The transition rules are obtained by applying data-mining techniques to spatial data. The proposed method can reduce the uncertainties in defining transition rules and help to generate more reliable simulation results. © 2004 Taylor & Francis Ltd.
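
To make the idea of inducing explicit transition rules from observed data concrete, here is a minimal sketch using OneR, a very simple rule-induction algorithm. OneR is a stand-in chosen for brevity and is not claimed to be the procedure used in the paper. The attribute names and sample cells are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Row = Dict[str, str]


def one_r(rows: List[Row], target: str) -> Tuple[str, Dict[str, str], int]:
    """Return (attribute, value -> predicted class, training errors).

    For each attribute, predict the majority class per attribute value and
    keep the attribute whose rule set makes the fewest training errors.
    """
    best = None
    for attr in (k for k in rows[0] if k != target):
        by_value: Dict[str, Counter] = defaultdict(Counter)
        for row in rows:
            by_value[row[attr]][row[target]] += 1
        rule = {v: c.most_common(1)[0][0] for v, c in by_value.items()}
        errors = sum(1 for row in rows if rule[row[attr]] != row[target])
        if best is None or errors < best[2]:
            best = (attr, rule, errors)
    return best


# Hypothetical observed cells: attributes at time t, outcome at time t+1.
cells = [
    {"near_road": "yes", "slope": "flat", "became_urban": "yes"},
    {"near_road": "yes", "slope": "steep", "became_urban": "yes"},
    {"near_road": "no", "slope": "flat", "became_urban": "no"},
    {"near_road": "no", "slope": "steep", "became_urban": "no"},
    {"near_road": "yes", "slope": "flat", "became_urban": "yes"},
    {"near_road": "no", "slope": "flat", "became_urban": "no"},
]
attr, rule, errors = one_r(cells, "became_urban")
print(f"IF {attr} = v THEN became_urban = {rule}[v]  (errors: {errors})")
```

The induced rule reads as an if-then statement over named attributes, which is the kind of explicit, human-readable form the abstract contrasts with transition rules expressed as equations.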
