2,101 research outputs found

    Classification Rules for Hotspot Occurrences Using Spatial Entropy-based Decision Tree Algorithm

    A forest fire is a condition in which a forest is affected by fire, leading to forest damage and potential harm to human life. Forest fire events can be monitored by satellite by detecting hotspots as fire indicators at certain times and locations. The purpose of this work is to develop a decision tree to predict hotspot occurrences in Bengkalis district, Riau province, Indonesia, using the spatial entropy-based decision tree algorithm. The data used are forest fire data for the Bengkalis area. The data include city centre, river, road, income source, land cover, population, precipitation, school, temperature, and wind speed. Under 5-fold cross-validation, the resulting decision trees have an average accuracy of 89.04% on the training set and 52.05% on the testing set. The tree has 560 nodes with the land cover layer as the root node. From the decision tree, as many as 255 rules were obtained to classify hotspot occurrences.
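    The abstract does not define the spatial entropy measure it uses. As a point of comparison only, the sketch below shows the conventional (non-spatial) Shannon entropy and information gain on which entropy-based decision trees choose a split. The function names and the toy land cover values are assumptions made for illustration and do not come from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(labels, attribute_values):
    """Reduction in entropy obtained by splitting labels on one attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(attribute_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the partitions produced by the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Toy data (hypothetical): four sample points with a class and a land cover type.
labels = ["hotspot", "hotspot", "non-hotspot", "non-hotspot"]
land_cover = ["peat", "peat", "forest", "forest"]
gain = information_gain(labels, land_cover)
```

    In this toy case the land cover attribute separates the two classes perfectly, so the gain equals the full entropy of the class labels. An attribute with the highest gain would be chosen as the root, which is the role the land cover layer plays in the tree described above.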

    Crime prediction and monitoring in Porto, Portugal, using machine learning, spatial and text analytics

    Crime is a common societal concern that affects quality of life and economic growth. Despite the global decrease in crime statistics, specific types of crime and feelings of insecurity have often increased, so safety and security agencies need to apply novel approaches and advanced systems to better predict and prevent occurrences. The use of geospatial technologies, combined with data mining and machine learning techniques, allows for significant advances in the criminology of place. In this study, official police data from Porto, Portugal, between 2016 and 2018 were georeferenced and processed using spatial analysis methods, which allowed the identification of spatial patterns and relevant hotspots. Machine learning processes were then applied for space-time pattern mining. Lasso regression analysis identified significant crime variables, with random forest and decision tree models supporting the selection of important variables. Lastly, tweets related to insecurity were collected, and topic modeling and sentiment analysis were performed. Together, these methods assist the interpretation of patterns, prediction and, ultimately, the performance of both police and planning professionals.

    Burn Area Processing to Generate False Alarm Data for Hotspot Prediction Models

    Developing hotspot prediction models using decision tree algorithms requires target classes into which the objects in a dataset are classified. In modeling hotspot occurrence, the target classes are the true class, representing hotspot occurrence, and the false class, indicating non-hotspot occurrence. This paper presents the results of satellite image processing to determine the radius of a hotspot, so that random points can be generated outside a hotspot buffer as false alarm data. Clustering and majority filtering were performed on a Landsat TM image to extract burn scars in the study area, Rokan Hilir, Riau Province, Indonesia. Calculation on burn areas and FIRMS MODIS fire/hotspot data for 2006 yields a hotspot radius of 0.90737 km. Non-hotspots were therefore randomly generated in areas located 0.90737 km away from a hotspot. Three decision tree algorithms, namely ID3, C4.5 and extended spatial ID3, were applied to a dataset containing 235 objects in the true class and 326 objects in the false class. The resulting decision trees for modeling hotspot occurrence have an accuracy of 49.02% for ID3, 65.24% for C4.5, and 71.66% for extended spatial ID3.
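    The false-class generation step described above can be sketched as rejection sampling. This is a minimal illustration, not the authors' procedure. It assumes planar coordinates in kilometres, a rectangular study area, and that "away from a hotspot" means at least the buffer radius from every hotspot. The function name, bounds and sample hotspots are hypothetical.

```python
import math
import random

HOTSPOT_RADIUS_KM = 0.90737  # hotspot radius reported in the abstract


def generate_non_hotspots(hotspots, bounds, n, radius_km=HOTSPOT_RADIUS_KM,
                          seed=0, max_tries=100_000):
    """Draw n random points in bounds lying outside every hotspot buffer.

    hotspots: list of (x, y) tuples in km (assumed projected coordinates)
    bounds:   (xmin, ymin, xmax, ymax) rectangle standing in for the study area
    """
    rng = random.Random(seed)
    xmin, ymin, xmax, ymax = bounds
    points = []
    tries = 0
    while len(points) < n:
        if tries >= max_tries:
            raise RuntimeError("could not place enough points outside the buffers")
        tries += 1
        x = rng.uniform(xmin, xmax)
        y = rng.uniform(ymin, ymax)
        # Keep the candidate only if it is outside the buffer of every hotspot.
        if all(math.hypot(x - hx, y - hy) >= radius_km for hx, hy in hotspots):
            points.append((x, y))
    return points


# Hypothetical hotspots inside a 10 km x 10 km area.
hotspots = [(5.0, 5.0), (2.0, 8.0)]
non_hotspots = generate_non_hotspots(hotspots, (0.0, 0.0, 10.0, 10.0), n=50)
```

    With real data, the rectangle would be replaced by the study area polygon and the distances computed in a projected coordinate system.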

    Classification model for hotspot occurrences using a decision tree method

    Forest fires in Indonesia mostly occur because of errors or bad intentions. This work demonstrates the application of a decision tree algorithm, namely the C4.5 algorithm, to develop a classification model from forest fire data in the Rokan Hilir district, Indonesia. The classification model is a collection of IF-THEN rules that can be used to predict hotspot occurrences for forest fires. The spatial data consist of the locations of hotspot occurrences and human activity factors, including the locations of city centres, road and river networks, and land cover types. The result was a decision tree containing 18 leaves and 26 nodes with an accuracy of 63.17%. Each leaf node holds positive and negative examples of hotspot occurrences, whereas the root and internal nodes contain attribute test conditions: the distance from the location of an example to the nearest road, river and city centre, and the land cover type of the area where the example is located. Positive examples are hotspot locations in the study area, and negative examples are randomly generated points within the area at least 1 km away from any positive example. The classification model categorizes whether a region is susceptible to hotspot occurrences or not. The model can be used to predict hotspot occurrences in new locations for fire prediction.
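    The attribute test conditions described above can be illustrated with a small sketch that derives distance features for one example and applies a single IF-THEN rule. The features are assumed to be given as point coordinates in kilometres. The rule, its 2 km threshold and the "plantation" land cover value are hypothetical placeholders, not rules from the paper.

```python
import math


def nearest_distance(point, features):
    """Distance from point to the closest feature (features as (x, y) points)."""
    px, py = point
    return min(math.hypot(px - fx, py - fy) for fx, fy in features)


def build_example(point, roads, rivers, city_centres, land_cover):
    """Assemble the attributes tested at the root and internal nodes."""
    return {
        "dist_road": nearest_distance(point, roads),
        "dist_river": nearest_distance(point, rivers),
        "dist_city": nearest_distance(point, city_centres),
        "land_cover": land_cover,
    }


def classify(example):
    """Apply one hypothetical IF-THEN rule; True means hotspot-prone."""
    # Hypothetical rule: IF land cover is plantation AND road is closer than 2 km.
    if example["land_cover"] == "plantation" and example["dist_road"] < 2.0:
        return True
    return False


example = build_example(
    (0.0, 0.0),
    roads=[(3.0, 4.0)],
    rivers=[(0.0, 1.0)],
    city_centres=[(6.0, 8.0)],
    land_cover="plantation",
)
prediction = classify(example)
```

    A tree with 18 leaves corresponds to 18 such rules, one per root-to-leaf path, each combining tests on these attributes.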

    A Decision Tree Based on Spatial Relationships for Predicting Hotspots in Peatlands

    Predicting hotspot occurrence as an indicator of forest and land fires is essential in developing an early warning system for fire prevention. This work applied a spatial decision tree algorithm to spatial data on forest fires. The algorithm improves on the conventional decision tree algorithm by including distance and topological relationships when growing spatial decision trees. The spatial data consist of a target layer and ten explanatory layers representing physical, weather, socio-economic and peatland characteristics in the study area, Rokan Hilir District, Indonesia. The target objects are hotspots from 2008 and non-hotspot points. The result is a pruned spatial decision tree with 122 leaves and an accuracy of 71.66%. The spatial tree produces higher accuracy than the non-spatial trees created using the ID3 and C4.5 algorithms. The ID3 decision tree has an accuracy of 49.02%, while the accuracy of the C4.5 decision tree reaches 65.24%.

    Comparative Analysis of Spatial Decision Tree Algorithms for Burned Area of Peatland in Rokan Hilir Riau

    Over a one-year period (March 2013 – March 2014), 58 percent of all detected hotspots in Indonesia were found in Riau Province. According to the data, Rokan Hilir had the greatest number of hotspots, and about 75% of hotspot alerts occurred in peatland areas. This study applied spatial decision tree algorithms to classify the classes before burned, burned, and after burned from remotely sensed data of the peatland area in the Kubu and Pasir Limau Kapas subdistricts, Rokan Hilir, Riau. The decision tree algorithm based on spatial autocorrelation incorporates the Neighborhood Split Autocorrelation Ratio (NSAR) into the information gain of the CART algorithm. This spatial decision tree classification method is compared with conventional decision tree algorithms, namely Classification and Regression Trees (CART), C5.0, and C4.5. The experimental results showed that the C5.0 algorithm generated the most accurate classifier, with an accuracy of 99.79%. The spatial decision tree algorithm successfully improved the accuracy of the CART algorithm.

    Plataforma integrada de dados de acidentes de viação para suporte a processos de aprendizagem automática

    Integrated road accident data platform to support machine learning techniques. Traffic accidents are one of the most important concerns worldwide, since they result in numerous casualties, injuries, and fatalities each year, as well as significant economic losses. Many factors are responsible for causing road accidents. If these factors can be better understood and predicted, it might be possible to take measures to mitigate the damage and its severity. The purpose of this dissertation is to identify these factors using accident data from 2016 to 2019 from the district of Setúbal, Portugal. This work aims to develop models that can select a set of influential factors that may be used to classify the severity of an accident, supporting an analysis of the accident data. In addition, this study proposes a predictive model for future road accidents based on past data. Various machine learning approaches are used to create these models. Supervised machine learning methods such as decision trees (DT), random forests (RF), logistic regression (LR) and naive Bayes (NB) are used, as well as unsupervised machine learning techniques including DBSCAN and hierarchical clustering. Results show that a rule-based model using the C5.0 algorithm is capable of accurately detecting the most relevant factors describing road accident severity. Furthermore, the results of the predictive model suggest that the RF model could be a useful tool for forecasting accident hotspots.

    Predictive analytics applied to firefighter response, a practical approach

    Time is a crucial factor for the outcome of emergencies, especially those that involve human lives. This paper looks at Lisbon firefighters' occurrences and presents a model, based on city characteristics and climatic data, to predict whether there will be an occurrence at a certain location, according to the weather forecasts. Three algorithms were considered in this study: Logistic Regression, Decision Tree and Random Forest. Measured by the AUC, the best-performing model was a random forest with random under-sampling, at 0.68. This model was well adjusted across the city and showed that precipitation and the size of the subsection are the most relevant features in predicting firefighters' occurrences. The work presented here has clear implications for the firefighters' decision-making regarding vehicle allocation, as they can now make an informed decision considering the predicted occurrences.
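    Random under-sampling, mentioned above as part of the best-performing model, balances the classes by discarding majority-class examples at random before training. The sketch below is a generic standard-library illustration under that definition. The function name and the toy class ratio are assumptions, and the paper's actual sampling settings are not given in the abstract.

```python
import random
from collections import defaultdict


def random_undersample(rows, labels, seed=0):
    """Randomly drop examples so every class keeps as many rows as the rarest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    # Size of the smallest class sets the target size for all classes.
    n_min = min(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        for row in rng.sample(group, n_min):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy data (hypothetical): 10 locations with an occurrence, 90 without.
rows = list(range(100))
labels = [1] * 10 + [0] * 90
balanced_rows, balanced_labels = random_undersample(rows, labels)
```

    Under-sampling would be applied to the training split only, so that the test set keeps the original class balance when the AUC is measured.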