Parallel In-Memory Evaluation of Spatial Joins
The spatial join is a popular operation in spatial database systems and its
evaluation is a well-studied problem. As main memories become bigger and faster
and commodity hardware supports parallel processing, there is a need to revamp
classic join algorithms which have been designed for I/O-bound processing. In
view of this, we study the in-memory and parallel evaluation of spatial joins,
by re-designing a classic partitioning-based algorithm to consider alternative
approaches for space partitioning. Our study shows that, compared to a
straightforward implementation of the algorithm, our tuning can improve
performance significantly. We also show how to select appropriate partitioning
parameters based on data statistics, in order to tune the algorithm for the
given join inputs. Our parallel implementation scales gracefully with the
number of threads, reducing the cost of the join to at most one second even for
join inputs with tens of millions of rectangles. Comment: Extended version of the SIGSPATIAL'19 paper under the same title
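The partitioning-based evaluation described above can be illustrated with a minimal single-threaded sketch. It assumes a uniform grid and uses the reference-point technique to avoid reporting a pair twice; neither detail is stated in the abstract, and the names `partition`, `grid_join`, and `cell` are hypothetical. Because each grid cell is processed independently, the per-cell loop is the natural unit to hand to separate threads.

```python
from collections import defaultdict
from itertools import product


def intersects(a, b):
    """Rectangles are (xmin, ymin, xmax, ymax); touching edges count as intersecting."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def partition(rects, cell):
    """Assign each rectangle index to every grid cell that the rectangle overlaps."""
    grid = defaultdict(list)
    for i, r in enumerate(rects):
        cx0, cy0 = int(r[0] // cell), int(r[1] // cell)
        cx1, cy1 = int(r[2] // cell), int(r[3] // cell)
        for key in product(range(cx0, cx1 + 1), range(cy0, cy1 + 1)):
            grid[key].append(i)
    return grid


def grid_join(R, S, cell):
    """Return sorted index pairs (i, j) such that R[i] intersects S[j]."""
    grid_r, grid_s = partition(R, cell), partition(S, cell)
    result = []
    # Only cells that hold rectangles from both inputs can produce results.
    for key in grid_r.keys() & grid_s.keys():
        for i in grid_r[key]:
            for j in grid_s[key]:
                a, b = R[i], S[j]
                if not intersects(a, b):
                    continue
                # Reference point: lower-left corner of the intersection area.
                # The pair is reported only in the cell containing that point.
                px, py = max(a[0], b[0]), max(a[1], b[1])
                if (int(px // cell), int(py // cell)) == key:
                    result.append((i, j))
    return sorted(result)
```

The cell size is the tuning knob in this sketch: smaller cells mean fewer comparisons per cell but more replicated rectangles.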
SMoT+: Extending the SMoT Algorithm for Discovering Stops in Nested Sites
Several methods have been proposed to analyse trajectory data. However, few of these methods consider the relations between trajectories and relevant features of the geographic space. One of the best-known methods that takes into account the geographical regions crossed by a trajectory is the SMoT algorithm. Nevertheless, SMoT considers only disjoint geographic regions that a trajectory may traverse, while many regions of interest are contained in other regions. In this article, we extend the SMoT algorithm to discover stops in nested regions. The proposed algorithm, called SMoT+, takes advantage of information about the hierarchy of nested regions to efficiently discover the stops in regions at different levels of this hierarchy. Experiments with real data show that SMoT+ detects stops in nested regions that are not detected by the original SMoT algorithm, with only a minor increase in processing time.
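To make the notion of a stop concrete, the sketch below treats a stop as a maximal run of consecutive trajectory points inside a region whose duration reaches that region's minimum time. It is a simplification under stated assumptions: regions are axis-aligned rectangles, each region is scanned independently, and the region hierarchy that SMoT+ exploits for efficiency is not modelled. The function name `find_stops` and the data layout are hypothetical.

```python
def find_stops(traj, regions):
    """Find stops of a trajectory in (possibly nested) rectangular regions.

    traj:    list of (t, x, y) tuples sorted by time t.
    regions: dict name -> (xmin, ymin, xmax, ymax, min_duration).
    Returns a list of (region_name, enter_time, leave_time).
    """
    stops = []
    for name, (x0, y0, x1, y1, min_dur) in regions.items():
        start = last = None
        # A trailing sentinel closes a run that lasts until the final point.
        for t, x, y in list(traj) + [(None, None, None)]:
            inside = t is not None and x0 <= x <= x1 and y0 <= y <= y1
            if inside:
                if start is None:
                    start = t
                last = t
            elif start is not None:
                if last - start >= min_dur:
                    stops.append((name, start, last))
                start = None
    return stops
```

With a region "shop" nested inside a region "mall", a single visit can yield one stop at each level, which is the case disjoint-region methods miss.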
Periodic Pattern Mining - Algorithms and Applications
Owing to a large number of applications, periodic pattern mining has been extensively studied for over a decade. A periodic pattern is a pattern that repeats itself with a specific period in a given sequence. Periodic patterns can be mined from datasets such as biological sequences, continuous and discrete time series data, spatiotemporal data, and social networks. Periodic patterns are classified based on different criteria. Based on frequency of occurrence, they are categorized as frequent periodic patterns and statistically significant patterns. Frequent periodic patterns are in turn classified as perfect and imperfect periodic patterns, full and partial periodic patterns, synchronous and asynchronous periodic patterns, dense periodic patterns, and approximate periodic patterns. This paper presents a survey of state-of-the-art research on periodic pattern mining algorithms and their application areas. A discussion of the merits and demerits of these algorithms is given. The paper also presents a brief overview of algorithms that can be applied to specific types of datasets, such as spatiotemporal data and social networks.
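As a small illustration of the perfect versus imperfect distinction, the sketch below checks, for a fixed period, how consistently one symbol recurs at each offset of a sequence. It is not one of the surveyed algorithms; the function name `periodic_symbols` and the confidence threshold `min_conf` are assumptions introduced for the example. A confidence of 1.0 corresponds to a perfect periodic symbol, and anything lower to an imperfect one.

```python
from collections import Counter


def periodic_symbols(seq, period, min_conf):
    """Return {offset: (symbol, confidence)} for offsets whose most common
    symbol appears in at least min_conf of the positions offset, offset + period, ..."""
    found = {}
    for offset in range(period):
        column = seq[offset::period]  # every period-th element starting at offset
        if not column:
            continue
        symbol, count = Counter(column).most_common(1)[0]
        confidence = count / len(column)
        if confidence >= min_conf:
            found[offset] = (symbol, confidence)
    return found
```

For example, in the sequence `"abcabdabcabe"` with period 3, the symbols at offsets 0 and 1 repeat perfectly, while offset 2 does not reach a 0.75 threshold.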
Trajectory-Based Spatiotemporal Entity Linking
Trajectory-based spatiotemporal entity linking is to match the same moving
object in different datasets based on their movement traces. It is a
fundamental step to support spatiotemporal data integration and analysis. In
this paper, we study the problem of spatiotemporal entity linking using
effective and concise signatures extracted from their trajectories. This
linking problem is formalized as a k-nearest neighbor (k-NN) query on the
signatures. Four representation strategies (sequential, temporal, spatial, and
spatiotemporal) and two quantitative criteria (commonality and unicity) are
investigated for signature construction. A simple yet effective dimension
reduction strategy is developed together with a novel indexing structure called
the WR-tree to speed up the search. A number of optimization methods are
proposed to improve the accuracy and robustness of the linking. Our extensive
experiments on real-world datasets verify the superiority of our approach over
the state-of-the-art solutions in terms of both accuracy and efficiency. Comment: 15 pages, 3 figures, 15 tables
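The formulation of linking as a k-NN query over signatures can be sketched as follows. The signature used here (the set of most frequently visited grid cells) and the Jaccard similarity are illustrative assumptions, not the representation strategies or criteria of the paper, and the search is a brute-force scan rather than the WR-tree index. All function names are hypothetical.

```python
import heapq
from collections import Counter


def signature(points, cell, m):
    """Spatial signature: the m grid cells visited most often by (x, y) points."""
    counts = Counter((int(x // cell), int(y // cell)) for x, y in points)
    return {c for c, _ in counts.most_common(m)}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def link(query_sig, candidates, k):
    """Return the ids of the k candidates whose signatures are most similar.

    candidates: dict mapping object id -> signature set.
    """
    return heapq.nlargest(k, candidates, key=lambda oid: jaccard(query_sig, candidates[oid]))
```

A query object from one dataset is linked to the top-ranked candidates from the other; an index would replace the linear scan in `link` for large candidate sets.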
T-profiles: a method for inferring socio-demographic profiles from trajectories
Dissertation (Master's) - Universidade Federal de Santa Catarina, Centro Tecnológico, Programa de Pós-Graduação em Ciência da Computação, Florianópolis, 2015. Abstract: Knowledge about the people living in a city or country has great value for public administrations as well as for enterprises. Knowing the population profile may help urban planners, public transportation administrators, government services, or companies in many ways, for example to decide if and where to install a new store or to personalize an advertisement for a given audience. The usual approach to population demographic analysis is to segment the population into socio-demographic profiles, such as age, occupation, marital status, or monthly income. Most attempts to discover and measure population profiles rely on human surveys; the best-known example is the socio-demographic census with activity diaries, collected periodically through door-to-door interviews in many countries.
However, the main drawbacks of census data are that they: 1) are not up to date or precise, since they are usually collected every 5 - 10 years; 2) are expensive to collect, and cover only a small, although statistically significant, part of the population for a short period of time; 3) do not capture the complete activities of the individuals, but only the activities performed during one day, as reported by the respondent during the interview. We believe that nowadays much knowledge about people's real behavior can be inferred from their everyday movement, since large amounts of movement data are available, such as mobile phone data, social network data, and GPS data. In this thesis we propose a method to extract socio-demographic profiles from trajectories of moving objects, and make the following contributions: (i) we propose a general profile model to represent socio-demographic profiles of people, such as worker, student, unemployed, etc.; (ii) we propose a moving object history model to represent the daily movement of the object; (iii) we propose similarity functions for matching the history against the profile model; and (iv) we propose an algorithm called T-Profiles that compares the profile model and the history model in order to infer the socio-demographic profile of a moving object from its trajectories. We validate T-Profiles with real trajectory data, obtaining about 90% precision.
Predictive Modeling of Fuel Efficiency of Trucks
This research studied the behavior of several controllable variables that affect the fuel efficiency of trucks. Re-routing is the process of modifying the parameters of the routes for a set of trips to optimize fuel consumption and to increase customer satisfaction through efficient deliveries. It is an important process undertaken by a food distribution company to adapt its trips to immediate necessities. A predictive model was developed to calculate the change in miles per gallon (MPG) whenever a re-route is performed on a region of a particular distribution area. The data used came from the Dallas center, one of the distribution centers owned by the company. A consistent model that could provide relatively accurate predictions across five distribution centers had to be developed. It was found that the model built using the data from the Corporate center was the most consistent one. The data used to build the model spanned May 2013 through December 2013. For about 88% of the data used, the model's predictions fell within the 0-10% error group. This was an improvement over the 43% obtained for the linear regression and K-means clustering models. The model was also validated on the data for January 2014 through the first two weeks of March 2014, where about 81% of the predictions fell within the 0-10% error group. The average overall error was around 10%, the lowest among the approaches explored in this research. Weight, stop count, and stop time were identified as the most significant factors influencing the fuel efficiency of the trucks. Further, a neural network architecture was built to improve the predictions of the MPG. The model can be used to predict the average change in MPG for a set of trips whenever a re-route is performed. Since the aim of re-routing is to reduce the miles and trips, extra load will be added to the remaining trips.
Although the MPG would decrease because of this extra load, the decrease would be offset by the savings due to the drop in miles and trips. The net fuel savings can then be translated into the amount of money saved.
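The "0-10% error group" figures quoted above can be read as the share of predictions whose relative error does not exceed 10%. The sketch below computes that share under this interpretation, which is an assumption; the function name `share_within_error` and the sample values are hypothetical and are not data from the study.

```python
def share_within_error(actual, predicted, max_rel_error=0.10):
    """Fraction of predictions whose relative error |p - a| / a is at most max_rel_error."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and of equal length")
    hits = sum(abs(p - a) / a <= max_rel_error for a, p in zip(actual, predicted))
    return hits / len(actual)


# Hypothetical MPG values for four trips, for illustration only.
actual_mpg = [6.0, 5.0, 8.0, 7.0]
predicted_mpg = [6.3, 5.6, 8.0, 6.0]
share = share_within_error(actual_mpg, predicted_mpg)
```

Here two of the four predictions fall within 10% of the actual value, so `share` is 0.5.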