
    On Semantic Caching and Query Scheduling for Mobile Nearest-Neighbor Search

    Location-based services have received increasing attention in recent years. In this paper, we address the performance issues of mobile nearest-neighbor search, in which the mobile user issues a query to retrieve stationary service objects nearest to him/her. An index based on the Voronoi diagram is used in the server to support such a search, while a semantic cache is proposed to enhance the access efficiency of the service. Cache replacement policies tailored for the proposed semantic cache are examined. Moreover, several query scheduling policies are proposed to address the inter-cell roaming issues in multi-cell environments. Simulations are conducted to evaluate the proposed methods. The results show that the system performance, in terms of cache hit ratio, query response time, cell-cross number and cell-recross number, is improved significantly.
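
The caching idea in this abstract can be illustrated with a minimal sketch. It assumes, hypothetically, that each cache entry stores a service object together with its Voronoi cell as a convex polygon (vertices in counter-clockwise order), so that any query point inside the cell can be answered locally. The names `SemanticCache` and `inside_convex` are illustrative, and plain least-recently-used eviction stands in for the tailored replacement policies the paper examines.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def inside_convex(cell: List[Point], q: Point) -> bool:
    """Return True if q lies inside (or on the border of) a convex polygon
    whose vertices are listed in counter-clockwise order."""
    n = len(cell)
    for i in range(n):
        ax, ay = cell[i]
        bx, by = cell[(i + 1) % n]
        # A negative cross product means q is to the right of edge a->b,
        # which is outside a counter-clockwise polygon.
        if (bx - ax) * (q[1] - ay) - (by - ay) * (q[0] - ax) < 0:
            return False
    return True


class SemanticCache:
    """Hypothetical client-side cache: each entry pairs a service object
    with the Voronoi cell in which that object is the nearest neighbor."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        # Entries are kept in order from least to most recently used.
        self.entries: List[Tuple[str, List[Point]]] = []

    def lookup(self, q: Point) -> Optional[str]:
        for i, (obj, cell) in enumerate(self.entries):
            if inside_convex(cell, q):
                # Cache hit: move the entry to the most-recent position.
                self.entries.append(self.entries.pop(i))
                return obj
        return None  # cache miss: the query must go to the server

    def insert(self, obj: str, cell: List[Point]) -> None:
        if len(self.entries) >= self.capacity:
            self.entries.pop(0)  # evict the least recently used entry
        self.entries.append((obj, cell))
```

A miss (`lookup` returning `None`) corresponds to a query that must be sent to the server, whose answer and cell would then be passed to `insert`.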

    UV-Diagram: A Voronoi Diagram for Uncertain Spatial Databases


    Machine learning algorithms applied to intelligent buildings with a high penetration of electric vehicles

    This dissertation develops an occupancy prediction method for two residential car parks in the context of an intelligent building, with the aim of knowing the occupancy rate of these car parks in advance. To that end, it uses realistic historical data collected by empirical observation and extrapolated to one year. The prediction model uses machine learning techniques, and several algorithms were tested (Decision Tree, Extra Tree, Logistic Regression, Random Forest, Naive Bayes, K-Nearest Neighbors, and Support Vector Machine) to identify which performs best. Several types of models were tested in order to improve the results and to understand the impact of each data treatment used. The final solution was validated with good evaluation metrics: accuracy and precision were both above 80%, and the solution proved effective for the data analyzed and the temporal horizon of the forecast.
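
For readers unfamiliar with the two metrics quoted in this abstract, the following is a minimal sketch of how accuracy and precision are computed for a binary classifier. The labels are toy values invented for illustration and are not taken from the dissertation.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of all predictions that match the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def precision(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Fraction of positive predictions that are actually positive."""
    predicted_pos = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not predicted_pos:
        return 0.0  # no positive predictions were made
    return sum(1 for t in predicted_pos if t == positive) / len(predicted_pos)


# Toy labels (hypothetical): 1 could stand for "space occupied".
y_true = [1, 1, 0, 0, 1, 0, 1, 1, 0, 1]
y_pred = [1, 1, 0, 1, 1, 0, 1, 0, 0, 1]
acc = accuracy(y_true, y_pred)    # 8 of 10 predictions are correct
prec = precision(y_true, y_pred)  # 5 of 6 positive predictions are correct
```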

    High-dimensional indexing methods utilizing clustering and dimensionality reduction

    The emergence of novel database applications has resulted in the prevalence of a new paradigm for similarity search. These applications include multimedia databases, medical imaging databases, time series databases, DNA and protein sequence databases, and many others. Features of data objects are extracted and transformed into high-dimensional data points. Searching for objects becomes a search on points in the high-dimensional feature space. The dissimilarity between two objects is determined by the distance between their feature vectors. Similarity search is usually implemented as nearest neighbor search in feature vector spaces. The cost of processing k-nearest neighbor (k-NN) queries via a sequential scan increases with the number of objects and the number of features. A variety of multi-dimensional index structures have been proposed to improve the efficiency of k-NN query processing; they work well in low-dimensional space but lose their efficiency in high-dimensional space due to the curse of dimensionality. This inefficiency is addressed in this study by three methods: Clustering and Singular Value Decomposition (CSVD) with indexing, the Persistent Main Memory (PMM) index, and the Stepwise Dimensionality Increasing (SDI-tree) index. CSVD is an approximate nearest neighbor search method. The performance of CSVD with indexing is studied, and its approximation of distances in the original space is investigated. For a given Normalized Mean Square Error (NMSE), the higher the degree of clustering, the higher the recall. However, more clusters require more disk page accesses. A suitable number of clusters can be chosen to achieve higher recall while keeping the query processing cost relatively low.
The Clustering and Indexing using Persistent Main Memory (CIPMM) framework is motivated by the following considerations: (a) a significant fraction of index pages are accessed randomly, incurring a high positioning time for each access; (b) disk transfer rate is improving by 40% annually, while positioning time is improving by only 8%; (c) query processing incurs less CPU time for main-memory-resident indices than for disk-resident ones. CIPMM aims at reducing the elapsed time for query processing by utilizing sequential, rather than random, disk accesses. A specific instance of the CIPMM framework, CIPOP, which indexes using the Persistent Ordered Partition tree (OP-tree), is elaborated and compared with CISR, clustering and indexing using the SR-tree. The results show that CIPOP outperforms CISR, and that the performance gain grows with dimensionality. The SDI-tree index is motivated by two observations: fanout decreases as dimensionality increases, and shorter vectors reduce cache misses. The index is built from feature vectors transformed via principal component analysis, resulting in a structure with fewer dimensions at higher levels and more dimensions at each successive level. Dimensions are retained in nonincreasing order of their variance according to a parameter p, which specifies the incremental fraction of variance at each level of the index. Experiments on three datasets show that SDI-trees with carefully tuned parameters incur fewer disk accesses than SR-trees and VAMSR-trees, and less CPU time than VA-Files.
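
The role of the parameter p can be illustrated with a small sketch. It rests on one plausible reading of the abstract, not on the thesis's actual algorithm: level k of the index keeps the smallest number of leading principal components whose cumulative variance reaches a fraction min(1, k·p) of the total. The function name `dims_per_level` and the sample variances are hypothetical.

```python
from typing import List, Sequence


def dims_per_level(variances: Sequence[float], p: float) -> List[int]:
    """Number of dimensions kept at each index level, top level first.

    Assumption: level k keeps the fewest leading dimensions (by variance)
    whose cumulative variance fraction is at least min(1, k * p).
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    ordered = sorted(variances, reverse=True)  # nonincreasing variance
    total = sum(ordered)
    if total <= 0.0:
        raise ValueError("total variance must be positive")

    eps = 1e-12  # tolerance for floating-point rounding in k * p
    levels: List[int] = []
    k = 1
    while True:
        target = min(1.0, k * p)
        cumulative = 0.0
        count = 0
        for v in ordered:
            if cumulative / total >= target - eps:
                break
            cumulative += v
            count += 1
        levels.append(count)
        # Stop once every dimension is in use or the full variance is covered.
        if count == len(ordered) or target >= 1.0:
            break
        k += 1
    return levels


# Hypothetical per-component variances after PCA.
levels = dims_per_level([5.0, 3.0, 1.0, 1.0], p=0.3)
```

Under this reading, a smaller p yields more levels with fewer dimensions near the root, which is consistent with the abstract's point that shorter vectors at higher levels allow larger fanouts.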