385 research outputs found

    Symbolic and Visual Retrieval of Mathematical Notation using Formula Graph Symbol Pair Matching and Structural Alignment

    Large data collections containing millions of math formulae in different formats are available online. Retrieving math expressions from these collections is challenging. We propose a framework for retrieval of mathematical notation using symbol pairs extracted from visual and semantic representations of mathematical expressions, first in the symbolic domain for retrieval from text documents. We further adapt our model for retrieval of mathematical notation in images and lecture videos. Graph-based representations are used in each modality to describe math formulas. For symbolic formula retrieval, where the structure is known, we use symbol layout trees and operator trees. For image-based formula retrieval, since the structure is unknown, we use a more general Line of Sight graph representation. Paths in these graphs define symbol-pair tuples that are used as the entries of our inverted index of mathematical notation. Our retrieval framework uses a three-stage approach: a fast selection of candidates in the first stage; a more detailed matching algorithm with similarity metric computation in the second stage; and, when relevance assessments are available, an optional third stage that uses linear regression over multiple similarity scores to estimate relevance for final re-ranking. Our model has been evaluated using large collections of documents, and preliminary results are presented for videos and cross-modal search. The proposed framework can be adapted to other domains, such as chemistry or technical diagrams, where two visually similar elements from a collection are usually related to each other.
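
    The symbol-pair inverted index and the first-stage candidate selection described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tree encoding, the edge labels ("n" for next on the baseline, "a" for above), and the function names `symbol_pairs`, `build_index`, and `candidates` are all assumptions made for illustration, and ranking by raw count of shared pairs stands in for whatever scoring the paper actually uses.

```python
from collections import Counter, defaultdict

# Hypothetical encoding of a symbol layout tree:
#   node = (symbol, [(edge_label, child_node), ...])
# Edge labels are illustrative: "n" = next on baseline, "a" = above (superscript).

def symbol_pairs(tree):
    """Return every (ancestor, descendant, path) tuple found along tree paths."""
    pairs = []

    def walk(node, ancestors):
        symbol, children = node
        # Each ancestor forms one pair with the current symbol.
        for anc_symbol, path in ancestors:
            pairs.append((anc_symbol, symbol, path))
        for label, child in children:
            extended = [(a, p + label) for a, p in ancestors]
            extended.append((symbol, label))
            walk(child, extended)

    walk(tree, [])
    return pairs

def build_index(formulas):
    """Map each symbol-pair tuple to the set of formula ids containing it."""
    index = defaultdict(set)
    for formula_id, tree in formulas.items():
        for pair in symbol_pairs(tree):
            index[pair].add(formula_id)
    return index

def candidates(index, query_tree):
    """First-stage selection: count how many query pairs each formula shares."""
    counts = Counter()
    for pair in set(symbol_pairs(query_tree)):
        for formula_id in index.get(pair, ()):
            counts[formula_id] += 1
    return counts

# Toy collection (assumed data): x^2 + y, x + y, a^2
x2_plus_y = ("x", [("a", ("2", [])), ("n", ("+", [("n", ("y", []))]))])
x_plus_y = ("x", [("n", ("+", [("n", ("y", []))]))])
a2 = ("a", [("a", ("2", []))])

index = build_index({"f1": x2_plus_y, "f2": x_plus_y, "f3": a2})
query = ("x", [("a", ("2", []))])  # the query x^2
hits = candidates(index, query)
```

    In this toy collection only `f1` shares the pair `("x", "2", "a")` with the query, so it is the sole candidate handed to a later, more detailed matching stage.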

    Indexing Temporal XML documents

    Desirable properties for XML update mechanisms

    The adoption of XML as the default data interchange format and the standardisation of the XPath and XQuery languages have resulted in significant research in the development and implementation of XML databases capable of processing queries efficiently. The ever-increasing deployment of XML in industry and the real-world requirement to support efficient updates to XML documents have more recently prompted research in dynamic XML labelling schemes. In this paper, we provide an overview of the recent research in dynamic XML labelling schemes. Our motivation is to define a set of properties that represent a more holistic dynamic labelling scheme, and we present our findings through an evaluation matrix covering most of the existing schemes that provide update functionality.

    SPST-Index : a self pruning splay tree index for database cracking

    Advisor: Prof. Dr. Eduardo Cunha de Almeida. Dissertation (master's) - Universidade Federal do Paraná, Setor de Ciências Exatas, Programa de Pós-Graduação em Informática. Defense: Curitiba, 24/02/2017. Includes references: leaves 41-43. Area of concentration: Computer science.
    Abstract: In database cracking, a database column physically self-organizes into cracked partitions, and a cracker index is then created to speed up access to these partitions. The AVL Tree is the current data structure of choice to implement cracker indices. However, it is particularly cache-inefficient for range queries, because the nodes accessed only a few times (i.e., "Cold Data") and the most accessed ones (i.e., "Hot Data") are spread all over the index. This work presents the Self-Pruning Splay Tree (SPST), a data structure that indexes database cracking and reorganizes "Hot Data" and "Cold Data" to speed up access to the most accessed cracked partitions. For every range query, the SPST rotates to the root the nodes pointing to the edges and to the middle value of the predicate interval. Eventually, the most accessed tree nodes remain close to the root, improving CPU and cache activity. The least accessed tree nodes, on the other hand, remain close to the leaves and are pruned to clean up unused data, which reduces the storage footprint of the index and lowers lookup and update costs. Keywords: Database Cracking, Cracker Index, Splay Tree.
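
    The rotate-to-root behaviour described in this abstract can be sketched with a textbook splay operation. This is an illustration under stated assumptions, not the dissertation's SPST: the names `Node`, `splay`, `insert`, and `touch_range` are hypothetical, the order in which the lower bound, middle value, and upper bound are splayed is assumed, and the pruning of cold leaves is left out entirely.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def rotate_right(x):
    y = x.left
    x.left = y.right
    y.right = x
    return y

def rotate_left(x):
    y = x.right
    x.right = y.left
    y.left = x
    return y

def splay(root, key):
    """Bring the node holding key (or the last node reached) to the root."""
    if root is None or root.key == key:
        return root
    if key < root.key:
        if root.left is None:
            return root
        if key < root.left.key:            # zig-zig
            root.left.left = splay(root.left.left, key)
            root = rotate_right(root)
        elif key > root.left.key:          # zig-zag
            root.left.right = splay(root.left.right, key)
            if root.left.right is not None:
                root.left = rotate_left(root.left)
        return root if root.left is None else rotate_right(root)
    if root.right is None:
        return root
    if key > root.right.key:               # zig-zig (mirror)
        root.right.right = splay(root.right.right, key)
        root = rotate_left(root)
    elif key < root.right.key:             # zig-zag (mirror)
        root.right.left = splay(root.right.left, key)
        if root.right.left is not None:
            root.right = rotate_right(root.right)
    return root if root.right is None else rotate_left(root)

def insert(root, key):
    """Plain binary-search-tree insert, used here only to build a test tree."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root

def touch_range(root, low, high):
    """Splay the lower bound, the middle value, and the upper bound (assumed order)."""
    for key in (low, (low + high) // 2, high):
        root = splay(root, key)   # a missing key pulls up the last node visited
    return root

def inorder(node):
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)

root = None
for k in (50, 30, 70, 20, 40, 60, 80):
    root = insert(root, k)
root = touch_range(root, 20, 40)   # range predicate 20 <= value <= 40
```

    After the call, the keys touched by the query sit at or next to the root while the sorted order of the tree is unchanged; a pruning pass over rarely touched leaves would be a separate step.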

    On Mobility Management in Multi-Sink Sensor Networks for Geocasting of Queries

    To deal efficiently with location-dependent messages in multi-sink wireless sensor networks (WSNs), it is key that the network informs sinks which geographical area is covered by which sink. The sinks are then able to efficiently route messages that are valid only in particular regions of the deployment. In our previous work (see the 5th and 6th cited documents), we proposed a combined coverage area reporting and geographical routing protocol for location-dependent messages, for example, queries that are injected by sinks. In this paper, we study the case where the network has static sinks and mobile sensor nodes. To provide up-to-date coverage areas to sinks, we focus on handling node mobility in the network. We discuss which method is better for updating the routing structure (i.e., routing trees and coverage areas) to handle mobility efficiently: periodic global updates initiated from sinks, or local updates triggered by mobile sensors. Simulation results show that local updating performs very well in terms of query delivery ratio and scales better with increasing network size. It is also more energy efficient than our previously proposed approach, which relies on global updating, in networks with a medium mobility rate and speed.