
    Non-linear Pattern Matching with Backtracking for Non-free Data Types

    Non-free data types are data types whose data have no canonical forms. For example, multisets are non-free data types because the multiset {a, b, b} has two other equivalent but literally different forms, {b, a, b} and {b, b, a}. Pattern matching is known to provide a handy tool set for treating such data types. Although many studies on pattern matching, and implementations in practical programming languages, have been proposed so far, we observe that none of them satisfy all the criteria of practical pattern matching, which are as follows: i) efficiency of the backtracking algorithm for non-linear patterns, ii) extensibility of the matching process, and iii) polymorphism in patterns. This paper aims to design a new pattern-matching-oriented programming language that satisfies all three criteria. The proposed language features a clean Scheme-like syntax and efficient, extensible pattern-matching semantics. It is especially useful for processing complex non-free data types, including not only multisets and sets but also graphs and symbolic mathematical expressions. We discuss the importance of our criteria of practical pattern matching and how our language design arises naturally from them. The proposed language has already been implemented and open-sourced as the Egison programming language.
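
    A minimal Python sketch (not Egison itself) of the core idea: a non-linear pattern such as {x, x, y} is matched against a multiset by trying candidate bindings and backtracking whenever the repeated variable cannot be bound consistently. The function name `match` and the pattern encoding are illustrative assumptions.

```python
def match(pattern, elems, binding=None):
    """Backtracking matcher for a multiset: `pattern` is a list of variable
    names, and a variable that appears twice (a non-linear pattern) must bind
    to equal elements.  Yields every consistent binding."""
    binding = binding or {}
    if not pattern:
        yield dict(binding)
        return
    var, rest_pattern = pattern[0], pattern[1:]
    for i, e in enumerate(elems):                  # choose an element for var
        if var in binding and binding[var] != e:   # non-linear constraint fails
            continue                               # ... so backtrack
        new_binding = {**binding, var: e}
        yield from match(rest_pattern, elems[:i] + elems[i + 1:], new_binding)

# The multiset {a, b, b} matches the pattern {x, x, y} with x = 'b' no matter
# which literal order the elements are written in.
print(next(match(['x', 'x', 'y'], ['b', 'a', 'b'])))  # {'x': 'b', 'y': 'a'}
print(next(match(['x', 'x', 'y'], ['a', 'b', 'b'])))  # {'x': 'b', 'y': 'a'}
```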

    Toward Entity-Aware Search

    As the Web has evolved into a data-rich repository, current search engines, built around the standard "page view," are becoming increasingly inadequate for a wide range of query tasks. While we often search for various data "entities" (e.g., a phone number, a paper PDF, a date), today's engines only take us indirectly to pages. In my Ph.D. study, we focus on a novel type of Web search that is aware of the data entities inside pages, a significant departure from traditional document retrieval. We study the essential aspects of supporting entity-aware Web search. To begin with, we tackle the core challenge of ranking entities by distilling its underlying conceptual model, the Impression Model, and developing a probabilistic ranking framework, EntityRank, that seamlessly integrates both local and global information in ranking. We also report on a prototype system built to show the initial promise of the proposal. Then, we aim at distilling and abstracting the essential computation requirements of entity search. From the dual views of reasoning (entity as input and entity as output), we propose a dual-inversion framework, with two indexing and partition schemes, toward efficient and scalable query processing. Further, to recognize more entity instances, we study the problem of entity synonym discovery through mining query log data. The results obtained so far show clear promise for entity-aware search in its usefulness, effectiveness, efficiency, and scalability.
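
    A toy Python sketch of the dual-inversion idea: a keyword-inverted index is kept alongside an entity-inverted index, and a simple proximity score stands in for EntityRank's probabilistic model. The corpus, the `#type:instance` token format, and the scoring rule are invented for illustration and are not the paper's implementation.

```python
from collections import defaultdict

# Toy corpus: each document is a list of tokens; tokens like '#phone:...'
# stand in for recognized entity instances of a given type.
docs = {
    1: ["call", "alice", "at", "#phone:555-0100"],
    2: ["paper", "pdf", "near", "#phone:555-0199", "office"],
    3: ["alice", "posted", "the", "paper", "pdf"],
}

# Keyword-inverted index: term -> {doc_id: positions} (the usual document view).
kw_index = defaultdict(lambda: defaultdict(list))
# Entity-inverted index: entity type -> {doc_id: (position, instance)} pairs.
ent_index = defaultdict(lambda: defaultdict(list))

for doc_id, tokens in docs.items():
    for pos, tok in enumerate(tokens):
        if tok.startswith("#"):
            etype, instance = tok[1:].split(":")
            ent_index[etype][doc_id].append((pos, instance))
        else:
            kw_index[tok][doc_id].append(pos)

def entity_query(keyword, entity_type):
    """Rank entity instances that co-occur with `keyword`, scoring by
    proximity (a crude stand-in for the local evidence EntityRank aggregates)."""
    scores = defaultdict(float)
    for doc_id, kw_positions in kw_index[keyword].items():
        for pos, instance in ent_index[entity_type].get(doc_id, []):
            dist = min(abs(pos - p) for p in kw_positions)
            scores[instance] += 1.0 / (1 + dist)
    return sorted(scores.items(), key=lambda kv: -kv[1])

# The answer is an entity instance, not a page to read through.
print(entity_query("alice", "phone"))   # [('555-0100', 0.333...)]
```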

    Optimization of Spatial Joins Using Filters

    When viewing present-day technical applications that rely on database systems, one notices that new techniques must be integrated into database management systems to support these applications efficiently. This paper discusses one such technique in the context of supporting a Geographic Information System. It is known that the use of filters on geometric objects has a significant impact on the processing of 2-way spatial join queries. For this purpose, filters require approximations of the objects. Queries can be optimized by filtering data not with just one but with several filters. Existing join methods are based on a combination of filters and a spatial index; the index is used to reduce the cost of the filter step and to minimize the cost of retrieving geometric objects from disk. In this paper we examine n-way spatial joins. Complex n-way spatial join queries require solving several 2-way joins of intermediate results. In this case, we examine not only the profit gained from using filters and spatial indices but also the additional cost these techniques incur. For 2-way joins of base relations, these costs are considered part of physical database design. We focus on the criteria for mutually comparing filters and not on those for spatial indices. Important aspects of a multi-step, filter-based n-way spatial join method are described together with performance experiments. The winning join method uses several filters with approximations that are constructed by rotating two parallel lines around the object.
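
    A small Python sketch of the filter-and-refine principle behind such join methods, using minimum bounding rectangles as the cheap approximation. The paper's winning filter is built from rotated parallel lines, which this sketch does not implement; the function names and the trivial exact predicate are illustrative.

```python
def mbr(points):
    """Minimum bounding rectangle (xmin, ymin, xmax, ymax) of a polygon,
    used as a cheap approximation in the filter step."""
    xs, ys = zip(*points)
    return (min(xs), min(ys), max(xs), max(ys))

def mbr_overlap(a, b):
    """Filter test: if the approximations do not overlap, the expensive
    exact geometry test can be skipped entirely."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def spatial_join(rel_r, rel_s, exact_predicate):
    """2-way spatial join: run the cheap MBR filter first, then apply the
    exact predicate only to candidate pairs that survive the filter."""
    candidates = [(r, s) for r in rel_r for s in rel_s
                  if mbr_overlap(mbr(r), mbr(s))]
    return [(r, s) for r, s in candidates if exact_predicate(r, s)]

r1 = [(0, 0), (2, 0), (2, 2), (0, 2)]
s1 = [(1, 1), (3, 1), (3, 3), (1, 3)]
s2 = [(10, 10), (11, 10), (11, 11), (10, 11)]
# s2 is pruned by the filter; only (r1, s1) reaches the (here trivial) exact test.
print(spatial_join([r1], [s1, s2], lambda r, s: True))
```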

    Distributed Time-Frequency Division Multiple Access Protocol For Wireless Sensor Networks

    It is well known that biology-inspired self-maintaining algorithms in wireless sensor nodes achieve near-optimum time division multiple access (TDMA) characteristics in a decentralized manner and with very low complexity. We extend such distributed TDMA approaches to multiple channels (frequencies). This is achieved by extending the concept of collaborative reactive listening in order to balance the number of nodes across all available channels. We prove the stability of the new protocol and estimate the delay until the balanced system state is reached. Our approach is benchmarked against single-channel distributed TDMA and channel-hopping approaches using TinyOS imote2 wireless sensors. Comment: 4 pages, IEEE Wireless Communications Letters, to appear in 201
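
    A simplified Python simulation of the balancing idea: a node samples another channel and hops to it whenever that channel carries noticeably fewer nodes, driving the population toward an even split. The hopping rule, round structure, and parameters are illustrative assumptions, not the protocol's actual collaborative reactive listening mechanism.

```python
import random

def balance_channels(node_channels, num_channels, rounds=500, seed=1):
    """Sketch of multi-channel balancing: each round one node samples a
    random channel and hops there only if doing so improves the balance."""
    rng = random.Random(seed)
    counts = [0] * num_channels
    for c in node_channels:
        counts[c] += 1
    for _ in range(rounds):
        node = rng.randrange(len(node_channels))
        here = node_channels[node]
        there = rng.randrange(num_channels)
        if counts[there] < counts[here] - 1:   # hop only if it reduces imbalance
            counts[here] -= 1
            counts[there] += 1
            node_channels[node] = there
    return counts

# 40 nodes that all start on channel 0 spread out over 4 channels.
print(balance_channels([0] * 40, num_channels=4))   # approximately 10 nodes per channel
```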

    Implementing TontineCoin

    One of the alternatives to proof-of-work (PoW) consensus protocols is proof-of-stake (PoS) protocols, which address PoW's energy and cost issues. However, they suffer from the nothing-at-stake problem: validators (PoS miners) stand to lose nothing if they support multiple blockchain forks. Tendermint, a PoS protocol, handles this problem by forcing validators to bond their stake and then seizing a cheater's stake when it is caught signing multiple competing blocks. The seized stake is then distributed evenly among the remaining validators. However, as the number of validators increases, the benefit of finding a cheater shrinks relative to the cost of monitoring validators, weakening the system's defense against the problem. Previous work on TontineCoin addresses this by utilizing the concept of tontines. A tontine is an investment scheme in which each participant receives a portion of the benefits based on their share; as the number of participants in a tontine decreases, individual benefit increases, which motivates participants to eliminate each other. Utilizing this feature in TontineCoin ensures that validators (the participants of a tontine) are highly motivated to monitor each other, thus strengthening the system against the nothing-at-stake problem. This project implements a prototype of Tendermint using the Spartan Gold codebase and develops TontineCoin on top of it. This is the first implementation of the protocol; it simulates and contrasts five different normal operations in both the Tendermint and TontineCoin models, and it also simulates and discusses how a nothing-at-stake attack is handled in TontineCoin compared to Tendermint.
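
    A back-of-the-envelope Python sketch of the incentive argument: an evenly split seized stake shrinks per validator as the validator set grows, while a tontine-style pool shared only by survivors grows as participants are eliminated. The numbers and payout rules are illustrative and do not reproduce TontineCoin's actual reward schedule.

```python
def even_split_reward(seized_stake, num_validators):
    """Tendermint-style: a cheater's seized stake is split evenly, so the
    per-validator reward shrinks as the validator set grows."""
    return seized_stake / num_validators

def tontine_reward(total_pool, survivors):
    """Tontine-style sketch: a fixed pool is shared only by surviving
    participants, so each elimination raises every survivor's share and
    keeps the incentive to monitor (and catch cheaters) high."""
    return total_pool / survivors

seized, pool, n = 100.0, 1000.0, 50
print(even_split_reward(seized, n))                            # 2.0 -- weak incentive at large n
print(tontine_reward(pool, n), tontine_reward(pool, n - 10))   # 20.0 25.0 -- shares grow as others drop out
```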

    MCMC with Strings and Branes: The Suburban Algorithm (Extended Version)

    Motivated by the physics of strings and branes, we develop a class of Markov chain Monte Carlo (MCMC) algorithms involving extended objects. Starting from a collection of parallel Metropolis-Hastings (MH) samplers, we place them on an auxiliary grid and couple them together via nearest-neighbor interactions. This leads to a class of "suburban samplers" (i.e., spread-out Metropolis). Coupling the samplers in this way modifies the mixing rate and speed of convergence of the Markov chain, and in many cases allows a sampler to more easily overcome free-energy barriers in a target distribution. We test these general theoretical considerations by performing several numerical experiments. For suburban samplers with a fluctuating grid topology, performance is strongly correlated with the average number of neighbors. Increasing the average number of neighbors above zero initially leads to an increase in performance, though there is a critical connectivity with effective dimension d_eff ~ 1, above which "groupthink" takes over and the performance of the sampler declines. Comment: v2: 55 pages, 13 figures, references and clarifications added. Published version. This article is an extended version of "MCMC with Strings and Branes: The Suburban Algorithm".
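
    A toy Python sketch of coupling several Metropolis-Hastings walkers through nearest-neighbor interactions on a 1-D periodic grid. The joint density with a quadratic coupling term is one simple way to realize the coupled-sampler idea and is not the paper's exact suburban construction; note that the coupling biases the marginals away from the original target unless it is tuned.

```python
import math
import random

def suburban_mh(log_target, num_walkers=8, coupling=0.1, steps=5000,
                step_size=1.0, seed=0):
    """Single-site MH updates for the joint density
        prod_i target(x_i) * prod_<ij> exp(-coupling * (x_i - x_j)^2),
    where <ij> runs over nearest neighbors on a periodic 1-D grid of walkers."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(num_walkers)]
    samples = []
    for _ in range(steps):
        for i in range(num_walkers):
            left, right = xs[(i - 1) % num_walkers], xs[(i + 1) % num_walkers]
            x_old = xs[i]
            x_new = x_old + rng.gauss(0.0, step_size)   # symmetric proposal

            def log_weight(x):
                # Only the terms involving x_i matter for the acceptance ratio.
                return (log_target(x)
                        - coupling * ((x - left) ** 2 + (x - right) ** 2))

            delta = log_weight(x_new) - log_weight(x_old)
            if delta >= 0 or rng.random() < math.exp(delta):
                xs[i] = x_new            # accept; otherwise keep the old state
        samples.append(list(xs))
    return samples

# Bimodal target with a free-energy barrier that a lone walker crosses rarely.
log_target = lambda x: math.log(math.exp(-(x - 3) ** 2) + math.exp(-(x + 3) ** 2))
chain = suburban_mh(log_target)
```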