5,335 research outputs found

    Investigating the memory requirements for publish/subscribe filtering algorithms

    Get PDF
    Various filtering algorithms for publish/subscribe systems have been proposed. One distinguishing characteristic is their internal representation of Boolean subscriptions: They either require conversions to disjunctive normal forms (canonical approaches) or are directly exploited in event filtering (non-canonical approaches). In this paper, we present a detailed analysis and comparison of the memory requirements of canonical and non-canonical filtering algorithms. This includes a theoretical analysis of space usages as well as a verification of our theoretical results by an evaluation of a practical implementation. This practical analysis also considers time (filter) efficiency, which is the other important quality measure of filtering algorithms. By correlating the results of space and time efficiency, we conclude when to use non-canonical and canonical approaches

    Arbitrary boolean advertisements: the final step in supporting the boolean publish/subscribe model

    Get PDF
    Publish/subscribe systems allow for an efficient filtering of incoming information. This filtering is based on the specifications of subscriber interests, which are registered with the system as subscriptions. Publishers conversely specify advertisements, describing the messages they will send later on. What is missing so far is the support of arbitrary Boolean advertisements in publish/subscribe systems. Introducing the opportunity to specify these richer Boolean advertisements increases the accuracy of publishers to state their future messages compared to currently supported conjunctive advertisements. Thus, the amount of subscriptions forwarded in the network is reduced. Additionally, the system can more time efficiently decide whether a subscription needs to be forwarded and more space efficiently store and index advertisements. In this paper, we introduce a publish/subscribe system that supports arbitrary Boolean advertisements and, symmetrically, arbitrary Boolean subscriptions. We show the advantages of supporting arbitrary Boolean advertisements and present an algorithm to calculate the practically required overlapping relationship among subscriptions and advertisements. Additionally, we develop the first optimization approach for arbitrary Boolean advertisements, advertisement pruning. Advertisement pruning is tailored to optimize advertisements, which is a strong contrast to current optimizations for conjunctive advertisements. These recent proposals mainly apply subscription-based optimization ideas, which is leading to the same disadvantages. In the second part of this paper, our evaluation of practical experiments, we analyze the efficiency properties of our approach to determine the overlapping relationship. We also compare conjunctive solutions for the overlapping problem to our calculation algorithm to show its benefits. Finally, we present a detailed evaluation of the optimization potential of advertisement pruning. This includes the analysis of the effects of additionally optimizing subscriptions on the advertisement pruning optimization

    A Taxonomy of Information Retrieval Models and Tools

    Get PDF
    Information retrieval is attracting significant attention due to the exponential growth of the amount of information available in digital format. The proliferation of information retrieval objects, including algorithms, methods, technologies, and tools, makes it difficult to assess their capabilities and features and to understand the relationships that exist among them. In addition, the terminology is often confusing and misleading, as different terms are used to denote the same, or similar, tasks. This paper proposes a taxonomy of information retrieval models and tools and provides precise definitions for the key terms. The taxonomy consists of superimposing two views: a vertical taxonomy, that classifies IR models with respect to a set of basic features, and a horizontal taxonomy, which classifies IR systems and services with respect to the tasks they support. The aim is to provide a framework for classifying existing information retrieval models and tools and a solid point to assess future developments in the field

    Assured information sharing for ad-hoc collaboration

    Get PDF
    Collaborative information sharing tends to be highly dynamic and often ad hoc among organizations. The dynamic natures and sharing patterns in ad-hoc collaboration impose a need for a comprehensive and flexible approach to reflecting and coping with the unique access control requirements associated with the environment. This dissertation outlines a Role-based Access Management for Ad-hoc Resource Shar- ing framework (RAMARS) to enable secure and selective information sharing in the het- erogeneous ad-hoc collaborative environment. Our framework incorporates a role-based approach to addressing originator control, delegation and dissemination control. A special trust-aware feature is incorporated to deal with dynamic user and trust management, and a novel resource modeling scheme is proposed to support fine-grained selective sharing of composite data. As a policy-driven approach, we formally specify the necessary pol- icy components in our framework and develop access control policies using standardized eXtensible Access Control Markup Language (XACML). The feasibility of our approach is evaluated in two emerging collaborative information sharing infrastructures: peer-to- peer networking (P2P) and Grid computing. As a potential application domain, RAMARS framework is further extended and adopted in secure healthcare services, with a unified patient-centric access control scheme being proposed to enable selective and authorized sharing of Electronic Health Records (EHRs), accommodating various privacy protection requirements at different levels of granularity

    Confidentiality-Preserving Publish/Subscribe: A Survey

    Full text link
    Publish/subscribe (pub/sub) is an attractive communication paradigm for large-scale distributed applications running across multiple administrative domains. Pub/sub allows event-based information dissemination based on constraints on the nature of the data rather than on pre-established communication channels. It is a natural fit for deployment in untrusted environments such as public clouds linking applications across multiple sites. However, pub/sub in untrusted environments lead to major confidentiality concerns stemming from the content-centric nature of the communications. This survey classifies and analyzes different approaches to confidentiality preservation for pub/sub, from applications of trust and access control models to novel encryption techniques. It provides an overview of the current challenges posed by confidentiality concerns and points to future research directions in this promising field

    Partition Information and its Transmission over Boolean Multi-Access Channels

    Full text link
    In this paper, we propose a novel partition reservation system to study the partition information and its transmission over a noise-free Boolean multi-access channel. The objective of transmission is not message restoration, but to partition active users into distinct groups so that they can, subsequently, transmit their messages without collision. We first calculate (by mutual information) the amount of information needed for the partitioning without channel effects, and then propose two different coding schemes to obtain achievable transmission rates over the channel. The first one is the brute force method, where the codebook design is based on centralized source coding; the second method uses random coding where the codebook is generated randomly and optimal Bayesian decoding is employed to reconstruct the partition. Both methods shed light on the internal structure of the partition problem. A novel hypergraph formulation is proposed for the random coding scheme, which intuitively describes the information in terms of a strong coloring of a hypergraph induced by a sequence of channel operations and interactions between active users. An extended Fibonacci structure is found for a simple, but non-trivial, case with two active users. A comparison between these methods and group testing is conducted to demonstrate the uniqueness of our problem.Comment: Submitted to IEEE Transactions on Information Theory, major revisio

    LibraRing: An Architecture for Distributed Digital Libraries Based on DHTs

    Full text link
    Abstract. We present a digital library architecture based on distributed hash tables. We discuss the main components of this architecture and the protocols for offering information retrieval and information filtering functionality. We present an experimental evaluation of our proposals.
    corecore