
    A customizable multi-agent system for distributed data mining

    We present a general multi-agent system framework for distributed data mining based on a peer-to-peer model. Agent protocols are implemented through message-based asynchronous communication. The framework adopts a dynamic load balancing policy that is particularly suitable for irregular search algorithms. A modular design separates the general-purpose system protocols and software components from the specific data mining algorithm. The framework has been evaluated experimentally on a parallel frequent subgraph mining algorithm, which showed good scalability.

    High performance subgraph mining in molecular compounds

    Structured data represented as graphs arises in several fields of science, and the growing amount of available data makes distributed graph mining techniques particularly relevant. In this paper, we present a distributed approach to the frequent subgraph mining problem to discover interesting patterns in molecular compounds. The problem is characterized by a highly irregular search tree, for which no reliable workload prediction is available. We describe the three main aspects of the proposed distributed algorithm, namely a dynamic partitioning of the search space, a distribution process based on a peer-to-peer communication framework, and a novel receiver-initiated load balancing algorithm. The effectiveness of the distributed method has been evaluated on the well-known National Cancer Institute’s HIV-screening dataset, where the approach attains close-to-linear speedup on a network of workstations.
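    The idea of receiver-initiated load balancing on an irregular search tree can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the algorithm from the paper: the tree generator `children`, the choice of the busiest peer as donor, and the "give away the older half of the queue" rule are all hypothetical. In a real peer-to-peer setting an idle worker would poll peers by message rather than inspect their queues directly.

```python
from collections import deque

def children(node):
    """Hypothetical irregular search tree: fan-out varies with the node."""
    depth, value = node
    if depth >= 6:
        return []
    fanout = (value * 7 + depth) % 4  # 0 to 3 children, uneven on purpose
    return [(depth + 1, value * 4 + i + 1) for i in range(fanout)]

def count_sequential(root):
    """Reference count of tree nodes using a single depth-first worker."""
    stack, visited = [root], 0
    while stack:
        node = stack.pop()
        visited += 1
        stack.extend(children(node))
    return visited

def simulate(root, n_workers=4):
    """Round-based simulation; idle workers request work (receiver-initiated)."""
    queues = [deque() for _ in range(n_workers)]
    queues[0].append(root)
    processed = [0] * n_workers
    requests = 0
    while any(queues):
        for w in range(n_workers):
            if not queues[w]:
                # Simplification: the idle worker picks the busiest peer.
                donor = max(range(n_workers), key=lambda p: len(queues[p]))
                if len(queues[donor]) > 1:
                    requests += 1
                    # Donor hands over the older half of its pending nodes.
                    for _ in range(len(queues[donor]) // 2):
                        queues[w].append(queues[donor].popleft())
                continue  # this worker starts processing in the next round
            node = queues[w].pop()
            processed[w] += 1
            queues[w].extend(children(node))
    return processed, requests

root = (0, 1)
processed, requests = simulate(root)
print("nodes per worker:", processed, "work requests:", requests)
```

    No workload estimate is needed because work moves only when a worker runs dry, which is the property that matters when subtree sizes cannot be predicted.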

    Efficient mining of discriminative molecular fragments

    Frequent pattern discovery in structured data is receiving increasing attention in many scientific application areas. However, the computational complexity and the large amount of data to be explored often make sequential algorithms unsuitable. In this context, high-performance distributed computing becomes a very interesting and promising approach. In this paper we present a parallel formulation of the frequent subgraph mining problem to discover interesting patterns in molecular compounds. The application is characterized by a highly irregular tree-structured computation. No estimate is available for task workloads, which follow a power-law distribution over a wide range. The proposed approach allows dynamic resource aggregation and provides fault and latency tolerance. These features make the distributed application suitable for multi-domain heterogeneous environments, such as computational Grids. The distributed application has been evaluated on the well-known National Cancer Institute’s HIV-screening dataset.

    Perfect tag identification protocol in RFID networks

    Radio Frequency IDentification (RFID) systems are becoming increasingly popular in the field of ubiquitous computing, in particular for object identification. An RFID system is composed of one or more readers and a number of tags. One of the main issues in an RFID network is the fast and reliable identification of all tags in the reader range. The reader issues queries, and the tags answer. The reader must then identify the tags from these answers, which is crucial for most applications. Since the transmission medium is shared, the typical problem to be faced is a MAC-like one, i.e. avoiding or limiting the number of collisions among tag transmissions. We propose a protocol which, under some assumptions about transmission techniques, always achieves 100% performance. It is based on a proper recursive splitting of the sets of concurrent tags until all tags have been identified. The other approaches in the literature achieve an average performance of about 42% at most. The trade-off is that more sophisticated hardware must be deployed in the manufacture of low-cost tags. Comment: 12 pages, 1 figure
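    For orientation, recursive splitting of colliding tag sets can be sketched with the classic query-tree scheme, in which the reader extends a bit prefix whenever more than one tag answers. This is a generic baseline chosen for illustration, not the protocol proposed in the abstract. The function name `query_tree`, the string encoding of tag IDs, and the efficiency measure (tags identified per reader query) are assumptions made here.

```python
def query_tree(tag_ids):
    """Identify all tags by recursive prefix splitting.

    Assumes tag IDs are unique bit strings of equal length.
    Returns the identified IDs and the number of reader queries (slots).
    """
    tags = sorted(set(tag_ids))
    identified, slots = [], 0
    pending = [""]  # prefixes still to be queried, starting with "all tags"
    while pending:
        prefix = pending.pop()
        slots += 1
        responders = [t for t in tags if t.startswith(prefix)]
        if len(responders) == 1:
            identified.append(responders[0])  # exactly one answer: success
        elif len(responders) > 1:
            # Collision: split the colliding set on the next ID bit.
            pending.append(prefix + "1")
            pending.append(prefix + "0")
        # No responders: an idle slot, nothing to do.
    return identified, slots

tags = ["0001", "0010", "0111", "1000", "1011", "1110"]
found, slots = query_tree(tags)
print(f"identified {len(found)} tags in {slots} slots, "
      f"efficiency {len(found) / slots:.2f}")
```

    In this baseline every collision and every idle slot costs a query, so the efficiency stays below 1 whenever tags collide.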