A customizable multi-agent system for distributed data mining

Authors: Di Fatta, Giuseppe; Fortino, Giancarlo
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2007

We present a general Multi-Agent System framework for distributed data mining based on a Peer-to-Peer model. Agent protocols are implemented through message-based asynchronous communication. The framework adopts a dynamic load balancing policy that is particularly suitable for irregular search algorithms. A modular design separates the general-purpose system protocols and software components from the specific data mining algorithm. The experimental evaluation was carried out on a parallel frequent subgraph mining algorithm, which showed good scalability.

CiteSeerX

High performance subgraph mining in molecular compounds

Authors: M.J. Zaki; O. Weislow; R. Finkel; T. Washio; Y. Chung
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2005

Structured data represented in the form of graphs arises in several fields of science, and the growing amount of available data makes distributed graph mining techniques particularly relevant. In this paper, we present a distributed approach to the frequent subgraph mining problem to discover interesting patterns in molecular compounds. The problem is characterized by a highly irregular search tree, for which no reliable workload prediction is available. We describe the three main aspects of the proposed distributed algorithm, namely a dynamic partitioning of the search space, a distribution process based on a peer-to-peer communication framework, and a novel receiver-initiated load balancing algorithm. The effectiveness of the distributed method has been evaluated on the well-known National Cancer Institute’s HIV-screening dataset, where the approach attains close-to-linear speedup in a network of workstations.
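To illustrate what "receiver-initiated" load balancing over an irregular search tree can look like, here is a minimal single-process simulation in Python. It is a sketch under stated assumptions, not the algorithm from the paper: the function names (`simulate`, `children`), the round-robin polling of peers, and the "take half of the donor's queue" rule are all hypothetical choices made for this example.

```python
from collections import deque


def simulate(initial_tasks, n_workers, children):
    """Simulate receiver-initiated load balancing (illustrative only).

    initial_tasks: tasks (search-tree nodes) given to worker 0.
    children:      hypothetical callback returning the child tasks of a task;
                   an irregular tree yields very uneven subtree sizes.
    Returns the number of tasks processed by each worker.
    """
    queues = [deque() for _ in range(n_workers)]
    queues[0].extend(initial_tasks)
    processed = [0] * n_workers

    while any(queues):  # at least one worker still holds work
        for w, q in enumerate(queues):
            if not q:
                # Receiver-initiated: the idle worker asks its peers for work.
                # Assumption: peers are polled in round-robin order.
                for offset in range(1, n_workers):
                    donor = queues[(w + offset) % n_workers]
                    if len(donor) > 1:
                        # Assumption: the donor hands over half of its queue,
                        # taken from the oldest (shallowest) end.
                        for _ in range(len(donor) // 2):
                            q.append(donor.popleft())
                        break
            if q:
                task = q.pop()            # depth-first on local work
                processed[w] += 1
                q.extend(children(task))  # expand the search-tree node
    return processed


def binomial_children(n):
    """Irregular tree: node n has children 0..n-1, so subtrees differ widely."""
    return list(range(n))


if __name__ == "__main__":
    print(simulate([5], n_workers=3, children=binomial_children))
```

Because tasks are only moved between queues and never copied, every node of the tree is processed exactly once regardless of how the work ends up distributed; only the per-worker split depends on the polling policy.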

Efficient mining of discriminative molecular fragments

Authors: Berthold, Michael R.; Di Fatta, Giuseppe
Publication date: 01/01/2005

Frequent pattern discovery in structured data is receiving increasing attention in many scientific application areas. However, the computational complexity and the large amount of data to be explored often make sequential algorithms unsuitable. In this context, high-performance distributed computing becomes a very interesting and promising approach. In this paper we present a parallel formulation of the frequent subgraph mining problem to discover interesting patterns in molecular compounds. The application is characterized by a highly irregular tree-structured computation. No estimation is available for task workloads, which follow a power-law distribution over a wide range. The proposed approach allows dynamic resource aggregation and provides fault and latency tolerance. These features make the distributed application suitable for multi-domain heterogeneous environments, such as computational Grids. The distributed application has been evaluated on the well-known National Cancer Institute’s HIV-screening dataset.

Perfect tag identification protocol in RFID networks

Authors: Bonuccelli, Maurizio A.; Lonetti, Francesca; Martelli, Francesca
Publication date: 13/05/2008

Radio Frequency IDentification (RFID) systems are becoming increasingly popular in the field of ubiquitous computing, in particular for object identification. An RFID system is composed of one or more readers and a number of tags. One of the main issues in an RFID network is the fast and reliable identification of all tags in the reader's range. The reader issues queries and the tags answer; the reader must then identify the tags from these answers. This is crucial for most applications. Since the transmission medium is shared, the typical problem to be faced is a MAC-like one, i.e. avoiding or limiting the number of tag transmission collisions. We propose a protocol which, under some assumptions about transmission techniques, always achieves 100% performance. It is based on a proper recursive splitting of the concurrent tag sets, until all tags have been identified. The other approaches in the literature achieve an average performance of about 42% at most. The drawback is that more sophisticated hardware must be deployed in the manufacture of low-cost tags.

Comment: 12 pages, 1 figure
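The idea of recursively splitting a colliding set of tags can be sketched with the classic query-tree scheme, shown below in Python. This is not the protocol proposed in the paper (which relies on specific transmission assumptions to avoid wasted slots); it is the plain textbook variant, included only to show the splitting recursion. The function name `identify_tags`, the bit-string tag IDs, and the slot-counting convention are assumptions made for this example, and tag IDs are assumed to be unique and of equal length.

```python
def identify_tags(tag_ids, prefix=""):
    """Walk a query tree over binary tag IDs (illustrative sketch).

    The reader broadcasts `prefix`; every tag whose ID starts with it answers.
    Returns (identified_ids, slots_used), where each query costs one slot.
    """
    responders = [t for t in tag_ids if t.startswith(prefix)]

    if not responders:
        return [], 1                 # idle slot: nobody answered
    if len(responders) == 1:
        return [responders[0]], 1    # success slot: exactly one tag answered

    # Collision slot: split the colliding set on the next ID bit and recurse.
    identified, slots = [], 1
    for bit in "01":
        ids, used = identify_tags(responders, prefix + bit)
        identified.extend(ids)
        slots += used
    return identified, slots


if __name__ == "__main__":
    tags = ["000", "011", "101", "110"]
    found, slots = identify_tags(tags)
    print(found, slots, len(found) / slots)
```

For the four tags in the usage example, the walk needs 7 query slots (3 collisions and 4 successes), so only 4 of the 7 slots carry a successful identification. Collision and idle slots of this kind are what keep such schemes well below 100% slot efficiency.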

arXiv.org e-Print Archive