369 research outputs found

    Parallel Mining of Association Rules Using a Lattice Based Approach

    The discovery of interesting patterns from database transactions is one of the major problems in knowledge discovery in databases. One such pattern is the set of association rules extracted from these transactions. Parallel algorithms are required for mining association rules because of the very large databases used to store the transactions. In this paper we present and implement a parallel, lattice-based algorithm for mining association rules. The Dynamic Distributed Rule Mining (DDRM) algorithm partitions the lattice into sublattices that are assigned to processors for the identification of frequent itemsets. Experimental results show that DDRM utilizes the processors efficiently and performs better than the prefix-based and partition algorithms, which use a static approach to assign classes to the processors. The DDRM algorithm scales well and shows good speedup.
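    As a rough illustration of the sublattice idea, and not of DDRM itself, the sketch below mines the itemset lattice by prefix-based equivalence classes, one class per worker process; the transactions, support threshold, and function names are illustrative assumptions.

```python
# Illustrative sketch: itemsets sharing the same smallest item form a sublattice
# (prefix-based equivalence class) that one worker can mine independently.
# This is a static split; DDRM's dynamic assignment is not reproduced here.
from itertools import combinations
from multiprocessing import Pool
from functools import partial

TRANSACTIONS = [{'a', 'b', 'c'}, {'a', 'c'}, {'a', 'b'}, {'b', 'c'}, {'a', 'b', 'c'}]
MIN_SUPPORT = 3

def mine_prefix_class(items, prefix):
    """Mine all frequent itemsets whose smallest item is `prefix` (one sublattice)."""
    frequent = []
    rest = [i for i in items if i > prefix]
    for k in range(len(rest) + 1):
        for tail in combinations(rest, k):
            candidate = {prefix, *tail}
            support = sum(1 for t in TRANSACTIONS if candidate <= t)
            if support >= MIN_SUPPORT:
                frequent.append((tuple(sorted(candidate)), support))
    return frequent

if __name__ == '__main__':
    items = sorted({i for t in TRANSACTIONS for i in t})
    with Pool() as pool:  # one prefix-based equivalence class per task
        per_class = pool.map(partial(mine_prefix_class, items), items)
    print([itemset for cls in per_class for itemset in cls])
    # [(('a',), 4), (('a', 'b'), 3), (('a', 'c'), 3), (('b',), 4), (('b', 'c'), 3), (('c',), 4)]
```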

    Incremental Processing and Optimization of Update Streams

    In recent years, we have seen an increasing number of applications in networking, sensor networks, cloud computing, and environmental monitoring that monitor, plan, control, and make decisions over data streams from multiple sources. We are interested in extending traditional stream processing techniques to meet the new challenges of these applications. In general, to support genuine continuous query optimization and processing over data streams, we need to understand systematically how to address incremental optimization and processing of update streams for a rich class of queries commonly used in these applications. Our general thesis is that efficient incremental processing and re-optimization of update streams can be achieved by various incremental view maintenance techniques if we cast the problems as incremental view maintenance problems over data streams. We focus on two challenges in the incremental processing of update streams that are not addressed in existing work on stream query processing: incremental processing of transitive closure queries over data streams, and incremental re-optimization of queries. In addition to addressing these specific challenges, we also develop a working prototype system, Aspen, which serves as an end-to-end stream processing system and has been deployed as the foundation for a case study of our SmartCIS application. We validate our solutions both analytically and empirically on top of our prototype system Aspen, over a variety of benchmark workloads such as the TPC-H and Linear Road benchmarks.
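    A minimal sketch of the first challenge, assuming an insert-only edge stream: the transitive closure is maintained incrementally with the standard delta rule that joins predecessors of the new edge's source with successors of its target. This is a generic illustration, not the Aspen system's actual operators; names and data are made up.

```python
# Incremental transitive closure over an edge-insertion stream: for a new edge
# (u, v), add all pairs (x, y) with x reaching u and v reaching y.
def insert_edge(closure, u, v):
    """closure is a set of (source, target) pairs, closed under reachability."""
    sources = {x for (x, t) in closure if t == u} | {u}
    targets = {y for (s, y) in closure if s == v} | {v}
    delta = {(x, y) for x in sources for y in targets} - closure
    closure |= delta
    return delta  # newly derived pairs, available to downstream operators

closure = set()
for edge in [('a', 'b'), ('b', 'c'), ('c', 'd')]:  # toy update stream
    insert_edge(closure, *edge)
print(sorted(closure))
# [('a', 'b'), ('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd'), ('c', 'd')]
```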

    Implementation and Evaluation of Algorithmic Skeletons: Parallelisation of Computer Algebra Algorithms

    This thesis presents design and implementation approaches for parallel algorithms of computer algebra. We use algorithmic skeletons as well as further approaches, such as data parallel arithmetic and actors. We have implemented skeletons for divide and conquer algorithms and for some special parallel loops that we call ‘repeated computation with a possibility of premature termination’. We also introduce a rational data parallel arithmetic; we focus on parallel symbolic computation algorithms, for which this arithmetic provides a generic parallelisation approach. The implementation is carried out in Eden, a parallel functional programming language based on Haskell. This choice enables us to encode both the skeletons and the programs in the same language, and it allows us to avoid using two different languages, one for the implementation and one for the interface, in our implementation of computer algebra algorithms.

Further, this thesis presents methods for evaluating and estimating parallel execution times. We partition the parallel execution time into two components: one accounts for the quality of the parallelisation, and we call it the ‘parallel penalty’; the other is the sequential execution time. For the estimation, we predict both components separately, using statistical methods. This enables very confident estimations while using drastically fewer measurement points than other methods. We have applied both our evaluation and our estimation approaches to the parallel programs presented in this thesis, and we have also used existing estimation methods.

We developed divide and conquer skeletons for the implementation of fast parallel multiplication. We have implemented the Karatsuba algorithm, Strassen's matrix multiplication algorithm and the fast Fourier transform; the latter was used to implement polynomial convolution, which leads to a further fast multiplication algorithm. Specifically for our implementation of Strassen's algorithm we designed and implemented a divide and conquer skeleton based on actors. For the parallel fast Fourier transform we not only used new divide and conquer skeletons but also developed a map-and-transpose skeleton, which enables good parallelisation of the Fourier transform. The parallelisation of Karatsuba multiplication shows very good performance. We analysed the parallel penalty of our programs and compared it to the serial fraction, an approach known from the literature. We also performed execution time estimations of our divide and conquer programs.

This thesis also presents a parallel map+reduce skeleton scheme. It allows us to combine the usual parallel map skeletons, like parMap, farm and workpool, with a premature termination property. We use this to implement the so-called ‘parallel repeated computation’, a special form of a speculative parallel loop. We have implemented two probabilistic primality tests, the Rabin–Miller test and the Jacobi sum test, and parallelised both with our approach. We analysed the task distribution and determined fitting configurations for the Jacobi sum test. We have shown formally that the Jacobi sum test can be implemented in parallel; we then parallelised it, analysed the load balancing issues, and produced an optimisation, which enabled a good implementation, as verified using the parallel penalty. We have also estimated the performance of the tests for further input sizes and numbers of processing elements.
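    The following sketch, in Python rather than Eden, shows the general shape of such a divide and conquer skeleton under stated assumptions: the problem is expanded to a fixed depth, the leaves are solved in parallel by a process pool, and results are combined bottom-up. The skeleton and the example instance are illustrative and do not reproduce the thesis's Eden skeletons.

```python
# Depth-limited divide and conquer skeleton with parallel leaf evaluation.
from concurrent.futures import ProcessPoolExecutor

def dc_skeleton(trivial, solve, split, combine, problem, depth=2):
    """Expand to `depth`, solve leaves in a process pool, combine bottom-up."""
    def expand(p, d):
        if d == 0 or trivial(p):
            return ('leaf', p)
        return ('node', [expand(q, d - 1) for q in split(p)])

    def leaves(t):
        if t[0] == 'leaf':
            yield t[1]
        else:
            for child in t[1]:
                yield from leaves(child)

    tree = expand(problem, depth)
    with ProcessPoolExecutor() as pool:
        solved = iter(list(pool.map(solve, leaves(tree))))  # leaf results, in order

    def rebuild(t):
        if t[0] == 'leaf':
            return next(solved)
        return combine([rebuild(child) for child in t[1]])

    return rebuild(tree)

# Example instance: summing a list by halving (a stand-in for Karatsuba- or
# Strassen-style splits, whose split/combine functions would replace these).
if __name__ == '__main__':
    result = dc_skeleton(
        trivial=lambda xs: len(xs) <= 1000,
        solve=sum,
        split=lambda xs: [xs[:len(xs) // 2], xs[len(xs) // 2:]],
        combine=sum,
        problem=list(range(100000)),
    )
    print(result)  # 4999950000
```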
The parallelisation of the Jacobi sum test and our generic parallelisation scheme for the repeated computation are our original contributions. The data parallel arithmetic was defined not only for integers, which is already known, but also for rationals. We handle the common factors of the numerator or denominator of a fraction with the modulus in a novel manner; this is required to obtain a true multiple-residue arithmetic, a novel result of our research. Using these mathematical advances, we parallelised determinant computation using Gauß elimination. As before, we performed a task distribution analysis and an estimation of the parallel execution time of our implementation. A similar computation in Maple emphasised the potential of our approach. Data parallel arithmetic enables the parallelisation of entire classes of computer algebra algorithms. Summarising, this thesis presents and thoroughly evaluates new and existing design decisions for high-level parallelisations of computer algebra algorithms.
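    A minimal sketch of the underlying multiple-residue idea for rationals, assuming every denominator is coprime to each modulus; the thesis's novel treatment of common factors between numerator or denominator and the modulus is not reproduced. Moduli and values are illustrative.

```python
# Multiple-residue representation of rationals: a/b maps to a * b^(-1) mod m for
# several pairwise coprime moduli; addition and multiplication act componentwise,
# which is what makes the representation data parallel. Requires Python 3.8+ for
# pow(x, -1, m) modular inverses.
from fractions import Fraction

MODULI = (101, 103, 107)  # small primes, chosen only for illustration

def to_residues(q: Fraction):
    return tuple((q.numerator * pow(q.denominator, -1, m)) % m for m in MODULI)

def add(x, y):
    return tuple((a + b) % m for a, b, m in zip(x, y, MODULI))

def mul(x, y):
    return tuple((a * b) % m for a, b, m in zip(x, y, MODULI))

a, b = Fraction(3, 7), Fraction(5, 11)
lhs = mul(add(to_residues(a), to_residues(b)), to_residues(Fraction(2)))
rhs = to_residues((a + b) * 2)  # same value computed directly on rationals
assert lhs == rhs
print(lhs)
```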

    Managing distributed flexible manufacturing systems

    For several years, research has focused on many aspects of manufacturing, from individual production processes to the management of virtual enterprises, but several aspects, such as coordination and control, still present relevant problems in industry and remain challenging areas of research. The application of advanced technologies and informational tools by itself does not guarantee the success of control and integration applications. In order to achieve a high degree of integration and efficiency, it is necessary to match the technologies and tools with models that describe the existing knowledge and functionality in the system and allow a correct understanding of its behaviour. In a global and wide market competition, manufacturing systems present requirements that lead to distributed, self-organised, co-operative and heterogeneous control applications. A Distributed Flexible Manufacturing System (DFMS) is a goal-driven and data-directed dynamic system designed to provide an effective operation sequence for the products, to fulfil the production goals, to meet real-time requirements and to optimally allocate resources.

In this work, a layered approach for modelling such production systems is first proposed. According to this representation, a DFMS may be seen as a multi-layer resource graph in which the vertices of a layer represent interacting resources and a layer at level l is represented by a node in the layer at level (l-1). Two models are then developed for two relevant managerial issues in a DFMS: the task mapping problem and the task scheduling problem with multiple shared resources.

The task mapping problem concerns the balanced partition of a given set of jobs and the assignment of the parts to the resources of the manufacturing system. We study the case in which the jobs are fairly homogeneous and have no precedence constraints but need some communication to be coordinated; assigning jobs to different parts therefore causes a relevant communication effort between those parts, increasing the managerial complexity. We show that the standard models usually used to formally represent this problem are inadequate. Through some graph-theoretical results we relate the problem to the well-known hypergraph partitioning problem and briefly survey the best techniques for solving it. A new formulation of the problem is then presented, and some considerations on an improved version of the formulation permit the computation of a good lower bound on the optimal solution in the case of hypergraph bisection.

The task scheduling problem with multiple shared resources is addressed for a robotic cell. We study the general problem of sequencing multiple jobs, where each job consists of multiple ordered tasks and task execution requires the simultaneous usage of several resources. NP-completeness results are given, and a heuristic with a guaranteed approximation result is designed and evaluated.
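    As a small illustration of the task mapping model, and not of the thesis's formulation, the sketch below treats jobs as vertices of a hypergraph whose hyperedges are coordination groups, and evaluates a candidate bisection by its load balance and the number of hyperedges cut; all data and names are illustrative.

```python
# Evaluate a candidate bisection of a job hypergraph: report the load on each
# of the two resources and how many coordination groups (hyperedges) are split
# across them, i.e. the communication cost of the mapping.
def bisection_cost(jobs, hyperedges, part_of):
    loads = {0: 0, 1: 0}
    for j in jobs:
        loads[part_of[j]] += 1
    cut = sum(1 for e in hyperedges if len({part_of[j] for j in e}) > 1)
    return loads, cut

jobs = ['j1', 'j2', 'j3', 'j4']
hyperedges = [{'j1', 'j2'}, {'j2', 'j3', 'j4'}]  # coordination groups
part_of = {'j1': 0, 'j2': 0, 'j3': 1, 'j4': 1}   # candidate bisection
print(bisection_cost(jobs, hyperedges, part_of))  # ({0: 2, 1: 2}, 1)
```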

    Fifth Biennial Report : June 1999 - August 2001
