
    Dynamic load balancing for the distributed mining of molecular structures

    In molecular biology, it is often desirable to find common properties in large numbers of drug candidates. One family of methods stems from the data mining community, where algorithms to find frequent graphs have received increasing attention in recent years. However, the computational complexity of the underlying problem and the large amount of data to be explored essentially render sequential algorithms useless. In this paper, we present a distributed approach to the frequent subgraph mining problem to discover interesting patterns in molecular compounds. This problem is characterized by a highly irregular search tree, for which no reliable workload prediction is available. We describe the three main aspects of the proposed distributed algorithm, namely, a dynamic partitioning of the search space, a distribution process based on a peer-to-peer communication framework, and a novel receiver-initiated load balancing algorithm. The effectiveness of the distributed method has been evaluated on the well-known National Cancer Institute’s HIV-screening data set, where we were able to show close-to-linear speedup in a network of workstations. The proposed approach also allows for dynamic resource aggregation in a non-dedicated computational environment. These features make it suitable for large-scale, multi-domain, heterogeneous environments, such as computational grids.

    Tupleware: Redefining Modern Analytics

    There is a fundamental discrepancy between the targeted and actual users of current analytics frameworks. Most systems are designed for the data and infrastructure of the Googles and Facebooks of the world: petabytes of data distributed across large cloud deployments consisting of thousands of cheap commodity machines. Yet, the vast majority of users operate clusters ranging from a few to a few dozen nodes, analyze relatively small datasets of up to a few terabytes, and perform primarily compute-intensive operations. Targeting these users fundamentally changes the way we should build analytics systems. This paper describes the design of Tupleware, a new system specifically aimed at the challenges faced by the typical user. Tupleware's architecture brings together ideas from the database, compiler, and programming languages communities to create a powerful end-to-end solution for data analysis. We propose novel techniques that consider the data, computations, and hardware together to achieve maximum performance on a case-by-case basis. Our experimental evaluation quantifies the impact of our novel techniques and shows orders-of-magnitude performance improvements over alternative systems.

    Control-data separation architecture for cellular radio access networks: a survey and outlook

    Conventional cellular systems are designed to ensure ubiquitous coverage with an always-present wireless channel, irrespective of the spatial and temporal demand for service. This approach raises several problems due to the tight coupling between network and data access points, as well as the paradigm shift towards data-oriented services, heterogeneous deployments and network densification. A logical separation between control and data planes is seen as a promising solution that could overcome these issues, by providing data services under the umbrella of a coverage layer. This article presents a holistic survey of existing literature on the control-data separation architecture (CDSA) for cellular radio access networks. As a starting point, we discuss the fundamentals, concepts, and general structure of the CDSA. Then, we point out limitations of the conventional architecture in futuristic deployment scenarios. In addition, we present and critically discuss the work that has been done to investigate potential benefits of the CDSA, as well as its technical challenges and enabling technologies. Finally, an overview of standardisation proposals related to this research vision is provided.