
    Knowledge-centric Analytics Queries Allocation in Edge Computing Environments

    The Internet of Things involves a huge number of devices that collect data and deliver it to the Cloud. Processing data in the Cloud increases the latency of responses to analytics queries defined by analysts or applications. Hence, Edge Computing (EC) comes into play to provide data processing close to the source. The collected data can be stored on edge devices, and queries can be executed there to reduce latency. In this paper, we envision a case where entities located in the Cloud undertake the responsibility of receiving analytics queries and deciding on the most appropriate edge nodes for query execution. The decision is based on statistical signatures of the nodes' datasets and on the statistical matching between those statistics and the analytics queries. Edge nodes regularly update their statistical signatures to support this decision process. Our performance evaluation shows the advantages and the shortcomings of our proposed scheme in edge computing environments.
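As a rough illustration of the allocation idea in this abstract, the sketch below matches a range query against per-node signatures and picks the best-matching node. The abstract does not specify the signature contents or the matching rule, so everything here is an assumption: the `Signature` and `RangeQuery` types, the min/max summary, and the interval-overlap score are hypothetical stand-ins, not the paper's method.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Signature:
    """Hypothetical summary an edge node reports for one numeric attribute."""
    node_id: str
    minimum: float
    maximum: float


@dataclass(frozen=True)
class RangeQuery:
    """Hypothetical analytics query over the interval [low, high]."""
    low: float
    high: float


def overlap_score(sig: Signature, query: RangeQuery) -> float:
    """Fraction of the query range covered by the node's [minimum, maximum]."""
    width = query.high - query.low
    if width <= 0:
        return 0.0
    covered = min(sig.maximum, query.high) - max(sig.minimum, query.low)
    return max(covered, 0.0) / width


def allocate(query: RangeQuery, signatures: Iterable[Signature]) -> Optional[str]:
    """Return the id of the best-matching node, or None if no node overlaps."""
    best = max(signatures, key=lambda s: overlap_score(s, query), default=None)
    if best is None or overlap_score(best, query) == 0.0:
        return None
    return best.node_id


# Assumed example data: three edge nodes with different value ranges.
signatures = [
    Signature("edge-a", 0.0, 10.0),
    Signature("edge-b", 8.0, 30.0),
    Signature("edge-c", 40.0, 60.0),
]

chosen = allocate(RangeQuery(5.0, 25.0), signatures)
```

In this toy setup the Cloud-side allocator only needs the signatures, not the raw data, which is why nodes would have to refresh them regularly as their datasets change.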

    GraphX: Unifying Data-Parallel and Graph-Parallel Analytics

    From social networks to language modeling, the growing scale and importance of graph data has driven the development of numerous new graph-parallel systems (e.g., Pregel, GraphLab). By restricting the computation that can be expressed and introducing new techniques to partition and distribute the graph, these systems can efficiently execute iterative graph algorithms orders of magnitude faster than more general data-parallel systems. However, the same restrictions that enable the performance gains also make it difficult to express many of the important stages in a typical graph-analytics pipeline: constructing the graph, modifying its structure, or expressing computation that spans multiple graphs. As a consequence, existing graph analytics pipelines compose graph-parallel and data-parallel systems using external storage systems, leading to extensive data movement and a complicated programming model. To address these challenges we introduce GraphX, a distributed graph computation framework that unifies graph-parallel and data-parallel computation. GraphX provides a small, core set of graph-parallel operators expressive enough to implement the Pregel and PowerGraph abstractions, yet simple enough to be cast in relational algebra. GraphX uses a collection of query optimization techniques, such as automatic join rewrites, to efficiently implement these graph-parallel operators. We evaluate GraphX on real-world graphs and workloads and demonstrate that GraphX achieves performance comparable to that of specialized graph computation systems, while outperforming them in end-to-end graph pipelines. Moreover, GraphX achieves a balance between expressiveness, performance, and ease of use.
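To make the "graph-parallel operators cast in relational algebra" idea concrete, here is a minimal single-machine sketch of a Pregel-style superstep written as a join followed by a group-by. It is plain Python, not the GraphX API: the function names `superstep` and `connected_components`, the dictionary-and-list "tables", and the choice of minimum-label propagation as the example algorithm are all assumptions made for illustration.

```python
def superstep(vertices, edges):
    """One label-propagation round expressed as a join and a group-by."""
    # Join: pair each edge with its source vertex's label -> (dst, label) messages.
    messages = [(dst, vertices[src]) for src, dst in edges]
    # Group-by destination, reducing with min to keep the smallest label received.
    inbox = {}
    for dst, label in messages:
        inbox[dst] = min(label, inbox.get(dst, label))
    # Join back: each vertex keeps the smaller of its own label and its inbox value.
    return {v: min(label, inbox.get(v, label)) for v, label in vertices.items()}


def connected_components(vertex_ids, undirected_edges):
    """Repeat supersteps until no label changes; equal labels mean same component."""
    # Store each undirected edge in both directions so labels flow both ways.
    edges = list(undirected_edges) + [(b, a) for a, b in undirected_edges]
    labels = {v: v for v in vertex_ids}
    while True:
        updated = superstep(labels, edges)
        if updated == labels:
            return labels
        labels = updated


# Assumed toy graph: one path 1-2-3 and one separate edge 4-5.
components = connected_components([1, 2, 3, 4, 5], [(1, 2), (2, 3), (4, 5)])
```

Because each round is only a join and an aggregation over two tables, a relational optimizer has room to reorder or rewrite those joins, which is the kind of opportunity the abstract attributes to GraphX.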