
    Algorithmic patterns for $\mathcal{H}$-matrices on many-core processors

    In this work, we consider the reformulation of hierarchical ($\mathcal{H}$) matrix algorithms for many-core processors, with a model implementation on graphics processing units (GPUs). $\mathcal{H}$ matrices approximate specific dense matrices, e.g., from discretized integral equations or kernel ridge regression, leading to log-linear time complexity in dense matrix-vector products. The parallelization of $\mathcal{H}$ matrix operations on many-core processors is difficult due to the complex nature of the underlying algorithms. While previous algorithmic advances for many-core hardware focused on accelerating existing $\mathcal{H}$ matrix CPU implementations with many-core processors, we aim here at relying entirely on that processor type. As our main contribution, we introduce the necessary parallel algorithmic patterns that allow the full $\mathcal{H}$ matrix construction and the fast matrix-vector product to be mapped to many-core hardware. The crucial ingredients are space-filling curves, parallel tree traversal, and batching of linear algebra operations. The resulting model GPU implementation, hmglib, is, to the best of the authors' knowledge, the first entirely GPU-based open-source $\mathcal{H}$ matrix library of this kind. We conclude with an in-depth performance analysis and a comparative performance study against a standard $\mathcal{H}$ matrix library, highlighting profound speedups of our many-core parallel approach.
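The abstract names space-filling curves as one ingredient but does not say which curve is used. As a minimal sketch, assuming a Z-order (Morton) curve over 2D integer coordinates, the following shows how such a curve turns spatial clustering into a plain sort, which is the kind of operation that parallelizes well. The function names `morton2d` and `z_order_sort` are hypothetical and are not taken from hmglib.

```python
def morton2d(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Z-order key.

    Bit i of x lands at position 2*i, bit i of y at position 2*i + 1.
    """
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def z_order_sort(points, bits: int = 16):
    """Sort 2D integer points along the Z-order curve (illustrative only)."""
    return sorted(points, key=lambda p: morton2d(p[0], p[1], bits))
```

Points that are close in the plane tend to receive nearby keys, so contiguous ranges of the sorted array can serve as clusters in a tree built over the point set.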

    An Enhanced Source Location Privacy based on Data Dissemination in Wireless Sensor Networks (DeLP)

    A wireless sensor network is a network of a large number of nodes with limited power and computational capabilities. It has the potential for event monitoring in unattended locations where there is a chance of unauthorized access. The work presented here identifies and addresses the problem of eavesdropping in the exposed environment of the sensor network, which makes it easy for the adversary to trace packets back to the originating source node, hence compromising contextual privacy. Our scheme provides an enhanced three-level security system for source location privacy. The base station is at the center of a square grid of four quadrants and is surrounded by a ring of flooding nodes, which act as a first step in confusing the adversary. A fake node is deployed in the quadrant opposite the actual source and starts reporting to the base station. The selection of a phantom node in another quadrant using our algorithm provides the third level of confusion. The results show that Dissemination in Wireless Sensor Networks (DeLP) reduces energy utilization by 50% and increases the safety period by 26%, while providing a six times higher packet delivery ratio along with a further 15% decrease in packet delivery delay compared to the tree-based scheme. It also provides a 334% longer safety period than phantom routing, while it lags behind in other parameters due to the simplicity of the phantom scheme. This work illustrates the privacy protection of the source node, and the designed procedure may be useful in designing more robust algorithms for location privacy.

    Improving the scalability of parallel N-body applications with an event driven constraint based execution model

    The scalability and efficiency of graph applications are significantly constrained by conventional systems and their supporting programming models. Technology trends like multicore, manycore, and heterogeneous system architectures are introducing further challenges and possibilities for emerging application domains such as graph applications. This paper explores the space of effective parallel execution of ephemeral graphs that are dynamically generated using the Barnes-Hut algorithm to exemplify dynamic workloads. The workloads are expressed using the semantics of an Exascale computing execution model called ParalleX. For comparison, results using conventional execution model semantics are also presented. We find improved load balancing during runtime and automatic parallelism discovery, both of which improve efficiency, when using the advanced semantics for Exascale computing.

    HardIDX: Practical and Secure Index with SGX

    Software-based approaches for search over encrypted data are still challenged either by a lack of proper, low-leakage encryption or by slow performance. Existing hardware-based approaches do not scale well due to hardware limitations and software designs that are not specifically tailored to the hardware architecture, and they are rarely well analyzed for their security (e.g., the impact of side channels). Additionally, existing hardware-based solutions often have a large code footprint in the trusted environment, which is susceptible to software compromises. In this paper we present HardIDX: a hardware-based approach, leveraging Intel's SGX, for search over encrypted data. It implements only the security-critical core, i.e., the search functionality, in the trusted environment and resorts to untrusted software for the remainder. HardIDX is deployable as a highly performant encrypted database index: it is logarithmic in the size of the index, and searches are performed within a few milliseconds rather than seconds. We formally model and prove the security of our scheme, showing that its leakage is equivalent to that of the best known searchable encryption schemes. Our implementation has a very small code and memory footprint yet still scales to virtually unlimited search index sizes, i.e., size is limited only by the general (non-secure) hardware resources.

    Single failure resiliency in greedy routing

    Using greedy routing, network nodes forward packets towards neighbors that are closer to the destination. This approach makes greedy routers significantly more memory-efficient than traditional IP routers using longest-prefix matching. Greedy embeddings map network nodes to coordinates such that greedy routing always leads to the destination. Prior works showed that, using a spanning tree of the network topology, greedy embeddings can be found in different metric spaces for any graph. However, a single link or node failure might affect the greedy embedding and cause packets to reach a dead end. In order to cope with network failures, existing greedy methods require large resources and cause a significant loss in routing quality (stretch loss). We propose efficient recovery techniques that require very limited resources and have only a minor effect on the stretch. As the proposed techniques are protection techniques, the switch-over takes place very quickly. The low overhead, simplicity, and scalability of the methods make them suitable for large-scale networks. The proposed schemes are validated on large topologies with properties similar to the Internet. The performance of the schemes is compared with an existing alternative referred to as gravity pressure routing.
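The forwarding rule and the dead-end problem described in this abstract can be illustrated with a minimal sketch. It assumes nodes embedded in the Euclidean plane and a hypothetical helper `greedy_route`; it shows plain greedy forwarding only and does not implement the recovery techniques the paper proposes.

```python
import math


def greedy_route(coords, adj, src, dst):
    """Forward hop by hop to the neighbor closest to dst.

    coords: node -> (x, y) coordinates (Euclidean embedding is an assumption)
    adj:    node -> list of neighbor nodes
    Returns (path, delivered). delivered is False when the current node has
    no neighbor strictly closer to dst than itself, i.e. a dead end.
    """
    path = [src]
    cur = src
    while cur != dst:
        here = math.dist(coords[cur], coords[dst])
        best = min(
            adj[cur],
            key=lambda n: math.dist(coords[n], coords[dst]),
            default=None,
        )
        if best is None or math.dist(coords[best], coords[dst]) >= here:
            return path, False  # dead end: no neighbor makes progress
        cur = best
        path.append(cur)
    return path, True
```

Because the distance to the destination strictly decreases at every hop, the loop always terminates. Removing a single link from `adj` is enough to turn a successful route into a dead end, which is the failure case the abstract addresses.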