81,172 research outputs found
Algorithmic patterns for -matrices on many-core processors
In this work, we consider the reformulation of hierarchical ()
matrix algorithms for many-core processors with a model implementation on
graphics processing units (GPUs). matrices approximate specific
dense matrices, e.g., from discretized integral equations or kernel ridge
regression, leading to log-linear time complexity in dense matrix-vector
products. The parallelization of matrix operations on many-core
processors is difficult due to the complex nature of the underlying algorithms.
While previous algorithmic advances for many-core hardware focused on
accelerating existing matrix CPU implementations by many-core
processors, we here aim at totally relying on that processor type. As main
contribution, we introduce the necessary parallel algorithmic patterns allowing
to map the full matrix construction and the fast matrix-vector
product to many-core hardware. Here, crucial ingredients are space filling
curves, parallel tree traversal and batching of linear algebra operations. The
resulting model GPU implementation hmglib is the, to the best of the authors
knowledge, first entirely GPU-based Open Source matrix library of
this kind. We conclude this work by an in-depth performance analysis and a
comparative performance study against a standard matrix library,
highlighting profound speedups of our many-core parallel approach
An Enhanced Source Location Privacy based on Data Dissemination in Wireless Sensor Networks (DeLP)
open access articleWireless Sensor Network is a network of large number of nodes with limited power and computational capabilities. It has the potential of event monitoring in unattended locations where there is a chance of unauthorized access. The work that is presented here identifies and addresses the problem of eavesdropping in the exposed environment of the sensor network, which makes it easy for the adversary to trace the packets to find the originator source node, hence compromising the contextual privacy. Our scheme provides an enhanced three-level security system for source location privacy. The base station is at the center of square grid of four quadrants and it is surrounded by a ring of flooding nodes, which act as a first step in confusing the adversary. The fake node is deployed in the opposite quadrant of actual source and start reporting base station. The selection of phantom node using our algorithm in another quadrant provides the third level of confusion. The results show that Dissemination in Wireless Sensor Networks (DeLP) has reduced the energy utilization by 50% percent, increased the safety period by 26%, while providing a six times more packet delivery ratio along with a further 15% decrease in the packet delivery delay as compared to the tree-based scheme. It also provides 334% more safety period than the phantom routing, while it lags behind in other parameters due to the simplicity of phantom scheme. This work illustrates the privacy protection of the source node and the designed procedure may be useful in designing more robust algorithms for location privac
Improving the scalability of parallel N-body applications with an event driven constraint based execution model
The scalability and efficiency of graph applications are significantly
constrained by conventional systems and their supporting programming models.
Technology trends like multicore, manycore, and heterogeneous system
architectures are introducing further challenges and possibilities for emerging
application domains such as graph applications. This paper explores the space
of effective parallel execution of ephemeral graphs that are dynamically
generated using the Barnes-Hut algorithm to exemplify dynamic workloads. The
workloads are expressed using the semantics of an Exascale computing execution
model called ParalleX. For comparison, results using conventional execution
model semantics are also presented. We find improved load balancing during
runtime and automatic parallelism discovery improving efficiency using the
advanced semantics for Exascale computing.Comment: 11 figure
HardIDX: Practical and Secure Index with SGX
Software-based approaches for search over encrypted data are still either
challenged by lack of proper, low-leakage encryption or slow performance.
Existing hardware-based approaches do not scale well due to hardware
limitations and software designs that are not specifically tailored to the
hardware architecture, and are rarely well analyzed for their security (e.g.,
the impact of side channels). Additionally, existing hardware-based solutions
often have a large code footprint in the trusted environment susceptible to
software compromises. In this paper we present HardIDX: a hardware-based
approach, leveraging Intel's SGX, for search over encrypted data. It implements
only the security critical core, i.e., the search functionality, in the trusted
environment and resorts to untrusted software for the remainder. HardIDX is
deployable as a highly performant encrypted database index: it is logarithmic
in the size of the index and searches are performed within a few milliseconds
rather than seconds. We formally model and prove the security of our scheme
showing that its leakage is equivalent to the best known searchable encryption
schemes. Our implementation has a very small code and memory footprint yet
still scales to virtually unlimited search index sizes, i.e., size is limited
only by the general - non-secure - hardware resources
Single failure resiliency in greedy routing
Using greedy routing, network nodes forward packets towards neighbors which are closer to their destination. This approach makes greedy routers significantly more memory-efficient than traditional IP-routers using longest-prefix matching. Greedy embeddings map network nodes to coordinates, such that greedy routing always leads to the destination. Prior works showed that using a spanning tree of the network topology, greedy embeddings can be found in different metric spaces for any graph. However, a single link/node failure might affect the greedy embedding and causes the packets to reach a dead end. In order to cope with network failures, existing greedy methods require large resources and cause significant loss in the quality of the routing (stretch loss). We propose efficient recovery techniques which require very limited resources with minor effect on the stretch. As the proposed techniques are protection, the switch-over takes place very fast. Low overhead, simplicity and scalability of the methods make them suitable for large-scale networks. The proposed schemes are validated on large topologies with properties similar to the Internet. The performances of the schemes are compared with an existing alternative referred as gravity pressure routing
- …