7,225 research outputs found
Fast Freenet: Improving Freenet Performance by Preferential Partition Routing and File Mesh Propagation
The Freenet Peer-to-Peer network is doing a good job
in providing anonymity to the users. But the performance
of the network in terms of download speed and request hit
ratio is not that good.
We propose two modifications to Freenet in order to improve
the download speed and request hit ratio for all participants.
To improve download speed we propose Preferential
Partition Routing, where nodes are grouped according
to bandwidth and slow nodes are discriminated when routing.
For improvements in request hit ratio we propose File
Mesh propagation where each node sends fuzzy information
about what documents it posesses to its neigbors.
To verify our proposals we simulate the Freenet network
and the bandwidth restrictions present between nodes as
well as using observed distributions for user actions to show
how it affects the network.
Our results show an improvement of the request hit ratio
by over 30 times and an increase of the average download
speed with six times, compared to regular Freenet routing
Advanced Message Routing for Scalable Distributed Simulations
The Joint Forces Command (JFCOM) Experimentation Directorate (J9)'s recent Joint Urban Operations (JUO)
experiments have demonstrated the viability of Forces Modeling and Simulation in a distributed environment. The
JSAF application suite, combined with the RTI-s communications system, provides the ability to run distributed
simulations with sites located across the United States, from Norfolk, Virginia to Maui, Hawaii. Interest-aware
routers are essential for communications in the large, distributed environments, and the current RTI-s framework
provides such routers connected in a straightforward tree topology. This approach is successful for small to medium
sized simulations, but faces a number of significant limitations for very large simulations over high-latency, wide
area networks. In particular, traffic is forced through a single site, drastically increasing distances messages must
travel to sites not near the top of the tree. Aggregate bandwidth is limited to the bandwidth of the site hosting the
top router, and failures in the upper levels of the router tree can result in widespread communications losses
throughout the system.
To resolve these issues, this work extends the RTI-s software router infrastructure to accommodate more
sophisticated, general router topologies, including both the existing tree framework and a new generalization of the
fully connected mesh topologies used in the SF Express ModSAF simulations of 100K fully interacting vehicles.
The new software router objects incorporate the scalable features of the SF Express design, while optionally using
low-level RTI-s objects to perform actual site-to-site communications. The (substantial) limitations of the original
mesh router formalism have been eliminated, allowing fully dynamic operations. The mesh topology capabilities
allow aggregate bandwidth and site-to-site latencies to match actual network performance. The heavy resource load at
the root node can now be distributed across routers at the participating sites
NoCo: ILP-based worst-case contention estimation for mesh real-time manycores
Manycores are capable of providing the computational demands required by functionally-advanced critical applications in domains such as automotive and avionics. In manycores a network-on-chip (NoC) provides access to shared caches and memories and hence concentrates most of the contention that tasks suffer, with effects on the worst-case contention delay (WCD) of packets and tasks' WCET. While several proposals minimize the impact of individual NoC parameters on WCD, e.g. mapping and routing, there are strong dependences among these NoC parameters. Hence, finding the optimal NoC configurations requires optimizing all parameters simultaneously, which represents a multidimensional optimization problem. In this paper we propose NoCo, a novel approach that combines ILP and stochastic optimization to find NoC configurations in terms of packet routing, application mapping, and arbitration weight allocation. Our results show that NoCo improves other techniques that optimize a subset of NoC parameters.This work has been partially supported by the Spanish Ministry of Economy and Competitiveness under grant TIN2015-
65316-P and the HiPEAC Network of Excellence. It also received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (agreement No. 772773). Carles Hernández
is jointly supported by the MINECO and FEDER funds
through grant TIN2014-60404-JIN. Jaume Abella has been
partially supported by the Spanish Ministry of Economy and
Competitiveness under Ramon y Cajal postdoctoral fellowship
number RYC-2013-14717. Enrico Mezzetti has been partially
supported by the Spanish Ministry of Economy and Competitiveness
under Juan de la Cierva-Incorporaci´on postdoctoral
fellowship number IJCI-2016-27396.Peer ReviewedPostprint (author's final draft
3D video coding and transmission
The capture, transmission, and display of
3D content has gained a lot of attention in the last few
years. 3D multimedia content is no longer con fined to
cinema theatres but is being transmitted using stereoscopic
video over satellite, shared on Blu-RayTMdisks,
or sent over Internet technologies. Stereoscopic displays
are needed at the receiving end and the viewer needs to
wear special glasses to present the two versions of the
video to the human vision system that then generates
the 3D illusion. To be more e ffective and improve the
immersive experience, more views are acquired from a
larger number of cameras and presented on di fferent displays,
such as autostereoscopic and light field displays.
These multiple views, combined with depth data, also
allow enhanced user experiences and new forms of interaction
with the 3D content from virtual viewpoints.
This type of audiovisual information is represented by a
huge amount of data that needs to be compressed and
transmitted over bandwidth-limited channels. Part of
the COST Action IC1105 \3D Content Creation, Coding
and Transmission over Future Media Networks" (3DConTourNet)
focuses on this research challenge.peer-reviewe
Improving performance guarantees in wormhole mesh NoC designs
Wormhole-based mesh Networks-on-Chip (wNoC) are deployed in high-performance many-core processors due to their physical scalability and low-cost. Delivering tight and time composable Worst-Case Execution Time (WCET) estimates for applications as needed in safety-critical real-time embedded systems is challenged by wNoCs due to their distributed nature. We propose a bandwidth control mechanism for wNoCs that enables the computation of tight time-composable WCET estimates with low average performance degradation and high scalability. Our evaluation
with the EEMBC automotive suite and an industrial real-time parallel avionics application confirms so.The research leading to these results is funded by the European Union Seventh
Framework Programme under grant agreement no. 287519 (parMERASA)
and by the Ministry of Science and Technology of Spain under contract TIN2012-34557. Milos Panic is funded by the Spanish Ministry of Education under the FPU grant FPU12/05966. Carles Hernández is jointly funded by the
Spanish Ministry of Economy and Competitiveness and FEDER funds through
grant TIN2014-60404-JIN. Jaume Abella is partially supported by the Ministry
of Economy and Competitiveness under Ramon y Cajal postdoctoral fellowship
number RYC-2013-14717.Peer ReviewedPostprint (author's final draft
Achieving Efficient Strong Scaling with PETSc using Hybrid MPI/OpenMP Optimisation
The increasing number of processing elements and decreas- ing memory to core
ratio in modern high-performance platforms makes efficient strong scaling a key
requirement for numerical algorithms. In order to achieve efficient scalability
on massively parallel systems scientific software must evolve across the entire
stack to exploit the multiple levels of parallelism exposed in modern
architectures. In this paper we demonstrate the use of hybrid MPI/OpenMP
parallelisation to optimise parallel sparse matrix-vector multiplication in
PETSc, a widely used scientific library for the scalable solution of partial
differential equations. Using large matrices generated by Fluidity, an open
source CFD application code which uses PETSc as its linear solver engine, we
evaluate the effect of explicit communication overlap using task-based
parallelism and show how to further improve performance by explicitly load
balancing threads within MPI processes. We demonstrate a significant speedup
over the pure-MPI mode and efficient strong scaling of sparse matrix-vector
multiplication on Fujitsu PRIMEHPC FX10 and Cray XE6 systems
- …