52 research outputs found

    A powerful heuristic for telephone gossiping

    A refined heuristic for computing schedules for gossiping in the telephone model is presented. The heuristic is fast: for a network with n nodes and m edges, requiring R rounds for gossiping, the running time is O(R n log(n) m) for all tested classes of graphs. This moderate time consumption makes it possible to compute gossiping schedules for networks with more than 10,000 PUs and 100,000 connections. The heuristic is good: in practice the computed schedules never exceed the optimum by more than a few rounds. The heuristic is versatile: it can also be used for broadcasting and more general information dispersion patterns. It can handle both the unit-cost and the linear-cost model. In fact, the heuristic is so good that for CCC, shuffle-exchange, butterfly, de Bruijn, star and pancake networks the constructed gossiping schedules are better than the best theoretically derived ones. For example, for gossiping on a shuffle-exchange network with 2^{13} PUs, the former upper bound was 49 rounds, while our heuristic finds a schedule requiring 31 rounds. For broadcasting, too, the heuristic improves on many formerly known results. A second heuristic works even better for CCC, butterfly, star and pancake networks. For example, with this heuristic we found that gossiping on a pancake network with 7! PUs can be performed in 15 rounds, 2 fewer than achieved by the best theoretical construction. This second heuristic is less versatile than the first, but by refined search techniques it can tackle even larger problems, the main limitation being the storage capacity. Another advantage is that the constructed schedules can be represented concisely.
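
To make the telephone model concrete, the sketch below simulates a gossiping schedule: each round is a set of calls along disjoint edges, and the two endpoints of a call exchange everything they know. It is not the heuristic from the abstract. The function names (`run_gossip`, `hypercube`) are hypothetical, and the hypercube with its dimension-by-dimension schedule is only a textbook example chosen for illustration.

```python
def run_gossip(n, edges, schedule):
    """Simulate gossiping in the telephone model.

    schedule is a list of rounds; each round is a list of calls (u, v).
    Returns the round in which every node knows every item, or None.
    """
    edge_set = {frozenset(e) for e in edges}
    known = [{v} for v in range(n)]  # node v starts with its own item only
    for rnd, calls in enumerate(schedule, start=1):
        busy = set()
        for u, v in calls:
            if frozenset((u, v)) not in edge_set:
                raise ValueError(f"round {rnd}: ({u}, {v}) is not an edge")
            if u in busy or v in busy:
                raise ValueError(f"round {rnd}: a node takes part in two calls")
            busy.update((u, v))
        # Calls within a round are disjoint, so applying them in order is safe.
        for u, v in calls:
            merged = known[u] | known[v]
            known[u] = merged
            known[v] = set(merged)
        if all(len(k) == n for k in known):
            return rnd
    return None


def hypercube(d):
    """Illustrative example: d-dimensional hypercube and its dimension-order schedule."""
    n = 1 << d
    edges = [(u, u ^ (1 << b)) for u in range(n) for b in range(d)
             if u < (u ^ (1 << b))]
    schedule = [[(u, u ^ (1 << b)) for u in range(n) if u < (u ^ (1 << b))]
                for b in range(d)]
    return n, edges, schedule


if __name__ == "__main__":
    n, edges, schedule = hypercube(3)
    print(run_gossip(n, edges, schedule))
```

A checker of this kind is useful mainly for validating schedules produced by some other method; it says nothing about how to find a short schedule.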

    Simulation Of Multi-core Systems And Interconnections And Evaluation Of Fat-Mesh Networks

    Simulators are very important in computer architecture research as they enable the exploration of new architectures and detailed performance evaluation without building costly physical hardware. Simulation is even more critical for studying future many-core architectures, as it provides the opportunity to assess computer systems that do not yet exist. In this thesis, a multiprocessor simulator is presented based on a cycle-accurate architecture simulator called SESC. The shared L2 cache system is extended into a distributed shared cache (DSC) with a directory-based cache coherency protocol. A mesh network module is extended and integrated into SESC to replace the bus for scalable inter-processor communication. While these efforts complete an extended multiprocessor simulation infrastructure, two interconnection enhancements are proposed and evaluated. The first is a novel non-uniform fat-mesh network structure, similar in idea to the fat-tree. This non-uniform mesh network takes advantage of the average traffic pattern, typically all-to-all in DSC, to dedicate additional links to connections with heavy traffic (e.g., near the center) and fewer links to connections with lighter traffic (e.g., near the periphery). Two fat-mesh schemes are implemented based on different routing algorithms. Analytical fat-mesh models are constructed by presenting the expressions for the traffic requirements of personalized all-to-all traffic. Performance improvements over the uniform mesh are demonstrated in the results from the simulator. The second enhancement is a hybrid network consisting of one packet switching plane and multiple circuit switching planes. The circuit switching planes provide fast paths between neighbors with heavy communication traffic. A compiler technique that abstracts the symbolic expressions of benchmarks' communication patterns can be used to help facilitate the circuit establishment.
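
The observation that all-to-all traffic loads central links more heavily than peripheral ones can be checked with a small count. The sketch below assumes a k x k mesh and dimension-order (XY) routing; the abstract does not say which routing algorithms the thesis uses, so that choice and the name `xy_link_loads` are assumptions made for illustration only.

```python
from collections import Counter


def xy_link_loads(k):
    """Count how many source-destination pairs cross each directed link
    of a k x k mesh under all-to-all traffic with XY routing
    (move along the row first, then along the column)."""
    loads = Counter()
    nodes = [(r, c) for r in range(k) for c in range(k)]
    for src in nodes:
        for dst in nodes:
            if src == dst:
                continue
            r, c = src
            while c != dst[1]:  # X phase: stay in the source row
                nc = c + (1 if dst[1] > c else -1)
                loads[((r, c), (r, nc))] += 1
                c = nc
            while r != dst[0]:  # Y phase: stay in the destination column
                nr = r + (1 if dst[0] > r else -1)
                loads[((r, c), (nr, c))] += 1
                r = nr
    return loads


if __name__ == "__main__":
    loads = xy_link_loads(4)
    print("edge link  :", loads[((0, 0), (0, 1))])
    print("center link:", loads[((0, 1), (0, 2))])
```

Under these assumptions an eastward link between columns c and c+1 carries (c+1)(k-c-1)k flows, which peaks in the middle of the mesh; this is the kind of imbalance that motivates giving central connections more links.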

    Heuristics for Message Broadcasting in Arbitrary Networks

    With the increasing popularity of interconnection networks, efficient information dissemination has become a popular research area. Broadcasting is one of the information dissemination primitives. Finding the optimal broadcasting scheme for any originator in an arbitrary network has been proved to be an NP-hard problem. In this thesis, two new heuristics that generate broadcast schemes in arbitrary networks are presented. Both of them have O(|E|) time complexity. Moreover, in the broadcast schemes generated by the heuristics, each vertex in the network receives the message via a shortest path. Based on computer simulations of these heuristics on some commonly used topologies and network models, and on a comparison of the results with the best existing heuristics, we conclude that the new heuristics show comparable performance while having lower complexity.
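
As a rough illustration of a broadcast scheme in which every vertex receives the message via a shortest path, the sketch below builds a BFS tree from the originator and lets each vertex inform its children one per round, serving the child with the slowest subtree first. This is not either of the thesis's heuristics, and because it sorts children it does not meet their O(|E|) bound; the function names are hypothetical.

```python
from collections import deque


def bfs_tree(adj, source):
    """Return children lists of a BFS (shortest-path) tree rooted at source."""
    parent = {source: None}
    children = {v: [] for v in adj}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                children[u].append(v)
                queue.append(v)
    return children


def tree_broadcast_time(children, root):
    """Rounds needed to inform the whole tree when each informed vertex
    calls one child per round, slowest subtree first."""
    order, stack = [], [root]
    while stack:  # preorder: every parent appears before its children
        u = stack.pop()
        order.append(u)
        stack.extend(children[u])
    t = {}
    for u in reversed(order):  # children are processed before parents
        times = sorted((t[c] for c in children[u]), reverse=True)
        t[u] = max((i + 1 + ti for i, ti in enumerate(times)), default=0)
    return t[root]


if __name__ == "__main__":
    cycle4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
    print(tree_broadcast_time(bfs_tree(cycle4, 0), 0))
```

Restricting calls to a single BFS tree is a simplification: a heuristic is free to pick, for each vertex, which of several shortest-path predecessors informs it, and that choice is where most of the quality difference arises.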

    CiFHER: A Chiplet-Based FHE Accelerator with a Resizable Structure

    Fully homomorphic encryption (FHE) is in the spotlight as a definitive solution for privacy, but the high computational overhead of FHE poses a challenge to its practical adoption. Although prior studies have attempted to design ASIC accelerators to mitigate the overhead, their designs require excessive amounts of chip resources (e.g., areas) to contain and process massive data for FHE operations. We propose CiFHER, a chiplet-based FHE accelerator with a resizable structure, to tackle the challenge with a cost-effective multi-chip module (MCM) design. First, we devise a flexible architecture of a chiplet core whose configuration can be adjusted to conform to the global organization of chiplets and design constraints. The distinctive feature of our core is a recomposable functional unit providing varying computational throughput for number-theoretic transform (NTT), the most dominant function in FHE. Then, we establish generalized data mapping methodologies to minimize the network overhead when organizing the chips into the MCM package in a tiled manner, which becomes a significant bottleneck due to the technology constraints of MCMs. Also, we analyze the effectiveness of various algorithms, including a novel limb duplication algorithm, on the MCM architecture. A detailed evaluation shows that a CiFHER package composed of 4 to 64 compact chiplets provides performance comparable to state-of-the-art monolithic ASIC FHE accelerators with significantly lower package-wide power consumption while reducing the area of a single core to as small as 4.28 mm^2. Comment: 15 pages, 9 figures
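
For readers unfamiliar with the NTT, the following is a deliberately naive sketch of the transform and its use for multiplying polynomials. The parameters (modulus 17, length 4, root of unity 4) are toy values chosen here for illustration and have nothing to do with CiFHER's parameters; FHE schemes typically use much larger moduli, fast O(n log n) variants, and a negacyclic form, none of which is shown.

```python
P = 17      # toy prime modulus (assumption for illustration)
OMEGA = 4   # primitive 4th root of unity mod 17: 4**2 = 16 = -1, 4**4 = 1


def ntt(a, omega=OMEGA, p=P):
    """Naive O(n^2) number-theoretic transform: A[k] = sum_j a[j] * omega^(j*k) mod p."""
    n = len(a)
    return [sum(a[j] * pow(omega, j * k, p) for j in range(n)) % p
            for k in range(n)]


def intt(A, omega=OMEGA, p=P):
    """Inverse transform: use omega^-1 and scale by n^-1 (modular inverses)."""
    n = len(A)
    inv_n = pow(n, -1, p)          # requires Python 3.8+
    inv_omega = pow(omega, -1, p)
    return [inv_n * x % p for x in ntt(A, inv_omega, p)]


def cyclic_convolution(a, b):
    """Multiply two length-4 coefficient vectors modulo x^4 - 1 and modulo P."""
    return intt([x * y % P for x, y in zip(ntt(a), ntt(b))])


if __name__ == "__main__":
    # (1 + 2x) * (3 + 4x) = 3 + 10x + 8x^2
    print(cyclic_convolution([1, 2, 0, 0], [3, 4, 0, 0]))
```

The point of the transform is that polynomial multiplication becomes an element-wise product in the transformed domain, which is why NTT throughput dominates the cost of FHE operations.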

    Models and Solution Techniques for Frequency Assignment Problems

    Wireless communication is used in many different situations such as mobile telephony, radio and TV broadcasting, satellite communication, and military operations. In each of these situations a frequency assignment problem arises with application-specific characteristics. Researchers have developed different modeling ideas for each of the features of the problem, such as the handling of interference among radio signals, the availability of frequencies, and the optimization criterion. This survey gives an overview of the models and methods that the literature provides on the topic. We present a broad description of the practical settings in which frequency assignment is applied. We also present a classification of the different models and formulations described in the literature, such that the common features of the models are emphasized. The solution methods are divided into two parts: optimization and lower bounding techniques on the one hand, and heuristic search techniques on the other. The literature is classified according to the methods used. Again, we emphasize the common features of the different papers. The quality of the solution methods is compared, whenever possible, on publicly available benchmark instances.

    A general upper bound on broadcast function B(n) using Knodel graph

    Broadcasting in a graph is the process of transmitting a message from one vertex, the originator, to all other vertices of the graph. We consider the classical model in which an informed vertex can inform only one of its uninformed neighbours during each time unit. A broadcast graph on n vertices is a graph in which broadcasting can be completed in ⌈log_2 n⌉ time units from any originator. A minimum broadcast graph on n vertices is a broadcast graph that has the least possible number of edges, B(n), over all broadcast graphs on n vertices. This thesis extends the study of broadcasting by applying a vertex deletion method to a specific graph topology, namely the Knodel graph, in order to construct broadcast graphs on an odd number of vertices. This construction provides an improved general upper bound on B(n) for all odd n except when n=2^k−1.
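
The ⌈log_2 n⌉ bound follows because the number of informed vertices can at most double in each time unit. The sketch below illustrates a graph that attains it, using the commonly cited definition of the Knodel graph on n = 2^k vertices: vertices (i, j) with i in {1, 2} and 0 <= j < n/2, and an edge of dimension d between (1, j) and (2, (j + 2^d - 1) mod n/2). Both that definition and the dimension-order calling scheme are assumptions brought in for illustration; the thesis's vertex deletion construction for odd n is not reproduced here.

```python
def knodel_dimensions(n):
    """Edges of the Knodel graph on n = 2^k vertices, grouped by dimension.
    Each dimension is a perfect matching between the two halves."""
    if n < 2 or n & (n - 1):
        raise ValueError("this sketch only handles n a power of two, n >= 2")
    k = n.bit_length() - 1  # k = log2(n)
    half = n // 2
    return [[((1, j), (2, (j + 2**d - 1) % half)) for j in range(half)]
            for d in range(k)]


def broadcast_rounds(n, origin=(1, 0)):
    """Broadcast from origin: in round d+1 every informed vertex calls its
    neighbour along dimension d. Returns the round in which all n vertices
    are informed, or None if the scheme does not finish."""
    informed = {origin}
    for d, matching in enumerate(knodel_dimensions(n)):
        partner = {}
        for a, b in matching:
            partner[a] = b
            partner[b] = a
        # One call per informed vertex per round, as the classical model requires.
        informed |= {partner[v] for v in informed}
        if len(informed) == n:
            return d + 1
    return None


if __name__ == "__main__":
    for n in (4, 8, 16):
        print(n, broadcast_rounds(n))
```

For these small cases the informed set doubles every round, so the scheme finishes in exactly log_2 n rounds.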
