A network flow model for load balancing in circuit-switched multicomputers
In multicomputers that utilize circuit switching or wormhole routing, communication overhead depends largely on link contention - the variation due to distance between nodes is negligible. This has a major impact on the load balancing problem. In this case, there are some nodes with excess load (sources) and others with deficit load (sinks) and it is required to find a matching of sources to sinks that avoids contention. The problem is made complex by the hardwired routing on currently available machines: the user can control only which nodes communicate but not how the messages are routed. Network flow models of message flow in the mesh and the hypercube were developed to solve this problem. The crucial property of these models is the correspondence between minimum cost flows and correctly routed messages. To solve a given load balancing problem, a minimum cost flow algorithm is applied to the network. This permits one to determine efficiently a maximum contention free matching of sources to sinks which, in turn, tells one how much of the given imbalance can be eliminated without contention
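As a rough illustration of the matching idea, the sketch below runs a generic successive-shortest-path minimum cost flow on a toy network. It is not the mesh or hypercube model from the abstract, which encodes the machine's hardwired routing. The class `MinCostFlow`, the helper `line_example`, the unit link capacities, and the unit link costs are all assumptions made for this example.

```python
from collections import deque

class MinCostFlow:
    """Generic successive-shortest-path min cost flow (illustrative only)."""

    def __init__(self, n):
        self.n = n
        # Each edge is [to, residual capacity, cost, index of reverse edge].
        self.graph = [[] for _ in range(n)]

    def add_edge(self, u, v, cap, cost):
        self.graph[u].append([v, cap, cost, len(self.graph[v])])
        self.graph[v].append([u, 0, -cost, len(self.graph[u]) - 1])

    def solve(self, s, t):
        """Return (max flow, min cost of that flow) from s to t."""
        flow = cost = 0
        while True:
            # Queue-based Bellman-Ford: cheapest augmenting path from s.
            dist = [float("inf")] * self.n
            dist[s] = 0
            prev = [None] * self.n  # (previous node, edge index)
            in_queue = [False] * self.n
            queue = deque([s])
            while queue:
                u = queue.popleft()
                in_queue[u] = False
                for idx, (v, cap, c, _) in enumerate(self.graph[u]):
                    if cap > 0 and dist[u] + c < dist[v]:
                        dist[v] = dist[u] + c
                        prev[v] = (u, idx)
                        if not in_queue[v]:
                            in_queue[v] = True
                            queue.append(v)
            if prev[t] is None:
                return flow, cost
            # Find the bottleneck capacity along the path, then push it.
            push, v = float("inf"), t
            while v != s:
                u, idx = prev[v]
                push = min(push, self.graph[u][idx][1])
                v = u
            v = t
            while v != s:
                u, idx = prev[v]
                edge = self.graph[u][idx]
                edge[1] -= push
                self.graph[v][edge[3]][1] += push
                v = u
            flow += push
            cost += push * dist[t]

def line_example():
    """Toy 4-node line 0-1-2-3: nodes 0, 1 have excess load; 2, 3 have deficit."""
    S, T = 4, 5  # hypothetical super-source and super-sink
    net = MinCostFlow(6)
    for i in range(3):  # one unit of capacity per link direction
        net.add_edge(i, i + 1, 1, 1)
        net.add_edge(i + 1, i, 1, 1)
    for src in (0, 1):
        net.add_edge(S, src, 1, 0)
    for snk in (2, 3):
        net.add_edge(snk, T, 1, 0)
    return net.solve(S, T)
```

In this toy case the single link between nodes 1 and 2 limits the contention-free transfer to one unit, so only part of the imbalance can be removed, which is the kind of answer the abstract says the flow computation provides.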
Multiphase complete exchange on a circuit switched hypercube
On a distributed memory parallel computer, the complete exchange (all-to-all personalized) communication pattern requires each of $n$ processors to send a different block of data to each of the remaining $n - 1$ processors. This pattern is at the heart of many important algorithms, most notably the matrix transpose. For a circuit-switched hypercube of dimension $d$ ($n = 2^d$), two algorithms for achieving complete exchange are known. These are (1) the Standard Exchange approach, which employs $d$ transmissions of $2^{d-1}$ blocks each and is useful for small block sizes, and (2) the Optimal Circuit Switched algorithm, which employs $2^d - 1$ transmissions of 1 block each and is best for large block sizes. A unified multiphase algorithm is described that includes these two algorithms as special cases. The complete exchange on a hypercube of dimension $d$ and block size $m$ is achieved by carrying out $k$ partial exchanges on subcubes of dimension $d_i$, where $\sum_{i=1}^{k} d_i = d$, with effective block size $m_i = m 2^{d - d_i}$. When $k = d$ and all $d_i = 1$, this corresponds to algorithm (1) above. For the case of $k = 1$ and $d_1 = d$, this becomes the circuit-switched algorithm (2). Changing the subcube dimensions $d_i$ varies the effective block size and permits a compromise between the data permutation and block transmission overhead of (1) and the startup overhead of (2). For a hypercube of dimension $d$, the number of possible combinations of subcubes is $p(d)$, the number of partitions of the integer $d$. This is an exponential but very slowly growing function, and it is feasible to search over these partitions to discover the best combination for a given message size. The approach was analyzed for, and implemented on, the Intel iPSC-860 circuit-switched hypercube. Measurements show good agreement with predictions and demonstrate that the multiphase approach can substantially improve performance for block sizes in the 0 to 160 byte range.
This range, which corresponds to 0 to 40 floating point numbers per processor, is commonly encountered in practical numeric applications. The multiphase technique is applicable to all circuit-switched hypercubes that use the common e-cube routing strategy
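The search over partitions described above can be sketched as follows. The transmission counts ($2^{d_i} - 1$ per phase) and effective block sizes ($m 2^{d - d_i}$) follow the abstract. The linear cost model in `modeled_time` and its parameters `startup` and `per_byte` are hypothetical stand-ins, and the model ignores the data permutation overhead that the abstract mentions.

```python
from typing import Iterator, List, Optional, Tuple

def partitions(d: int, largest: Optional[int] = None) -> Iterator[List[int]]:
    """Yield every partition of the integer d as a non-increasing list."""
    if largest is None:
        largest = d
    if d == 0:
        yield []
        return
    for part in range(min(d, largest), 0, -1):
        for rest in partitions(d - part, part):
            yield [part] + rest

def phases(d: int, m: int, dims: List[int]) -> List[Tuple[int, int]]:
    """Per phase: (number of transmissions, effective block size)."""
    return [(2 ** di - 1, m * 2 ** (d - di)) for di in dims]

def modeled_time(d: int, m: int, dims: List[int],
                 startup: float, per_byte: float) -> float:
    # Assumed cost model: each transmission pays a fixed startup cost
    # plus a cost proportional to the bytes sent. Not from the source.
    return sum(t * (startup + per_byte * size) for t, size in phases(d, m, dims))

def best_partition(d: int, m: int, startup: float, per_byte: float) -> List[int]:
    """Exhaustively pick the partition of d with the lowest modeled time."""
    return min(partitions(d),
               key=lambda dims: modeled_time(d, m, dims, startup, per_byte))
```

Under this assumed model, a startup-dominated setting favors the all-ones partition (algorithm (1)) and a bandwidth-dominated setting favors the single-part partition (algorithm (2)), with mixed partitions becoming competitive in between.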
Matter Inheritance Symmetries of Spherically Symmetric Static Spacetimes
In this paper we discuss matter inheritance collineations by giving a
complete classification of spherically symmetric static spacetimes by their
matter inheritance symmetries. It is shown that when the energy-momentum tensor
is degenerate, most of the cases yield infinite dimensional matter inheriting
symmetries. It is worth mentioning here that two cases provide finite
dimensional matter inheriting vectors even for the degenerate case. The
non-degenerate case provides finite dimensional matter inheriting symmetries.
We obtain different constraints on the energy-momentum tensor in each case. It
is interesting to note that if the inheriting factor vanishes, matter
inheriting collineations reduce to the matter collineations already available
in the literature. Matter inheritance collineations turn out to play the same
role for the energy-momentum tensor as homotheties and conformal Killing
vectors do for the metric tensor.
Comment: 15 pages, accepted for publication in Int. J. of Mod. Phys.
Efficient algorithms for a class of partitioning problems
The problem of optimally partitioning the modules of chain- or tree-like tasks over chain-structured or host-satellite multiple computer systems is addressed. This important class of problems includes many signal processing and industrial control applications. Prior research has resulted in a succession of faster exact and approximate algorithms for these problems. Polynomial exact and approximate algorithms are described for this class that are better than any of the previously reported algorithms. The approach is based on a preprocessing step that condenses the given chain- or tree-structured task into a monotonic chain or tree. The partitioning of this monotonic task can then be carried out using fast search techniques.
Pre-Service Teachers' Awareness of Using WhatsApp as a Pedagogical Tool for The Practicum Program During COVID-19 Pandemic
This study explored the extent to which pre-service teachers were aware of using WhatsApp effectively as a pedagogical tool for educational purposes during the COVID-19 pandemic. Due to the COVID-19 pandemic outbreak, Saudi Arabia's education system switched to distance learning; accordingly, there was a growing reliance on information and communication technology (ICT) for online teaching and learning. Furthermore, the increased use of social networks such as WhatsApp could have many benefits and consequences for students' learning. Twenty-six female pre-service teachers (PSTs) made a WhatsApp group to collaborate with their peers for eight weeks during the COVID-19 pandemic. Each week, the PSTs were required to share a minimum of three posts, for a total of twenty-four posts. The instructions given to the PSTs focused on sharing useful posts related to their practicum program. A mixed-method research design was used for this study. First, quantitative data were collected by recording the frequency and range of posts to determine the amount of participation. Second, qualitative data were gathered by conducting focus group interviews to understand the reasons behind each PST's participation. Findings revealed that the contribution rate of the entire group was high (77%), with 20 PSTs meeting the minimum required number of posts. Remarkably, these 20 PSTs formed a unique norm of the learning community by regulating their own and their peers' work as well. Only six pre-service teachers did not meet the required number of useful posts. The reasons behind the contributions, more findings, and further research suggestions and recommendations for educational settings are discussed.
Shuffle-exchanges on augmented meshes
A mesh-connected array of size $N = 2^K$, $K$ an integer, can be augmented by adding at most one edge per node such that it can perform a shuffle-exchange of size $N/2$ in constant time. A shuffle-exchange of size $N$ is performed on this augmented array in constant time. This is done by combining the available perfect shuffle of size $N/2$ with the existing nearest neighbor connections of the mesh. By carefully scheduling the different permutations that are composed in order to achieve the shuffle, the time required is reduced to 5 steps, which is optimal for this network.
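For readers unfamiliar with the permutation itself, here is a minimal sketch of the standard shuffle-exchange on $N = 2^K$ indices: the perfect shuffle is a one-bit left rotation of the $K$-bit index, and the exchange flips the lowest bit. It does not model the augmented mesh or the 5-step schedule from the abstract. The function names are chosen for this example only.

```python
from typing import List

def perfect_shuffle(i: int, k: int) -> int:
    """Destination of index i under a perfect shuffle of 2**k elements.

    Equivalent to rotating the k-bit binary representation of i left by one.
    """
    mask = (1 << k) - 1
    return ((i << 1) & mask) | (i >> (k - 1))

def exchange(i: int) -> int:
    """Exchange step: pair each index with its neighbor by flipping bit 0."""
    return i ^ 1

def shuffle_table(k: int) -> List[int]:
    """Destinations of indices 0 .. 2**k - 1 under one perfect shuffle."""
    return [perfect_shuffle(i, k) for i in range(1 << k)]
```

Because the shuffle is a bit rotation, applying it $K$ times in succession returns every index to its starting position.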
Multiphase complete exchange: A theoretical analysis
Complete Exchange requires each of $N$ processors to send a unique message to each of the remaining $N - 1$ processors. For a circuit-switched hypercube with $N = 2^d$ processors, the Direct and Standard algorithms for Complete Exchange are optimal for very large and very small message sizes, respectively. For intermediate sizes, a hybrid Multiphase algorithm is better. This carries out Direct exchanges on a set of subcubes whose dimensions are a partition of the integer $d$. The best such algorithm for a given message size $m$ could hitherto only be found by enumerating all partitions of $d$. The Multiphase algorithm is analyzed assuming a high performance communication network. It is proved that only algorithms corresponding to equipartitions of $d$ (partitions in which the maximum and minimum elements differ by at most 1) can possibly be optimal. The run times of these algorithms plotted against $m$ form a hull of optimality. It is proved that, although there is an exponential number of partitions, (1) the number of faces on this hull is $\Theta(\sqrt{d})$, (2) the hull can be found in $\Theta(\sqrt{d})$ time, and (3) once it has been found, the optimal algorithm for any given $m$ can be found in $\Theta(\log d)$ time. These results provide a very fast technique for minimizing communication overhead in many important applications, such as matrix transpose, Fast Fourier transform, and ADI.
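The equipartition condition is easy to make concrete: for each number of parts $k$ between 1 and $d$ there is exactly one partition of $d$ whose largest and smallest parts differ by at most 1, so the candidate set has only $d$ members. The sketch below enumerates them. It does not construct the hull of optimality or reproduce the paper's $\Theta(\sqrt{d})$ procedure, and the function names are illustrative.

```python
from typing import List

def equipartition(d: int, k: int) -> List[int]:
    """The unique partition of d into k parts with max - min <= 1."""
    if not 1 <= k <= d:
        raise ValueError("k must satisfy 1 <= k <= d")
    q, r = divmod(d, k)
    # r parts of size q + 1 followed by k - r parts of size q.
    return [q + 1] * r + [q] * (k - r)

def equipartitions(d: int) -> List[List[int]]:
    """All equipartitions of d, one for each part count k = 1 .. d."""
    return [equipartition(d, k) for k in range(1, d + 1)]
```

For example, `equipartition(7, 3)` gives `[3, 2, 2]`, while `k = 1` and `k = d` recover the Direct and Standard algorithms respectively.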
Multiphase complete exchange on Paragon, SP2 and CS-2
The overhead of interprocessor communication is a major factor in limiting the performance of parallel computer systems. The complete exchange is the severest communication pattern in that it requires each processor to send a distinct message to every other processor. This pattern is at the heart of many important parallel applications. On hypercubes, multiphase complete exchange has been developed and shown to provide optimal performance over varying message sizes. Most commercial multicomputer systems do not have a hypercube interconnect. However, they use special purpose hardware and dedicated communication processors to achieve very high performance communication and can be made to emulate the hypercube quite well. Multiphase complete exchange has been implemented on three contemporary parallel architectures: the Intel Paragon, IBM SP2 and Meiko CS-2. The essential features of these machines are described and their basic interprocessor communication overheads are discussed. The performance of multiphase complete exchange is evaluated on each machine. It is shown that the theoretical ideas developed for hypercubes are also applicable in practice to these machines and that multiphase complete exchange can lead to major savings in execution time over traditional solutions
The Linux operating system: An introduction
Linux is a Unix-like operating system for Intel 386/486/Pentium based IBM-PCs and compatibles. The kernel of this operating system was written from scratch by Linus Torvalds and, although copyrighted by the author, may be freely distributed. A world-wide group has collaborated in developing Linux on the Internet. Linux can run the powerful set of compilers and programming tools of the Free Software Foundation, and XFree86, a port of the X Window System from MIT. Most capabilities associated with high performance workstations, such as networking, shared file systems, electronic mail, TeX, LaTeX, etc. are freely available for Linux. It can thus transform cheap IBM-PC compatible machines into Unix workstations with considerable capabilities. The author explains how Linux may be obtained, installed and networked. He also describes some interesting applications for Linux that are freely available. The enormous consumer market for IBM-PC compatible machines continually drives down prices of CPU chips, memory, hard disks, CDROMs, etc. Linux can convert such machines into powerful workstations that can be used for teaching, research and software development. For professionals who use Unix based workstations at work, Linux permits virtually identical working environments on their personal home machines. For cost conscious educational institutions Linux can create world-class computing environments from cheap, easily maintained, PC clones. Finally, for university students, it provides an essentially cost-free path away from DOS into the world of Unix and X Windows