
    Multiprocessing the Sieve of Eratosthenes

    In recent years the Sieve of Eratosthenes for finding prime numbers has seen much use as a benchmark algorithm for serial computers, while its intrinsically parallel nature has gone largely unnoticed. The implementation of a parallel version of this algorithm for a real parallel computer, the Flex/32, is described and its performance discussed. It is shown that the algorithm is sensitive to several fundamental performance parameters of parallel machines, such as spawning time, signaling time, memory access, and overhead of process switching. Because of the nature of the algorithm, it is impossible to get any speedup beyond 4 or 5 processors unless some form of dynamic load balancing is employed. We describe the performance of our algorithm with and without load balancing and compare it with theoretical lower bounds and simulated results. It is straightforward to understand this algorithm and to check the final results. However, its efficient implementation on a real parallel machine requires thoughtful design, especially if dynamic load balancing is desired. The fundamental operations required by the algorithm are very simple, which means that the slightest overhead appears prominently in performance data. The Sieve thus serves not only as a very severe test of the capabilities of a parallel processor but also as an interesting challenge for the programmer.
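The parallel structure mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the Flex/32 implementation described in the paper: the function names, the `workers` parameter, and the static split into equal segments are assumptions made for illustration, and the segments are processed in a plain loop rather than by real processes.

```python
from math import isqrt


def base_primes(limit):
    """Serial sieve: return all primes <= limit."""
    if limit < 2:
        return []
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0:2] = b"\x00\x00"
    for p in range(2, isqrt(limit) + 1):
        if is_prime[p]:
            # Strike out multiples of p, starting at p*p.
            count = len(range(p * p, limit + 1, p))
            is_prime[p * p :: p] = bytearray(count)
    return [i for i, flag in enumerate(is_prime) if flag]


def sieve_segment(lo, hi, primes):
    """Return the primes in [lo, hi), given all primes <= sqrt(hi - 1)."""
    flags = bytearray([1]) * (hi - lo)
    for p in primes:
        if p * p >= hi:
            break
        # First multiple of p inside the segment that is not p itself.
        start = max(p * p, ((lo + p - 1) // p) * p)
        for m in range(start, hi, p):
            flags[m - lo] = 0
    return [lo + i for i, f in enumerate(flags) if f and lo + i >= 2]


def segmented_primes(n, workers=4):
    """Primes <= n, computed over `workers` independent segments (hypothetical split)."""
    primes = base_primes(isqrt(n))
    size = -(-(n + 1) // workers)  # ceiling division
    found = []
    for w in range(workers):
        lo, hi = w * size, min((w + 1) * size, n + 1)
        if lo < hi:
            # Each call touches only its own segment, so the calls are independent.
            found.extend(sieve_segment(lo, hi, primes))
    return found
```

Because each segment depends only on the shared list of small primes, the calls to `sieve_segment` could in principle be handed to separate workers. The equal-sized static split used here is the simplest choice and does nothing to balance load dynamically, which the abstract identifies as necessary for speedup beyond 4 or 5 processors.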

    Partitioning problems in parallel, pipelined and distributed computing

    The problem of optimally assigning the modules of a parallel program over the processors of a multiple computer system is addressed. A Sum-Bottleneck path algorithm is developed that permits the efficient solution of many variants of this problem under some constraints on the structure of the partitions. In particular, the following problems are solved optimally for a single-host, multiple-satellite system: partitioning multiple chain-structured parallel programs, multiple arbitrarily structured serial programs, and single tree-structured parallel programs. In addition, the problems of partitioning chain-structured parallel programs across chain-connected systems and across shared memory (or shared bus) systems are also solved under certain constraints. All solutions for parallel programs are equally applicable to pipelined programs. These results extend prior research in this area by explicitly taking concurrency into account and permit the efficient utilization of multiple computer architectures for a wide range of problems of practical interest.

    A network flow model for load balancing in circuit-switched multicomputers

    In multicomputers that utilize circuit switching or wormhole routing, communication overhead depends largely on link contention; the variation due to distance between nodes is negligible. This has a major impact on the load balancing problem. In this case, there are some nodes with excess load (sources) and others with deficit load (sinks), and it is required to find a matching of sources to sinks that avoids contention. The problem is made complex by the hardwired routing on currently available machines: the user can control only which nodes communicate but not how the messages are routed. Network flow models of message flow in the mesh and the hypercube were developed to solve this problem. The crucial property of these models is the correspondence between minimum cost flows and correctly routed messages. To solve a given load balancing problem, a minimum cost flow algorithm is applied to the network. This permits one to determine efficiently a maximum contention-free matching of sources to sinks which, in turn, tells one how much of the given imbalance can be eliminated without contention.

    Multiphase complete exchange on a circuit switched hypercube

    On a distributed memory parallel computer, the complete exchange (all-to-all personalized) communication pattern requires each of $n$ processors to send a different block of data to each of the remaining $n - 1$ processors. This pattern is at the heart of many important algorithms, most notably the matrix transpose. For a circuit switched hypercube of dimension $d$ ($n = 2^d$), two algorithms for achieving complete exchange are known. These are (1) the Standard Exchange approach, which employs $d$ transmissions of $2^{d-1}$ blocks each and is useful for small block sizes, and (2) the Optimal Circuit Switched algorithm, which employs $2^d - 1$ transmissions of 1 block each and is best for large block sizes. A unified multiphase algorithm is described that includes these two algorithms as special cases. The complete exchange on a hypercube of dimension $d$ and block size $m$ is achieved by carrying out $k$ partial exchanges on subcubes of dimension $d_i$, where $\sum_{i=1}^{k} d_i = d$, with effective block size $m_i = m\,2^{d-d_i}$. When $k = d$ and all $d_i = 1$, this corresponds to algorithm (1) above. For the case of $k = 1$ and $d_1 = d$, this becomes the circuit switched algorithm (2). Changing the subcube dimensions $d_i$ varies the effective block size and permits a compromise between the data permutation and block transmission overhead of (1) and the startup overhead of (2). For a hypercube of dimension $d$, the number of possible combinations of subcubes is $p(d)$, the number of partitions of the integer $d$. This is an exponential but very slowly growing function, and it is feasible to search over these partitions to discover the best combination for a given message size. The approach was analyzed for, and implemented on, the Intel iPSC-860 circuit switched hypercube. Measurements show good agreement with predictions and demonstrate that the multiphase approach can substantially improve performance for block sizes in the 0 to 160 byte range. This range, which corresponds to 0 to 40 floating point numbers per processor, is commonly encountered in practical numeric applications. The multiphase technique is applicable to all circuit-switched hypercubes that use the common e-cube routing strategy.
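The search space described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the effective block size follows the formula in the abstract (block size m scaled by 2 to the power d minus d_i), and the per-phase transmission count of 2^(d_i) - 1 is an inference that reproduces both special cases, not something the abstract states. No timing model is included, since the abstract gives no machine parameters.

```python
def partitions(d, largest=None):
    """Yield every partition of the integer d as a non-increasing tuple."""
    if largest is None:
        largest = d
    if d == 0:
        yield ()
        return
    for first in range(min(d, largest), 0, -1):
        for rest in partitions(d - first, first):
            yield (first,) + rest


def phases(d, m, dims):
    """Describe each phase as (d_i, transmissions, effective block size).

    dims must sum to d. The transmission count 2**d_i - 1 per phase is an
    assumption consistent with the two special cases in the abstract.
    """
    if sum(dims) != d:
        raise ValueError("subcube dimensions must sum to d")
    return [(di, 2**di - 1, m * 2 ** (d - di)) for di in dims]


if __name__ == "__main__":
    d, m = 4, 8  # example values: dimension-4 cube, 8-byte blocks
    for dims in partitions(d):
        print(dims, phases(d, m, dims))
```

With all subcube dimensions equal to 1 this yields d phases of one transmission each, carrying 2^(d-1) blocks, and with a single subcube of dimension d it yields 2^d - 1 transmissions of one block each, matching algorithms (1) and (2). Selecting the best partition would additionally require a measured cost per transmission and per byte, which is left out here.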

    QuaDMutEx: quadratic driver mutation explorer

    Background: Somatic mutations accumulate in human cells throughout life. Some may have no adverse consequences, but others may lead to cancer. A cancer genome is typically unstable, and thus more mutations can accumulate in the DNA of cancer cells. An ongoing problem is to determine which mutations are drivers, which play a role in oncogenesis, and which are passengers, which do not. One way of addressing this question is through inspection of somatic mutations in the DNA of cancer samples from a cohort of patients and detection of patterns that differentiate driver from passenger mutations. Results: We propose QuaDMutEx, a method that incorporates three novel elements: a new gene set penalty that includes non-linear penalization of multiple mutations in putative sets of driver genes, an ability to adjust the method to handle slow- and fast-evolving tumors, and a computationally efficient method for finding gene sets that minimize the penalty, through a combination of heuristic Monte Carlo optimization and exact binary quadratic programming. Compared to existing methods, the proposed algorithm finds sets of putative driver genes that show higher coverage and lower excess coverage in eight sets of cancer samples coming from brain, ovarian, lung, and breast tumors. Conclusions: The superior ability to improve on both coverage and excess coverage across different types of cancer shows that QuaDMutEx is a tool that should be part of a state-of-the-art toolbox in the driver gene discovery pipeline. It can detect genes harboring rare driver mutations that may be missed by existing methods. QuaDMutEx is available for download from https://github.com/bokhariy/QuaDMutEx under the GNU GPLv3 license.

    Matter Inheritance Symmetries of Spherically Symmetric Static Spacetimes

    In this paper we discuss matter inheritance collineations by giving a complete classification of spherically symmetric static spacetimes by their matter inheritance symmetries. It is shown that when the energy-momentum tensor is degenerate, most of the cases yield infinite dimensional matter inheriting symmetries. It is worth mentioning here that two cases provide finite dimensional matter inheriting vectors even for the degenerate case. The non-degenerate case provides finite dimensional matter inheriting symmetries. We obtain different constraints on the energy-momentum tensor in each case. It is interesting to note that if the inheriting factor vanishes, matter inheriting collineations reduce to the matter collineations already available in the literature. Matter inheritance collineations thus play the same role for the energy-momentum tensor as homotheties and conformal Killing vectors play for the metric tensor.
    Comment: 15 pages, accepted for publication in Int. J. of Mod. Phys.

    Moderations among Salafists & Jihadists


    Returns to Specialization, Transaction Costs, and the Dynamics of Industry Evolution

    When more than one component or activity is needed to produce the final product, a firm may use proprietary standards or adopt a common standard to integrate these components. We call these closed and open firms respectively, and develop a model of industry evolution to study the process by which one type of firm comes to dominate the industry. Our simulations show that an industry may diverge from its long run equilibrium configuration for sustained periods of time. Typically, the industry is dominated by closed firms in the early history and by open firms later on. Entry and exit dynamics create transient biases in favor of open firms. First, a closed entrant can capture multiple profits whereas an open entrant faces a lower entry barrier. However, while the odds of closed entry (relative to open entry) are initially greater than one, they decrease with price and eventually open entry becomes more likely than closed entry. Second, though closed firms can initially offset losses in one component with profits from another and thereby survive better than open firms, when prices fall below a threshold level a closed firm is more likely to exit than a comparable pair of open firms. Finally, entry by an open firm improves the relative odds of entry by a complementary open firm, especially when the two complementary sectors differ in size or efficiency.
    Keywords: Vertical Integration, Externalities, Positive Feedback, Industry Evolution, Transaction Costs, Simulation Models

    THE OPTIMAL DEMAND FOR FOREIGN EXCHANGE RESERVES IN PAKISTAN

    Using monthly data on foreign exchange reserves from June 1995 through June 2005, we find that, in line with other country-specific studies, the opportunity cost of holding reserves played a greater role than reserve volatility in determining the level of reserves in Pakistan. Our finding is in contrast with the hypothesis of increased capital mobility that is commonly set forth in explaining the precautionary motive for reserve holdings. As also pointed out by Ramachandran (2004), this result could perhaps be attributed to the fact that capital outflow in Pakistan (as also in India) is not as free as capital inflow and that a large part of the recent reserve accumulation is due to non-debt reserve inflows.
    Keywords: Foreign Exchange Reserves, Optimal Demand, GARCH, Pakistan