915 research outputs found

    Efficient Multicast in Heterogeneous Networks of Workstations

    Get PDF
    This paper studies the problem of efficient multicast in heterogeneous networks of workstations (HNOWs) using a parameterized communication model [3]. This model associates a sending overhead and a receiving overhead with each node as well as a network latency parameter. The problem of finding optimal multicasts in this model is known to be NP-complete in the strong sense. Nevertheless, we show that for two different properties that arise in typical HNOWs, provably near-optimal and optimal solutions, respectively, can be found in polynomial time. Specifically, we show the following two results: When the ratios of receiving overhead to sending overhead among the nodes is bounded by constants, solutions within a bounded ratio of optimal can be found in time O(n log n). Secondly, if the number of distinct types of workstations is fixed then optimal solutions can be found in polynomial time. These results provide a practical means of finding optimal and provably near-optimal multicast schedules in a large class of frequently occurring heterogeneous networks of workstations

    The ISIS Project: Real Experience with a Fault Tolerant Programming System

    Get PDF
    The ISIS project has developed a distributed programming toolkit and a collection of higher level applications based on these tools. ISIS is now in use at more than 300 locations world-wise. The lessons (and surprises) gained from this experience with the real world are discussed

    Parallel matrix multiplication on heterogeneous networks of workstations

    Get PDF
    Matrix multiplication is taken as a test bed for parallel processing on heterogeneous networks of workstations (local area networks) used as parallel machines. Two algorithms are proposed taking into account the specific kind of parallel hardware provided by local area networks, and experimentation is used to drive the evaluation and identification of possible performance loss. A specific broadcast communication between processes of a parallel application is also proposed, taking advantage of the Ethernet interconnection network to achieve optimized performance. A special emphasis is place on already installed networks of workstations, which provide a hardware zero cost parallel computer; but a homogeneous Beowulf-class system is used to show how the algorithms are also useful on current classical high performance parallel computing with clusters.Eje: LenguajesRed de Universidades con Carreras en Informática (RedUNCI

    Advanced Message Routing for Scalable Distributed Simulations

    Get PDF
    The Joint Forces Command (JFCOM) Experimentation Directorate (J9)'s recent Joint Urban Operations (JUO) experiments have demonstrated the viability of Forces Modeling and Simulation in a distributed environment. The JSAF application suite, combined with the RTI-s communications system, provides the ability to run distributed simulations with sites located across the United States, from Norfolk, Virginia to Maui, Hawaii. Interest-aware routers are essential for communications in the large, distributed environments, and the current RTI-s framework provides such routers connected in a straightforward tree topology. This approach is successful for small to medium sized simulations, but faces a number of significant limitations for very large simulations over high-latency, wide area networks. In particular, traffic is forced through a single site, drastically increasing distances messages must travel to sites not near the top of the tree. Aggregate bandwidth is limited to the bandwidth of the site hosting the top router, and failures in the upper levels of the router tree can result in widespread communications losses throughout the system. To resolve these issues, this work extends the RTI-s software router infrastructure to accommodate more sophisticated, general router topologies, including both the existing tree framework and a new generalization of the fully connected mesh topologies used in the SF Express ModSAF simulations of 100K fully interacting vehicles. The new software router objects incorporate the scalable features of the SF Express design, while optionally using low-level RTI-s objects to perform actual site-to-site communications. The (substantial) limitations of the original mesh router formalism have been eliminated, allowing fully dynamic operations. The mesh topology capabilities allow aggregate bandwidth and site-to-site latencies to match actual network performance. The heavy resource load at the root node can now be distributed across routers at the participating sites

    Scalable and Reliable File Transfer for Clusters Using Multicast.

    Get PDF
    A cluster is a group of computing resources that are connected by a single computer network and are managed as a single system. Clusters potentially have three key advantages over workstations operated in isolation—fault tolerance, load balancing and support for distributed computing. Information sharing among the cluster’s resources affects all phases of cluster administration. The thesis describes a new tool for distributing files within clusters. This tool, the Scalable and Reliable File Transfer Tool (SRFTT), uses Forward Error Correction (FEC) and multiple multicast channels to achieve an efficient reliable file transfer, relative to heterogeneous clusters. SRFTT achieves scalability by avoiding feedback from the receivers. Tests show that, for large files, retransmitting recovery information on multiple multicast channels gives significant performance gains when compared to a single retransmission channel

    A Policy-Based Resource Brokering Environment for Computational Grids

    Get PDF
    With the advances in networking infrastructure in general, and the Internet in particular, we can build grid environments that allow users to utilize a diverse set of distributed and heterogeneous resources. Since the focus of such environments is the efficient usage of the underlying resources, a critical component is the resource brokering environment that mediates the discovery, access and usage of these resources. With the consumer\u27s constraints, provider\u27s rules, distributed heterogeneous resources and the large number of scheduling choices, the resource brokering environment needs to decide where to place the user\u27s jobs and when to start their execution in a way that yields the best performance for the user and the best utilization for the resource provider. As brokering and scheduling are very complicated tasks, most current resource brokering environments are either specific to a particular grid environment or have limited features. This makes them unsuitable for large applications with heterogeneous requirements. In addition, most of these resource brokering environments lack flexibility. Policies at the resource-, application-, and system-levels cannot be specified and enforced to provide commitment to the guaranteed level of allocation that can help in attracting grid users and contribute to establishing credibility for existing grid environments. In this thesis, we propose and prototype a flexible and extensible Policy-based Resource Brokering Environment (PROBE) that can be utilized by various grid systems. In designing PROBE, we follow a policy-based approach that provides PROBE with the intelligence to not only match the user\u27s request with the right set of resources, but also to assure the guaranteed level of the allocation. PROBE looks at the task allocation as a Service Level Agreement (SLA) that needs to be enforced between the resource provider and the resource consumer. The policy-based framework is useful in a typical grid environment where resources, most of the time, are not dedicated. In implementing PROBE, we have utilized a layered architecture and façade design patterns. These along with the well-defined API, make the framework independent of any architecture and allow for the incorporation of different types of scheduling algorithms, applications and platform adaptors as the underlying environment requires. We have utilized XML as a base for all the specification needs. This provides a flexible mechanism to specify the heterogeneous resources and user\u27s requests along with their allocation constraints. We have developed XML-based specifications by which high-level internal structures of resources, jobs and policies can be specified. This provides interoperability in which a grid system can utilize PROBE to discover and use resources controlled by other grid systems. We have implemented a prototype of PROBE to demonstrate its feasibility. We also describe a test bed environment and the evaluation experiments that we have conducted to demonstrate the usefulness and effectiveness of our approach

    Mobile object location discovery in unpredictable environments

    Get PDF
    Emerging mobile and ubiquitous computing environments present hard challenges to software engineering. The use of mobile code has been suggested as a natural fit for simplifing software development for these environments. However, the task of discovering mobile code location becomes a problem in unpredictable environments when using existing strategies, designed with fixed and relatively stable networks in mind. This paper introduces AMOS, a mobile code platform augmented with a structured overlay network. We demonstrate how the location discovery strategy of AMOS has better reliability and scalability properties than existing approaches, with minimal communication overhead. Finally, we demonstrate how AMOS can provide autonomous distribution of effort fairly throughout a network using probabilistic methods that requires no global knowledge of host capabilities
    • …
    corecore