83 research outputs found

    Breaking the Barrier Of 2 for the Competitiveness of Longest Queue Drop

    We consider the problem of managing the buffer of a shared-memory switch that transmits packets of unit value. A shared-memory switch consists of an input port, a number of output ports, and a buffer with a specific capacity. In each time step, an arbitrary number of packets arrive at the input port, each packet designated for one output port. Each packet is added to the queue of the respective output port. If the total number of packets exceeds the capacity of the buffer, some packets have to be irrevocably rejected. At the end of each time step, each output port transmits a packet in its queue and the goal is to maximize the number of transmitted packets. The Longest Queue Drop (LQD) online algorithm accepts any arriving packet to the buffer. However, if this results in the buffer exceeding its memory capacity, then LQD drops a packet from the back of whichever queue is currently the longest, breaking ties arbitrarily. The LQD algorithm was first introduced in 1991, and is known to be 2-competitive since 2001. Although LQD remains the best known online algorithm for the problem and is of practical interest, determining its true competitiveness is a long-standing open problem. We show that LQD is 1.707-competitive, establishing the first (2-ε) upper bound for the competitive ratio of LQD, for a constant ε > 0
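
The LQD rule described in this abstract can be illustrated with a minimal sketch of one time step. This is not the authors' code: the function name `lqd_step`, the representation of queues as a list of deques, and the choice to break ties in favour of the lowest-numbered port are all assumptions made for illustration.

```python
from collections import deque


def lqd_step(queues, arrivals, capacity):
    """Simulate one time step of Longest Queue Drop (illustrative sketch).

    queues   -- list of deques, one per output port (hypothetical layout)
    arrivals -- list of output-port indices, one entry per arriving packet
    capacity -- total number of packets the shared buffer can hold
    Returns (transmitted, dropped) for this step.
    """
    dropped = 0
    for port in arrivals:
        # LQD accepts every arriving packet into its port's queue first.
        queues[port].append(port)
        # If the shared buffer now overflows, drop from the back of the
        # longest queue. max() returns the first longest queue, which is
        # one arbitrary (assumed) way of breaking ties.
        if sum(len(q) for q in queues) > capacity:
            max(queues, key=len).pop()
            dropped += 1

    # At the end of the step, every non-empty queue transmits one packet.
    transmitted = 0
    for q in queues:
        if q:
            q.popleft()
            transmitted += 1
    return transmitted, dropped


queues = [deque(), deque()]
result = lqd_step(queues, arrivals=[0, 0, 0, 1], capacity=3)
```

In this usage example the fourth arrival overflows a buffer of capacity 3, so one packet is dropped from the back of port 0's queue (the longest), and both ports then transmit one packet each.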

    Performance mapping of a class of fully decoupled architecture


    Scheduling Architectures for DiffServ Networks with Input Queuing Switches

    Due to its simplicity and scalability, the differentiated services (DiffServ) model is expected to be widely deployed across wired and wireless networks. Though DiffServ-supporting scheduling algorithms for output-queuing (OQ) switches have been widely studied, there are few DiffServ scheduling algorithms for input-queuing (IQ) switches in the literature. In this paper, we propose two algorithms for scheduling DiffServ networks with IQ switches: the dynamic DiffServ scheduling (DDS) algorithm and the hierarchical DiffServ scheduling (HDS) algorithm. The basic idea of DDS and HDS is to schedule EF and AF traffic according to their minimum service rates with the reserved bandwidth and schedule AF and BE traffic fairly with the excess bandwidth. Both DDS and HDS find a maximal weight matching but in different ways. DDS employs a centralized scheduling scheme. HDS features a hierarchical scheduling scheme that consists of two levels of schedulers: the central scheduler and port schedulers. Using such a hierarchical scheme, the implementation complexity and the amount of information that needs to be transmitted between input ports and the central scheduler for HDS are dramatically reduced compared with DDS. Through simulations, we show that both DDS and HDS guarantee a minimum bandwidth for EF and AF traffic, as well as fair bandwidth allocation for BE traffic. The delay and jitter performance of DDS is close to that of PQWRR, an existing DiffServ-supporting scheduling algorithm for OQ switches. The tradeoff of the simpler implementation scheme of HDS is its slightly worse delay performance compared with DDS

    Design of a distributed memory unit for clustered microarchitectures

    Power constraints led to the end of exponential growth in single-processor performance, which characterized the semiconductor industry for many years. Single-chip multiprocessors allowed the performance growth to continue so far. Yet, Amdahl's law asserts that the overall performance of future single-chip multiprocessors will depend crucially on single-processor performance. In a multiprocessor, a small growth in single-processor performance can justify the use of significant resources. Partitioning the layout of critical components can improve the energy efficiency and ultimately the performance of a single processor. In a clustered microarchitecture, parts of these components form clusters. Instructions are processed locally in the clusters and benefit from the smaller size and complexity of the clusters' components. Because the clusters together process a single instruction stream, communications between clusters are necessary and introduce an additional cost. This thesis proposes the design of a distributed memory unit and first-level cache in the context of a clustered microarchitecture. While the partitioning of other parts of the microarchitecture has been well studied, the distribution of the memory unit and the cache has received comparatively little attention. The first proposal consists of a set of cache bank predictors. Eight different predictor designs are compared based on cost and accuracy. The second proposal is the distributed memory unit. The load and store queues are split into smaller queues for distributed disambiguation. The mapping of memory instructions to cache banks is delayed until addresses have been calculated. We show how disambiguation can be implemented efficiently with unordered queues. A bank predictor is used to map instructions that consume memory data near the data origin. We show that this organization significantly reduces both energy usage and latency. The third proposal introduces Dispatch Throttling and Pre-Access Queues. These mechanisms avoid load/store queue overflows that are a result of the late allocation of entries. The fourth proposal introduces Memory Issue Queues, which add functionality to select instructions for execution and re-execution to the memory unit. The fifth proposal introduces Conservative Deadlock Aware Entry Allocation. This mechanism is a deadlock-safe issue policy for the Memory Issue Queues. Deadlocks can result from certain queue allocations because entries are allocated out of order instead of in order as in traditional architectures. The sixth proposal is the Early Release of Load Queue Entries. Architectures with weak memory ordering such as Alpha, PowerPC or ARMv7 can take advantage of this mechanism to release load queue entries before the commit stage. Together, these proposals allow significantly smaller and more energy-efficient load queues without the need for energy-hungry recovery mechanisms and without performance penalties. Finally, we present a detailed study that compares the proposed distributed memory unit to a centralized memory unit and confirms its advantages of reduced energy usage and improved performance

    Workforce minimization for a mixed-model assembly line in the automotive industry

    A paced assembly line consisting of several workstations is considered. This line is intended to assemble products of different types. The sequence of products is given. The sequence of technological tasks is common for all types of products. The assignment of tasks to the stations and the task sequence on each station are known and cannot be modified, and they do not depend on the product type. Tasks assigned to the same station are performed sequentially. The processing time of a task depends on the number of workers performing this task. Workers are identical and versatile. If a worker is assigned to a task, he/she works on this task from its start until completion. Workers can switch between the stations at the end of each task, and the time needed by any worker to move from one station to another can be neglected. At the line design stage, it is necessary to know how many workers the line requires. To answer this question, we consider each possible takt and assign workers to tasks so that the total number of workers is minimized, provided that a given takt time is satisfied. The maximum over all takts of these minimal numbers of workers is taken as the necessary number of workers for the line. Thus, the problem is to assign workers to tasks for a takt. We prove that this problem is NP-hard in the strong sense, develop an integer linear programming formulation to solve it, and propose conventional and randomized heuristics

    Optimal resource allocation algorithms for cloud computing

    Cloud computing is emerging as an important platform for business, personal and mobile computing applications. We consider a stochastic model of a cloud computing cluster, where jobs arrive according to a random process and request virtual machines (VMs), which are specified in terms of resources such as CPU, memory and storage space. The jobs are first routed to one of the servers when they arrive and are queued at the servers. Each server then chooses a set of jobs from its queues so that it has enough resources to serve all of them simultaneously. There are many design issues associated with such systems. One important issue is the resource allocation problem, i.e., the design of algorithms for load balancing among servers, and algorithms for scheduling VM configurations. Given our model of a cloud, we define its capacity, i.e., the maximum rates at which jobs can be processed in such a system. An algorithm is said to be throughput-optimal if it can stabilize the system whenever the load is within the capacity region. We show that the widely used Best-Fit scheduling algorithm is not throughput-optimal. We first consider the problem where the jobs need to be scheduled nonpreemptively on servers. Under the assumptions that the job sizes are known and bounded, we present algorithms that achieve any arbitrary fraction of the capacity region of the cloud. We then relax these assumptions and present a load balancing and scheduling algorithm that is throughput-optimal when job sizes are unknown. In this case, job sizes (durations) are modeled as random variables with possibly unbounded support. Delay is a more important metric than throughput optimality in practice. However, analysis of delay of resource allocation algorithms is difficult, so we study the system in the asymptotic limit as the load approaches the boundary of the capacity region. This limit is called the heavy traffic regime. Assuming that the jobs can be preempted once after several time slots, we present delay-optimal resource allocation algorithms in the heavy traffic regime. We study delay performance of our algorithms through simulations

    Developing an Intelligent User Manager System for controlling Smart School Network Resources

    This paper presents an Intelligent User Manager System (UMAS) for controlling access to network resources in a Smart School network. Network resources, especially in a Smart School, are in short supply and relatively expensive to acquire, therefore a control mechanism should be in place so that available resources can be allocated for legitimate usages only. An intelligent mechanism using Fuzzy Logic is deployed for the purpose of knowledge learning in order to process all the user requests accordingly. A decision of granting a network resource request needs to be based on several data sets that represent the current network state, transmission state and users. The system is analysed and designed using the Tropos Methodology. Tropos was chosen because it covers four stages of development. The proposed system was modelled using Fuzzy Logic algorithms for simulation purposes in order to find the relationship between two fuzzy sets with the computed allocated time

    DEVELOPMENT AND APPLICATION OF SCALABLE TWO-DIMENSIONAL HIGH PERFORMANCE LIQUID CHROMATOGRAPHY FOR AROMATIC SELECTIVE SEPARATIONS OF BIOLOGICALLY ACTIVE PHYTOCHEMICALS FROM OPLOPANAX HORRIDUS

    The high demand for more efficient purification processes with increased automation and throughput pushes the development of more advanced preparative, pilot, and process scale HPLC instrumentation that is capable of achieving higher purities in a shorter amount of time than are currently achieved using one-dimensional separations. A preparative scale 2D HPLC system was designed and reduced to practice in order to demonstrate the capacity for scalability of on-line comprehensive 2D HPLC separations of basic compounds from a challenging natural product extract of Oplopanax horridus. The methodology and instrumentation design herein permit direct method transfer from analytical to preparative scale purifications to alleviate resolution and throughput problems with traditional reversed phase separations. The incorporation of aromatic selective phases (C6-phenyl and biphenyl) increases the resolution of a two-dimensional HPLC system that follows a hydrophobic subtraction approach to achieve orthogonality between dimensions. The two-dimensional separations herein demonstrate the utility and application of long columns (250 mm) packed with 5 µm fully and superficially porous particles, which enable direct scalability of the HPLC separations from analytical to preparative scale and beyond, where the instrumentation limitations of 5,000 psi are already factored in. This approach enables equipment already in place in either a lab or a processing environment to be retrofitted with a modulation mechanism to incorporate multidimensional chromatography without the capital investment of entirely new instrumentation, resulting in substantial cost savings. Utilizing a hyphenated purification approach, a 2D HPLC-ESI-MS system incorporating a C6-phenyl and biphenyl in the first and second dimension respectively was able to successfully resolve more than 90% of detected analytes with a resolution of 1.0 or more in 11 distinct subfractions of the ethyl acetate liquid-liquid extraction layer of the crude methanol extract of O. horridus. The incorporation of pi selective phases such as C6-phenyl and biphenyl offers increased selectivity for aromatic molecules and was demonstrated as a powerful screening phase. Furthermore, using a stronger retaining phase in the second dimension allowed large (67% of the void volume) loops to be used for transferring eluent from the first to the second dimension without breakthrough