208 research outputs found

    A Graph based approach for Co-scheduling jobs on Multi-core computers

    Get PDF
    In a multicore processor system, running multiple applications on different cores in the same chip could cause resource contention, which leads to performance degradation. Recent studies have shown that job co-scheduling can effectively reduce the contention. However, most existing co-schedulers do not aim to find the optimal co-scheduling solution. It is very useful to know the optimal co-scheduling performance so that the system and scheduler designers can know how much room there is for further performance improvement. Moreover, most co-schedulers only consider serial jobs, and do not take parallel jobs into account. This paper aims to tackle the above issues. In this paper, we first present a new approach to modelling the problem of co-scheduling both parallel and serial jobs. Further, a method is developed to find the optimal co-scheduling solutions. The simulation results show that compare to the method that only considers serial jobs, our developed method to co-schedule parallel jobs can improve the performance by 31% on average

    Developing graph-based co-scheduling algorithms on multicore computers

    Get PDF
    It is common that multiple cores reside on the same chip and share the on-chip cache. As a result, resource sharing can cause performance degradation of co-running jobs.Job co-scheduling is a technique that can effectively alleviate this contention and many co-schedulers have been reported in related literature. Most solutions however do not aim to find the optimal co-scheduling solution. Being able to determine the optimal solution is critical for evaluating co-scheduling systems. Moreover, most co-schedulers only consider serial jobs, and there often exist both parallel and serial jobs in real-world systems. In this paper a graph-based method is developed to find the optimal co-scheduling solution for serial jobs; the method is then extended to incorporate parallel jobs, including multi-process, and multithreaded parallel jobs. A number of optimization measures are also developed to accelerate the solving process. Moreover, a flexible approximation technique is proposed to strike a balance between the solving speed and the solution quality. Extensive experiments are conducted to evaluate the effectiveness of the proposed co-scheduling algorithms. The results show that the proposed algorithms can find the optimal co-scheduling solution for both serial and parallel jobs. The proposed approximation technique is also shown to be flexible in the sense that we can control the solving speed by setting the requirement for the solution quality

    PENGARUH DUKUNGAN SOSIAL ORANG TUA DAN TEMAN SEBAYA TERHADAP KEPERCAYAAN DIRI REMAJA

    Get PDF
    Banda Ace

    WolfGraph : the edge-centric graph processing on GPU

    Get PDF
    There is the significant interest nowadays in developing the frameworks for parallelizing the processing of large graphs such as social networks, web graphs, etc. The work has been proposed to parallelize the graph processing on clusters (distributed memory), multicore machines (shared memory) and GPU devices. Most existing research on GPU-based graph processing employs the vertex-centric processing model and the Compressed Sparse Row (CSR) form to store and process a graph. However, they suffer from irregular memory access and load imbalance in GPU, which hampers the full exploitation of GPU performance. In this paper, we present WolfGraph, a GPU-based graph processing framework that addresses the above problems. WolfGraph adopts the edge-centric processing, which iterates over the edges rather than vertices. The data structure and graph partition in WolfGraph are carefully crafted so as to minimize the graph pre-processing and allow the coalesced memory access. WolfGraph fully utilizes the GPU power by processing all edges in parallel. We also develop a new method, called Concatenated Edge List (CEL), to process a graph that is bigger than the global memory of GPU. WolfGraph allows the users to define their own graph-processing methods and plug them into the WolfGraph framework. Our experiments show that WolfGraph achieves 7-8x speedup over GraphChi and X-Stream when processing large graphs, and it also offers 65% performance improvement over the existing GPU-based, vertex-centric graph processing frameworks, such as Gunrock

    Optical polarization rogue waves from supercontinuum generation in zero dispersion fiber pumped by dissipative soliton

    Get PDF
    Optical rogue waves emerge in nonlinear optical systems with extremely large amplitudes, and leave without a trace. In this work, we reveal the emergence of optical polarization rogue waves in supercontinuum generation from a zero-dispersion fiber, pumped by a dissipative soliton laser. Flat spectral broadening is achieved by modulation instability, followed by cascaded four-wave-mixing. In this process, we identify the emergence of optical polarization rogue waves, based on the probability density function of the relative distance among polarization states. Experimental results show that optical polarization rogue waves originate from vector multi-wave-mixing. Besides, we observe double peaks, and even triple peaks in the histogram of the state of polarization. This is a new and intriguing property, never observed so far in optical rogue waves, for example those emerging in the statistics of pulse intensities. Our polarization domain statistical analysis provides a new insight into the still debated topic of the mechanism for rogue wave generation in optical supercontinuum

    Developing a pattern discovery method in time series data and its GPU acceleration

    Get PDF
    The Dynamic Time Warping (DTW) algorithm is widely used in finding the global alignment of time series. Many time series data mining and analytical problems can be solved by the DTW algorithm. However, using the DTW algorithm to find similar subsequences is computationally expensive or unable to perform accurate analysis. Hence, in the literature, the parallelisation technique is used to speed up the DTW algorithm. However, due to the nature of DTW algorithm, parallelising this algorithm remains an open challenge. In this paper, we first propose a novel method that finds the similar local subsequence. Our algorithm first searches for the possible start positions of subsequence, and then finds the best-matching alignment from these positions. Moreover, we parallelise the proposed algorithm on GPUs using CUDA and further propose an optimisation technique to improve the performance of our parallelization implementation on GPU. We conducted the extensive experiments to evaluate the proposed method. Experimental results demonstrate that the proposed algorithm is able to discover time series subsequences efficiently and that the proposed GPU-based parallelization technique can further speedup the processing

    Developing Graph-Based Co-Scheduling Algorithms on Multicore Computers

    Full text link
    • …
    corecore