56,156 research outputs found
Preemptive Thread Block Scheduling with Online Structural Runtime Prediction for Concurrent GPGPU Kernels
Recent NVIDIA Graphics Processing Units (GPUs) can execute multiple kernels
concurrently. On these GPUs, the thread block scheduler (TBS) uses the FIFO
policy to schedule their thread blocks. We show that FIFO leaves performance to
chance, resulting in significant loss of performance and fairness. To improve
performance and fairness, we propose use of the preemptive Shortest Remaining
Time First (SRTF) policy instead. Although SRTF requires an estimate of runtime
of GPU kernels, we show that such an estimate of the runtime can be easily
obtained using online profiling and exploiting a simple observation on GPU
kernels' grid structure. Specifically, we propose a novel Structural Runtime
Predictor. Using a simple Staircase model of GPU kernel execution, we show that
the runtime of a kernel can be predicted by profiling only the first few thread
blocks. We evaluate an online predictor based on this model on benchmarks from
ERCBench, and find that it can estimate the actual runtime reasonably well
after the execution of only a single thread block. Next, we design a thread
block scheduler that is both concurrent kernel-aware and uses this predictor.
We implement the SRTF policy and evaluate it on two-program workloads from
ERCBench. SRTF improves STP by 1.18x and ANTT by 2.25x over FIFO. When compared
to MPMax, a state-of-the-art resource allocation policy for concurrent kernels,
SRTF improves STP by 1.16x and ANTT by 1.3x. To improve fairness, we also
propose SRTF/Adaptive which controls resource usage of concurrently executing
kernels to maximize fairness. SRTF/Adaptive improves STP by 1.12x, ANTT by
2.23x and Fairness by 2.95x compared to FIFO. Overall, our implementation of
SRTF achieves system throughput to within 12.64% of Shortest Job First (SJF, an
oracle optimal scheduling policy), bridging 49% of the gap between FIFO and
SJF.Comment: 14 pages, full pre-review version of PACT 2014 poste
TRACEABILITY IN STOCK MANAGEMENT SYSTEMS
This paper presents traceability of a product if we are using a stock management system which uses FIFO or LIFO discharging methods. In the first part there is a little presentation regarding the four types of inputs and outputs and the side effect to thetraceability, stock, management
Gigabit Ethernet Asynchronous Clock Compensation FIFO
Clock compensation for Gigabit Ethernet is necessary because the clock recovered from the 1.25 Gb/s serial data stream has the potential to be 200 ppm slower or faster than the system clock. The serial data is converted to 10-bit parallel data at a 125 MHz rate on a clock recovered from the serial data stream. This recovered data needs to be processed by a system clock that is also running at a nominal rate of 125 MHz, but not synchronous to the recovered clock. To cross clock domains, an asynchronous FIFO (first-in-first-out) is used, with the write pointer (wprt) in the recovered clock domain and the read pointer (rptr) in the system clock domain. Because the clocks are generated from separate sources, there is potential for FIFO overflow or underflow. Clock compensation in Gigabit Ethernet is possible by taking advantage of the protocol data stream features. There are two distinct data streams that occur in Gigabit Ethernet where identical data is transmitted for a period of time. The first is configuration, which happens during auto-negotiation. The second is idle, which occurs at the end of auto-negotiation and between every packet. The identical data in the FIFO can be repeated by decrementing the read pointer, thus compensating for a FIFO that is draining too fast. The identical data in the FIFO can also be skipped by incrementing the read pointer, which compensates for a FIFO draining too slowly. The unique and novel features of this FIFO are that it works in both the idle stream and the configuration streams. The increment or decrement of the read pointer is different in the idle and compensation streams to preserve disparity. Another unique feature is that the read pointer to write pointer difference range changes between compensation and idle to minimize FIFO latency during packet transmission
A Design Methodology for Space-Time Adapter
This paper presents a solution to efficiently explore the design space of
communication adapters. In most digital signal processing (DSP) applications,
the overall architecture of the system is significantly affected by
communication architecture, so the designers need specifically optimized
adapters. By explicitly modeling these communications within an effective
graph-theoretic model and analysis framework, we automatically generate an
optimized architecture, named Space-Time AdapteR (STAR). Our design flow inputs
a C description of Input/Output data scheduling, and user requirements
(throughput, latency, parallelism...), and formalizes communication constraints
through a Resource Constraints Graph (RCG). The RCG properties enable an
efficient architecture space exploration in order to synthesize a STAR
component. The proposed approach has been tested to design an industrial data
mixing block example: an Ultra-Wideband interleaver.Comment: ISBN : 978-1-59593-606-
A Priority-based Fair Queuing (PFQ) Model for Wireless Healthcare System
Healthcare is a very active research area, primarily due to the increase in the elderly population that leads to increasing number of emergency situations that require urgent actions. In recent years some of wireless networked medical devices were equipped with different sensors to measure and report on vital signs of patient remotely. The most important sensors are Heart Beat Rate (ECG), Pressure and Glucose sensors. However, the strict requirements and real-time nature of medical applications dictate the extreme importance and need for appropriate Quality of Service (QoS), fast and accurate delivery of a patient’s measurements in reliable e-Health ecosystem.
As the elderly age and older adult population is increasing (65 years and above) due to the advancement in medicine and medical care in the last two decades; high QoS and reliable e-health ecosystem has become a major challenge in Healthcare especially for patients who require continuous monitoring and attention. Nevertheless, predictions have indicated that elderly population will be approximately 2 billion in developing countries by 2050 where availability of medical staff shall be unable to cope with this growth and emergency cases that need immediate intervention. On the other side, limitations in communication networks capacity, congestions and the humongous increase of devices, applications and IOT using the available communication networks add extra layer of challenges on E-health ecosystem such as time constraints, quality of measurements and signals reaching healthcare centres.
Hence this research has tackled the delay and jitter parameters in E-health M2M wireless communication and succeeded in reducing them in comparison to current available models. The novelty of this research has succeeded in developing a new Priority Queuing model ‘’Priority Based-Fair Queuing’’ (PFQ) where a new priority level and concept of ‘’Patient’s Health Record’’ (PHR) has been developed and
integrated with the Priority Parameters (PP) values of each sensor to add a second level of priority. The results and data analysis performed on the PFQ model under different scenarios simulating real M2M E-health environment have revealed that the PFQ has outperformed the results obtained from simulating the widely used current models such as First in First Out (FIFO) and Weight Fair Queuing (WFQ).
PFQ model has improved transmission of ECG sensor data by decreasing delay and jitter in emergency cases by 83.32% and 75.88% respectively in comparison to FIFO and 46.65% and 60.13% with respect to WFQ model. Similarly, in pressure sensor the improvements were 82.41% and 71.5% and 68.43% and 73.36% in comparison to FIFO and WFQ respectively. Data transmission were also improved in the Glucose sensor by 80.85% and 64.7% and 92.1% and 83.17% in comparison to FIFO and WFQ respectively. However, non-emergency cases data transmission using PFQ model was negatively impacted and scored higher rates than FIFO and WFQ since PFQ tends to give higher priority to emergency cases.
Thus, a derivative from the PFQ model has been developed to create a new version namely “Priority Based-Fair Queuing-Tolerated Delay” (PFQ-TD) to balance the data transmission between emergency and non-emergency cases where tolerated delay in emergency cases has been considered. PFQ-TD has succeeded in balancing fairly this issue and reducing the total average delay and jitter of emergency and non-emergency cases in all sensors and keep them within the acceptable allowable standards. PFQ-TD has improved the overall average delay and jitter in emergency and non-emergency cases among all sensors by 41% and 84% respectively in comparison to PFQ model
- …
