2 research outputs found

    The Benefit of Hindsight: Tracing Edge-Cases in Distributed Systems

    Today's distributed tracing frameworks are ill-equipped to troubleshoot rare edge-case requests. The crux of the problem is a trade-off between specificity and overhead. On the one hand, frameworks can indiscriminately select requests to trace when they enter the system (head sampling), but this is unlikely to capture a relevant edge-case trace because the framework cannot know which requests will be problematic until after the fact. On the other hand, frameworks can trace everything and later keep only the interesting edge-case traces (tail sampling), but this has high overheads on the traced application and enormous data ingestion costs. In this paper we circumvent this trade-off for any edge-case with symptoms that can be programmatically detected, such as high tail latency, errors, and bottlenecked queues. We propose a lightweight and always-on distributed tracing system, Hindsight, which implements a retroactive sampling abstraction: instead of eagerly ingesting and processing traces, Hindsight lazily retrieves trace data only after symptoms of a problem are detected. Hindsight is analogous to a car dash-cam that, upon detecting a sudden jolt in momentum, persists the last hour of footage. Developers using Hindsight receive the exact edge-case traces they desire without undue overhead or dependence on luck. Our evaluation shows that Hindsight scales to millions of requests per second, adds nanosecond-level overhead to generate trace data, handles GB/s of data per node, transparently integrates with existing distributed tracing systems, and successfully persists full, detailed traces in real-world use cases when edge-case problems are detected.
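
    The retroactive sampling idea described in this abstract can be illustrated with a minimal single-process sketch. This is not Hindsight's implementation or API: the class `RetroactiveTracer`, its methods, the `latency_ms` field, and the 500 ms threshold are all hypothetical names and values chosen for illustration. The sketch only assumes what the abstract states, namely that trace data is recorded cheaply into a bounded local buffer and is persisted only once a programmatically detectable symptom appears.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple


class RetroactiveTracer:
    """Hypothetical sketch of retroactive sampling (not Hindsight's real API)."""

    def __init__(self, capacity: int, is_symptom: Callable[[dict], bool]) -> None:
        # Bounded buffer: the oldest events are overwritten, like dash-cam footage.
        self.buffer: Deque[Tuple[str, dict]] = deque(maxlen=capacity)
        self.is_symptom = is_symptom
        # Traces that were retrieved after a symptom was detected.
        self.persisted: Dict[str, List[dict]] = {}

    def record(self, trace_id: str, event: dict) -> None:
        if trace_id in self.persisted:
            # The trace was already triggered, so keep its later events too.
            self.persisted[trace_id].append(event)
            return
        self.buffer.append((trace_id, event))
        if self.is_symptom(event):
            self.trigger(trace_id)

    def trigger(self, trace_id: str) -> None:
        # Lazily retrieve whatever is still buffered for the problematic trace.
        self.persisted[trace_id] = [e for tid, e in self.buffer if tid == trace_id]


# Assumed symptom detector: any event slower than 500 ms.
tracer = RetroactiveTracer(
    capacity=1000,
    is_symptom=lambda e: e.get("latency_ms", 0) > 500,
)
tracer.record("a", {"span": "rpc", "latency_ms": 12})
tracer.record("b", {"span": "rpc", "latency_ms": 9})
tracer.record("a", {"span": "db", "latency_ms": 900})  # symptom: persists trace "a"
```

    In this sketch only trace "a" is persisted, including the event recorded before the symptom occurred; trace "b" stays in the buffer until it is overwritten. A distributed system would additionally need to propagate the trigger to every node that handled the request, which the sketch leaves out.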

    Waiting time distribution of a queueing system with postservice activity

    In this paper, we consider a queueing system with postservice activity. During the time when the server is engaged in the postservice activity (wrap-up time), the waiting customer, if any, cannot receive his or her service. This type of queueing system has been used to model automatic call distribution (ACD) systems. We consider the waiting time distribution of the queueing system. Using the Markovian point process that can be expressed by the so-called Markovian arrival process (MAP), we derive the waiting time distribution in terms of the representing matrices of a particular MAP. Then we apply the Baker-Hausdorff lemma to the matrices and derive the conditional waiting time distribution in closed form by exploiting the specific structure of the matrices. As a byproduct, we give an explicit solution of the number of arrivals for the MAP.
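
    For reference, the identity usually called the Baker-Hausdorff lemma is stated below for generic square matrices $A$ and $B$. These symbols are placeholders and are not the abstract's MAP representing matrices, which are not given here.

```latex
e^{A} B e^{-A}
  = B + [A, B] + \frac{1}{2!}\,[A, [A, B]] + \frac{1}{3!}\,[A, [A, [A, B]]] + \cdots,
\qquad [A, B] = AB - BA .
```

    When the nested commutators vanish or repeat after a few terms, the series collapses to a finite expression, which is the kind of structure that makes a closed form possible.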