113 research outputs found

    Trustworthy Experimentation Under Telemetry Loss

    Failure to accurately measure the outcomes of an experiment can lead to bias and incorrect conclusions. Online controlled experiments (also known as A/B tests) are increasingly being used to make decisions to improve websites as well as mobile and desktop applications. We argue that loss of telemetry data (during upload or post-processing) can skew the results of experiments, leading to loss of statistical power and inaccurate or erroneous conclusions. By systematically investigating the causes of telemetry loss, we argue that it is not practical to entirely eliminate it. Consequently, experimentation systems need to be robust to its effects. Furthermore, we note that it is nontrivial to measure the absolute level of telemetry loss in an experimentation system. In this paper, we take a top-down approach towards solving this problem. We motivate the impact of loss qualitatively using experiments in real applications deployed at scale, and formalize the problem by presenting a theoretical breakdown of the bias introduced by loss. Based on this foundation, we present a general framework for quantitatively evaluating the impact of telemetry loss, and present two solutions to measure the absolute levels of loss. This framework is used by well-known applications at Microsoft, with millions of users and billions of sessions. These general principles can be adopted by any application to improve the overall trustworthiness of experimentation and data-driven decision making. Comment: Proceedings of the 27th ACM International Conference on Information and Knowledge Management, October 201
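As a rough illustration of why loss can bias an experiment, the sketch below compares the expected observed mean of a metric in two arms that have identical true values but different survival probabilities per record. The function names and the numbers are hypothetical and are not taken from the paper; this is not the paper's framework, only a minimal arithmetic example of differential loss.

```python
def observed_mean(values, keep_probs):
    """Expected mean of the records that survive telemetry loss.

    values[i] is the metric for record i; keep_probs[i] is the (assumed)
    probability that record i is uploaded and processed successfully.
    """
    kept_weight = sum(keep_probs)
    if kept_weight == 0:
        raise ValueError("all records are lost; the mean is undefined")
    return sum(v * p for v, p in zip(values, keep_probs)) / kept_weight


def observed_delta(values, keep_control, keep_treatment):
    """Observed treatment-minus-control difference when both arms share
    the same true values, so any non-zero result is caused by loss alone."""
    return observed_mean(values, keep_treatment) - observed_mean(values, keep_control)


# Hypothetical data: two kinds of sessions with metric values 1.0 and 3.0.
values = [1.0, 3.0]

# Uniform loss in treatment: half of everything is dropped, no bias.
print(observed_delta(values, [1.0, 1.0], [0.5, 0.5]))

# Differential loss: treatment drops half of the high-value sessions only.
print(observed_delta(values, [1.0, 1.0], [1.0, 0.5]))
```

In this toy setup, uniform loss leaves the difference at zero (it only reduces the sample), while loss that depends on the metric value shifts the observed difference away from zero even though there is no true effect.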

    PMH41 REDUCTION IN LONG-ACTING BENZODIAZEPINE THERAPY AND ASSOCIATED FRACTURES IN ELDERLY MEDICAID PATIENTS

    PAR19: IMPACT OF THE ADDITION OF SALMETEROL TO THE TREATMENT OF ASTHMA PATIENTS IN A MEDICAID FEE-FOR-SERVICE POPULATION

    Analysis of Problem Tokens to Rank Factors Impacting Quality in VoIP Applications

    User-perceived quality-of-experience (QoE) in internet telephony systems is commonly evaluated using subjective ratings computed as a Mean Opinion Score (MOS). In such systems, while user MOS can be tracked on an ongoing basis, it does not give insight into which factors of a call induced any perceived degradation in QoE; that is, it does not tell us what caused a user to have a sub-optimal experience. For effective planning of product improvements, we are interested in understanding the impact of each of these degrading factors, allowing the estimation of the return (i.e., the improvement in user QoE) for a given investment. To obtain such insights, we advocate the use of an end-of-call "problem token questionnaire" (PTQ) which probes the user about common call quality issues (e.g., distorted audio or frozen video) which they may have experienced. In this paper, we show the efficacy of this questionnaire using data from over 700,000 end-of-call surveys gathered from Skype (a large commercial VoIP application). We present a method to rank call quality and reliability issues and address the challenge of isolating independent factors impacting the QoE. Finally, we present representative examples of how these problem tokens have proven to be useful in practice.

    Health Literacy Based Communication by Illinois Pharmacists

    Objectives: Health literacy has received attention as an important issue for pharmacists to consider when interacting with patients. Yet, there is little information about methods pharmacists use to communicate with patients and their extent of use of health literacy based interventions during patient interactions. The purpose of this study was to examine methods of communication and types of health literacy based interventions that practicing pharmacists use in Illinois. Methods: A survey instrument addressing the study purpose was designed along with other items that were part of a larger study. Eleven items in the survey referred to pharmacist-patient communication. The instrument was pilot tested before administering to a random sample of 1457 pharmacists from the Illinois Pharmacists Association. Data were primarily collected via a mailed survey using Dillman’s five step total design method (TDM). Two reminder letters were mailed at two week intervals to non-respondents. Results: Usable responses were obtained from 701 respondents (48.1% response rate). Using simple words (96%) and asking patients open-ended questions to determine comprehension (85%) were the most frequent methods that pharmacists used to communicate with patients. Only 18% of respondents always asked patients to repeat medication instructions to confirm understanding. The various recommended types of health literacy interventions were “always” performed by only 8 to 33% of the respondents. More than 50% of respondents indicated that they rarely or never had access to an interpreter (51%), or employed bilingual pharmacists (59%). Only 11% of pharmacists said that they rarely/never pay attention to nonverbal cues that may suggest low health literacy. Conclusions: Pharmacists infrequently use action oriented health literacy interventions such as using visual aids, having interpreter access, medication calendars, etc. 
    Additional training on health literacy, its scope, and related interventions coupled with system redesign and compensation for time spent counseling are essential to encourage health literacy tailored communication with patients. Type: Original Researc

    Improving Meeting Inclusiveness using Speech Interruption Analysis

    Meetings are a pervasive method of communication within all types of companies and organizations, and using remote collaboration systems to conduct meetings has increased dramatically since the COVID-19 pandemic. However, not all meetings are inclusive, especially in terms of the participation rates among attendees. In a recent large-scale survey conducted at Microsoft, the top suggestion given by meeting participants for improving inclusiveness was to improve the ability of remote participants to interrupt and acquire the floor during meetings. We show that the use of the virtual raise hand (VRH) feature can lead to an increase in predicted meeting inclusiveness at Microsoft. One challenge is that VRH is used in less than 1% of all meetings. In order to drive adoption of VRH to improve inclusiveness (and participation), we present a machine learning-based system that predicts when a meeting participant attempts to obtain the floor, but fails to interrupt (termed a "failed interruption"). This prediction can be used to nudge the user to raise their virtual hand within the meeting. We believe this is the first failed speech interruption detector, and the performance on a realistic test set has an area under curve (AUC) of 0.95 with a true positive rate (TPR) of 50% at a false positive rate (FPR) of <1%. To our knowledge, this is also the first dataset of interruption categories (including the failed interruption category) for remote meetings. Finally, we believe this is the first such system designed to improve meeting inclusiveness through speech interruption analysis and active intervention.
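To make the reported operating point concrete (TPR at an FPR below 1%), here is a minimal sketch of how such a figure can be read off a set of classifier scores. The function name, the labels, and the scores are hypothetical; the paper's actual evaluation code and data are not shown in this abstract.

```python
def tpr_at_fpr(labels, scores, max_fpr):
    """Return (best TPR, threshold) among thresholds whose FPR is below max_fpr.

    labels: 1 for a positive (e.g. a failed interruption), 0 otherwise.
    scores: classifier scores; a sample is flagged when score >= threshold.
    Returns (0.0, None) if no threshold satisfies the FPR constraint.
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative sample")

    best = (0.0, None)
    # Try each distinct score as a threshold, from strictest to loosest.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
        tpr, fpr = tp / positives, fp / negatives
        if fpr < max_fpr and tpr > best[0]:
            best = (tpr, threshold)
    return best


# Hypothetical toy data: two positives and two negatives.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(tpr_at_fpr(labels, scores, max_fpr=0.01))
```

The quadratic scan is chosen for readability; on a large test set a sorted single pass over the scores would be the usual approach.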

    Psychometric Assessment of the PPDG: Utilizing Cronbach’s Alpha as a Means of Reliability

    Introduction: Since the development of the 10-item Purdue Pharmacist Directive Guidance (PPDG) Scale, several studies of the psychometric properties of the PPDG have been conducted. Although Cronbach’s alpha was calculated as a measure of internal consistency reliability, the mean centering of the individual items from the instrument was not explored. Objectives: This study focused on investigating the mean stabilization of items within the PPDG as they pertain to Cronbach’s reliability coefficient calculation. Methods: Using item analysis procedures in SPSS, the mean stability of items within the general factor of directive guidance and the subscales of instruction and feedback and goal setting were examined for the PPDG. Results: Mean stability scores for the entire PPDG scale and the subscales of instruction and feedback and goal setting were strong. Also, corrected item-total correlations and Cronbach’s alphas following item deletion were good for the overall PPDG scale and the subscales. Conclusions: The results provide evidence to enhance understanding of the psychometric stability of the PPDG scale and its subscales.
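For readers unfamiliar with the statistics named above, the following is a minimal sketch of Cronbach's alpha and of alpha after deleting each item, using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The study itself used SPSS; the function names and the toy responses here are hypothetical and are not PPDG data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item (column) and of the per-respondent totals.
    item_variances = [variance(column) for column in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def alpha_if_item_deleted(rows):
    """Alpha recomputed with each item removed in turn (needs k >= 3)."""
    k = len(rows[0])
    return [
        cronbach_alpha([row[:j] + row[j + 1:] for row in rows])
        for j in range(k)
    ]


# Hypothetical responses: 4 respondents, 3 perfectly consistent items.
responses = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
print(cronbach_alpha(responses))
print(alpha_if_item_deleted(responses))
```

An item whose removal raises alpha noticeably is a candidate for review, which is the sense in which alpha "following item deletion" is reported in item analyses.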