
    A Project Based Approach to Statistics and Data Science

    In an increasingly data-driven world, facility with statistics is more important than ever for our students. At institutions without a statistician, it often falls to the mathematics faculty to teach statistics courses. This paper presents a model that a mathematician asked to teach statistics can follow. The model entails connecting with faculty from numerous departments on campus to develop a list of topics, building a repository of real-world datasets from these faculty, and creating projects in which students work with these datasets to write lab reports aimed at consumers of statistics in other disciplines. The end result is students who are well prepared for interdisciplinary research, who are accustomed to coping with the idiosyncrasies of real data, and who have sharpened their technical writing and speaking skills.

    The Promise and Peril of Big Data

    The Promise and Peril of Big Data explores the implications of inferential technologies used to analyze massive amounts of data and the ways in which these techniques can positively affect business, medicine, and government. The report is the result of the Eighteenth Annual Roundtable on Information Technology.

    Patient-tailored prioritization for a pediatric care decision support system through machine learning

    Objective: Over 8 years, we have developed an innovative computer decision support system that improves appropriate delivery of pediatric screening and care. This system employs a guidelines evaluation engine using data from the electronic health record (EHR) and input from patients and caregivers. Because guideline recommendations typically exceed the scope of one visit, the engine uses a static prioritization scheme to select recommendations. Here we extend an earlier idea to create patient-tailored prioritization. Materials and methods: We used Bayesian structure learning to build networks of association among previously collected data from our decision support system. Using area under the receiver-operating characteristic curve (AUC) as a measure of discriminability (a sine qua non for the expected-value calculations needed for prioritization), we performed a structural analysis of variables with high AUC on a test set. Our source data included 177 variables for 29,402 patients. Results: The method produced a network model containing 78 screening questions and anticipatory guidance items (107 variables total). Average AUC was 0.65, which is sufficient for prioritization depending on factors such as population prevalence. Structure analysis of seven highly predictive variables reveals both face validity (related nodes are connected) and non-intuitive relationships. Discussion: We demonstrate the ability of a Bayesian structure learning method to ‘phenotype the population’ seen in our primary care pediatric clinics. The resulting network can be used to produce patient-tailored posterior probabilities that can be used to prioritize content based on the patient's current circumstances. Conclusions: This study demonstrates the feasibility of EHR-driven population phenotyping for patient-tailored prioritization of pediatric preventive care services.
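
    As a rough illustration of the pipeline this abstract describes, the sketch below learns a Bayesian network structure from tabular visit data and then scores one node's posterior probabilities with AUC. This is a minimal sketch, not the authors' engine: it assumes the pgmpy and scikit-learn libraries, and the file and column names are hypothetical placeholders.

```python
# Minimal sketch (not the paper's engine): Bayesian structure learning on
# discretized EHR-derived variables, then an AUC check on one target node.
# Assumes pgmpy and scikit-learn; file and column names are hypothetical.
import pandas as pd
from pgmpy.estimators import HillClimbSearch, BicScore, MaximumLikelihoodEstimator
from pgmpy.models import BayesianNetwork
from pgmpy.inference import VariableElimination
from sklearn.metrics import roc_auc_score

df = pd.read_csv("visit_data.csv")                # one row per patient
train, test = df.iloc[:20_000], df.iloc[20_000:]

# BIC-scored hill climbing returns a directed acyclic graph over the columns.
dag = HillClimbSearch(train).estimate(scoring_method=BicScore(train))
bn = BayesianNetwork(dag.edges())
bn.fit(train, estimator=MaximumLikelihoodEstimator)

# Posterior probability that one screening item is due, given the rest of
# the patient's record, scored against the held-out labels.
infer = VariableElimination(bn)
target = "asthma_screen_due"                      # hypothetical node
evidence_cols = [c for c in bn.nodes() if c != target]

scores = []
for _, row in test.iterrows():
    posterior = infer.query([target], evidence=row[evidence_cols].to_dict(),
                            show_progress=False)
    scores.append(posterior.values[1])            # P(target = 1)

print("AUC:", roc_auc_score(test[target], scores))
```

    In a full system, the same posteriors would feed the expected-value calculation that ranks which recommendations to surface at a given visit.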

    Vital Data: Writing and Circulating Data in Non-Profits

    This dissertation presents the results of an ethnographically informed workplace observation study of a single non-profit, referred to throughout as “the Metro Data Coalition” (MDC). It begins with an overview of the organization, its institutional history, the technical and technological scenes of composing, and the demands placed on the writing process by each of these variables. It considers usability studies, activity theory, and rhetorical ecologies in coming to terms with how MDC writers shape the numerical data they work with daily. The latter half of the dissertation considers how MDC writers approach their work as “storytellers,” a self-concept threaded throughout their writing process, and the ways in which MDC team members and those of their parent non-profit, the City-Community Partnership, shape a circulation process in a bid to measure the MDC’s rhetorical “impact.” The dissertation is divided into six parts. The introduction and Chapter 1 set the scene of the MDC: its organization, its purpose, and its writing processes. I argue here that its organizational ethos is imposed by a range of structural and historical forces and ultimately runs into conflict with its mission statement. In Chapter 2, I zoom in on the technologically mediated composing process for data visuals and make a case for a vision of distributed creativity suited to technical writing scholarship. In Chapter 3, I focus on the organization’s and individual team members’ approaches to “story” and “storytelling,” and argue that “storytelling” is itself an action distributed across a perceived ecology of MDC work and circulation, with the goal being a sense of “stickiness” that is ultimately fraught in our present, hyper-digitized, ecological age. Chapter 4 takes up the issue of “mission impact” and the ways in which ecologies of work are shaped and re-shaped in a bid to prove the rhetorical success of MDC work. Here, I argue that a story’s “stickiness” cannot be read from one-to-one uptake of arguments, but instead from evidence of re-telling in other organizations. In the conclusion, I turn to external organizations and the ways MDC data has been approached, ultimately suggesting that the technical, quantitative writing the organization engages in is unsuited to the rapidity with which quantitative information can be shaped and re-shaped to align with previously held, culturally infused “stories.” Ultimately, this project is designed to provide a set of workable heuristics for understanding how quantitative information can be shaped and deployed in technical and professional writing scenarios. It is a study of the “life” of data and the many mutations that happen within that “lifecycle.” Getting there, however, requires engaging with real-world writers doing heavily quantitative work, and coming to terms with the non-numerical, “subjective” forces that shape how we approach “data” in the 21st century.

    Design and Architecture of an Ontology-driven Dialogue System for HPV Vaccine Counseling

    Speech and conversational technologies are increasingly being used by consumers, and one day they will inevitably be integrated into health care. Where this technology could be of service is in patient-provider communication, specifically in communicating the risks and benefits of vaccines. The human papillomavirus (HPV) vaccine, in particular, inoculates individuals against certain HPV strains responsible for adulthood cancers, including cervical and head and neck cancers. My research focuses on the architecture and development of a speech-enabled conversational agent that relies on a series of consumer-centric health ontologies and the technology that utilizes them. Ontologies are computable artifacts that encode and structure domain knowledge so that machines can use it to provide high-level capabilities, such as reasoning and sharing information. I focus the agent on the HPV vaccine domain to observe whether users respond favorably towards conversational agents and what impact the agent may have on their beliefs about the HPV vaccine. The study uses a multi-tier structure: the first tier is the domain knowledge base, the second is the application interaction design tier, and the third is the feasibility assessment with participants. The research proposes the following questions: Can ontologies support the system architecture for a spoken conversational agent for HPV vaccine counseling? How would prospective users’ perceptions of such an agent, and of the HPV vaccine, change after using a conversational agent for HPV vaccine education? The outcome of this study is a comprehensive assessment of a system architecture for a conversational agent for patient-centric HPV vaccine counseling. Each layer of the agent architecture is regulated through domain and application ontologies, and supported by the various ontology-driven software components that I developed to compose the architecture. I also present preliminary evidence of the agent’s high usability and of improvement in users’ health beliefs toward the HPV vaccine. All in all, I introduce a comprehensive and feasible model for the design and development of an open-source, ontology-driven conversational agent for any consumer health domain, and corroborate the viability of a conversational agent as a health intervention tool.
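
    The dissertation does not publish its software stack, but the flavor of an ontology-backed dialogue turn can be sketched with the owlready2 library: the dialogue manager maps a recognized user intent to content stored in the domain ontology rather than hard-coding it. The ontology file, class name, and data property below are hypothetical.

```python
# Illustrative sketch only (the dissertation's stack is not reproduced here):
# a dialogue manager pulling counseling content from a domain ontology at
# runtime via owlready2. Ontology file, class, and property are hypothetical.
from owlready2 import get_ontology

onto = get_ontology("file://hpv_counseling.owl").load()

def answer(topic: str) -> str:
    """Map a recognized user intent to ontology-backed counseling text."""
    # Look up ConsumerQuestion individuals whose label mentions the topic.
    matches = onto.search(type=onto.ConsumerQuestion, label=f"*{topic}*")
    if not matches:
        return "I don't have information on that yet."
    # hasAnswerText: hypothetical data property holding lay-language text.
    return matches[0].hasAnswerText.first()

print(answer("side effects"))
```

    Keeping the question-to-content mapping in the ontology rather than in application code is what lets each layer of such an architecture be regulated by domain and application ontologies, as the abstract describes.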

    The Design of Interactive Visualizations and Analytics for Public Health Data

    Public health data plays a critical role in ensuring the health of the populace. Professionals use data as they work to improve and protect the health of communities; for the public, data informs their ability to make health-related decisions. Health literacy, the ability of an individual to access, understand, and apply health data, is a key determinant of health. At present, people seeking to use public health data confront a myriad of challenges, some of which relate to the nature and structure of the data itself. Interactive visualizations are a category of computational tools that can support individuals as they seek to use public health data. With interactive visualizations, individuals can access underlying data, change how the data is represented, manipulate various visual elements, and, in certain tools, control and perform analytic tasks. Currently, however, public health relies predominantly on simple visualizations that fail to support effective exploration of large datasets. The goal of this dissertation is to demonstrate the benefit of sophisticated interactive visualizations and analytics. Because improperly designed visualizations can negatively impact users’ discourse with data, designers need frameworks that help them think systematically about design issues, as well as demonstrations of how such frameworks can be used. This dissertation therefore includes a process by which designers can create health visualizations. Using this process, five novel visualizations were designed to facilitate making sense of public health data, and three studies were conducted with them. The first study explores how computational models can be used to make sense of the discourse of health on a social media platform. The second investigates the use of instructional materials to improve visualization literacy, which matters because even when visualizations are designed properly, a gap remains between how a tool works and users’ perceptions of how it should work. The last study examines the efficacy of visualizations in improving health literacy. Overall, this dissertation provides designers with a deeper understanding of how to systematically design health visualizations.
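
    None of the five visualizations is reproduced in the abstract, but the baseline interaction it argues for, letting a reader hover for underlying values and filter the view rather than reading a static chart, can be sketched with the plotly library; the dataset here is a hypothetical stand-in.

```python
# Minimal sketch (not one of the dissertation's five tools): an interactive
# public-health chart where hovering exposes the underlying values.
# Assumes plotly; the data below is a hypothetical stand-in.
import pandas as pd
import plotly.express as px

df = pd.DataFrame({
    "county": ["A", "A", "B", "B"],
    "year":   [2019, 2020, 2019, 2020],
    "rate":   [12.1, 13.4, 9.8, 11.0],   # cases per 100,000
})

fig = px.line(
    df, x="year", y="rate", color="county",
    labels={"rate": "Incidence per 100,000", "year": "Year"},
    title="County incidence over time",
)
fig.show()  # opens an interactive view with hover and legend-based filtering
```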

    The Dynamic Effect of Financial Sector Development in Stimulating the Gross National Savings of Djibouti

    Savings are an important determinant of wealth. At the macroeconomic level, governments attach importance to saving in order to make new investments, produce new capital goods, and sustain economic growth. However, due to the high level of internal and external debt in Djibouti, it is nearly impossible for the country to achieve domestic savings. The major aim of this study is therefore to examine the dynamic effect of financial sector development in stimulating the gross national savings of Djibouti over the period 1987 to 2021. The paper considers several indicators of financial sector development, including FDI inflows, domestic loans to the private sector, central bank assets to GDP, and the money supply. A Non-Linear Autoregressive Distributed Lag (NARDL) model was estimated, and the findings highlight that the Djiboutian financial industry is still in its early stages of development and has not yet made a substantial contribution to boosting the country's national savings. Nevertheless, the gross national savings of Djibouti were positively stimulated by significant components of financial sector development: the positive shocks to FDI inflows and the negative shocks to both central bank assets and the money supply. Both the positive and negative shocks to credit offered to the private sector were found to diminish national savings in the long run. In conclusion, this research will help governments and policymakers understand how best to use the financial sector to raise gross national savings, and it presents evidence on how to implement long-term initiatives that can lower public debt and encourage savings. The article also speaks to the value of long-term investments.
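
    The core of the NARDL approach is decomposing each regressor into positive and negative partial sums, so that increases and decreases can carry different long-run coefficients. Below is a minimal sketch of that step, assuming the statsmodels ARDL class; the file and series names are hypothetical and the paper's data are not reproduced here.

```python
# Minimal sketch of the NARDL idea (not the paper's estimation code):
# decompose a regressor into positive/negative partial sums, then estimate
# an ARDL on the decomposed series. Assumes statsmodels >= 0.13; the file
# and series names are hypothetical.
import pandas as pd
from statsmodels.tsa.ardl import ARDL

df = pd.read_csv("djibouti_macro.csv", index_col="year")
d = df["fdi_inflows"].diff().fillna(0.0)

# Partial-sum processes: cumulative positive and negative changes in FDI.
df["fdi_pos"] = d.clip(lower=0.0).cumsum()
df["fdi_neg"] = d.clip(upper=0.0).cumsum()

# ARDL of gross national savings on the asymmetric FDI components.
res = ARDL(df["gross_national_savings"], lags=2,
           exog=df[["fdi_pos", "fdi_neg"]], order=2).fit()
print(res.summary())
```

    Asymmetry is then assessed by testing whether the long-run coefficients on the positive and negative components differ.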

    Counterfactuals and Causability in Explainable Artificial Intelligence: Theory, Algorithms, and Applications

    There has been growing interest in model-agnostic methods that can make deep learning models more transparent and explainable to a user. Some researchers have recently argued that for a machine to achieve a certain degree of human-level explainability, it needs to provide causally understandable explanations, a property known as causability. A specific class of algorithms with the potential to provide causability are counterfactuals. This paper presents an in-depth systematic review of the diverse existing body of literature on counterfactuals and causability for explainable artificial intelligence. We performed an LDA topic modelling analysis under a PRISMA framework to find the most relevant articles. This analysis produced a novel taxonomy that considers the grounding theories of the surveyed algorithms, together with their underlying properties and applications to real-world data. The research suggests that current model-agnostic counterfactual algorithms for explainable AI are not grounded in a causal theoretical formalism and, consequently, cannot promote causability for a human decision-maker. Our findings suggest that the explanations derived from major algorithms in the literature capture spurious correlations rather than cause-and-effect relationships, leading to sub-optimal, erroneous, or even biased explanations. The paper also advances the literature with new directions and challenges for promoting causability in model-agnostic approaches to explainable artificial intelligence.
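
    The review's screening tooling is not specified in the abstract, but LDA topic modelling over a corpus of abstracts can be sketched with scikit-learn; the toy corpus and settings below are placeholders for a real PRISMA-screened article set.

```python
# Minimal sketch of LDA topic modelling over abstracts, as a rough basis for
# organizing a literature corpus. Assumes scikit-learn; the toy corpus and
# settings below are placeholders for a real PRISMA-screened article set.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

abstracts = [
    "counterfactual explanations for machine learning models",
    "causability and explainable artificial intelligence in medicine",
    "model-agnostic explanation methods and spurious correlations",
    "counterfactual reasoning and causal inference for explainability",
]

vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(abstracts)

lda = LatentDirichletAllocation(n_components=3, random_state=0)
doc_topics = lda.fit_transform(X)      # per-document topic proportions

# Top words per topic, e.g. to seed a taxonomy of the surveyed literature.
terms = vec.get_feature_names_out()
for k, weights in enumerate(lda.components_):
    top = [terms[i] for i in weights.argsort()[-5:][::-1]]
    print(f"topic {k}: {', '.join(top)}")
```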