14 research outputs found

    Dates and Times Made Easy with lubridate

    This paper presents the lubridate package for R, which facilitates working with dates and times. Date-times create various technical problems for the data analyst. The paper highlights these problems and offers practical advice on how to solve them using lubridate. The paper also introduces a conceptual framework for arithmetic with date-times in R.

    Tools and theory to improve data analysis

    This thesis proposes a scientific model to explain the data analysis process. I argue that data analysis is primarily a procedure to build understanding and as such, it dovetails with the cognitive processes of the human mind. Data analysis tasks closely resemble the cognitive process known as sensemaking. I demonstrate how data analysis is a sensemaking task adapted to use quantitative data. This identification highlights a universal structure within data analysis activities and provides a foundation for a theory of data analysis. The model identifies two competing challenges within data analysis: the need to make sense of information that we cannot know and the need to make sense of information that we cannot attend to. Classical statistics provides solutions to the first challenge, but has little to say about the second. However, managing attention is the primary obstacle when analyzing big data. I introduce three tools for managing attention during data analysis. Each tool is built upon a different method for managing attention. ggsubplot creates embedded plots, which transform data into a format that can be easily processed by the human mind. lubridate helps users automate sensemaking outside of the mind by improving the way computers handle date-time data. Visual Inference Tools develop expertise in young statisticians that can later be used to efficiently direct attention. The insights of this thesis are especially helpful for consultants, applied statisticians, and teachers of data analysis.

    Windborne long-distance migration of malaria mosquitoes in the Sahel

    Over the past two decades efforts to control malaria have halved the number of cases globally, yet burdens remain high in much of Africa and the elimination of malaria has not been achieved even in areas where extreme reductions have been sustained, such as South Africa. Studies seeking to understand the paradoxical persistence of malaria in areas in which surface water is absent for 3–8 months of the year have suggested that some species of Anopheles mosquito use long-distance migration. Here we confirm this hypothesis through aerial sampling of mosquitoes at 40–290 m above ground level and provide, to our knowledge, the first evidence of windborne migration of African malaria vectors, and consequently of the pathogens that they transmit. Ten species, including the primary malaria vector Anopheles coluzzii, were identified among 235 anopheline mosquitoes that were captured during 617 nocturnal aerial collections in the Sahel of Mali. Notably, females accounted for more than 80% of all of the mosquitoes that we collected. Of these, 90% had taken a blood meal before their migration, which implies that pathogens are probably transported over long distances by migrating females. The likelihood of capturing Anopheles species increased with altitude (the height of the sampling panel above ground level) and during the wet seasons, but variation between years and localities was minimal. Simulated trajectories of mosquito flights indicated that there would be mean nightly displacements of up to 300 km for 9-h flight durations. Annually, the estimated numbers of mosquitoes at altitude that cross a 100-km line perpendicular to the prevailing wind direction included 81,000 Anopheles gambiae sensu stricto, 6 million A. coluzzii and 44 million Anopheles squamosus. These results provide compelling evidence that millions of malaria vectors that have previously fed on blood frequently migrate over hundreds of kilometres, and thus almost certainly spread malaria over these distances. The successful elimination of malaria may therefore depend on whether the sources of migrant vectors can be identified and controlled.

    Graphics for Statistics and Data Analysis with R

    Abstract not available (book review).

    A Cognitive Interpretation of Data Analysis

    This paper proposes a scientific model to explain the data analysis process. We argue that data analysis is primarily a procedure to build understanding and as such, it dovetails with the cognitive processes of the human mind. Data analysis tasks closely resemble the cognitive process known as sensemaking. We demonstrate how data analysis is a sensemaking task adapted to use quantitative data. This identification highlights a universal structure within data analysis activities and provides a foundation for a theory of data analysis. The competing tensions of cognitive compatibility and scientific rigor create a series of problems that characterize the data analysis process. These problems form a useful organizing model for the data analysis task while allowing methods to remain flexible and situation dependent. The insights of this model are especially helpful for consultants, applied statisticians, and teachers of data analysis.

    Visualizing Complex Data With Embedded Plots

    This article describes a class of graphs, embedded plots, that are particularly useful for analyzing large and complex datasets. Embedded plots organize a collection of graphs into a larger graphic, which can display more complex relationships than would otherwise be possible. This arrangement provides additional axes, prevents overplotting, and allows for multiple levels of visual summarization. Embedded plots also preprocess complex data into a form suitable for the human cognitive system, which can facilitate comprehension. We illustrate the usefulness of embedded plots with a case study, discuss the practical and cognitive advantages of embedded plots, and demonstrate how to implement embedded plots as a general class within visualization software, something currently unavailable. This article has supplementary material online.