
    Data Science and Ebola

    Data Science: Today, everybody and everything produces data. People produce large amounts of data in social networks and in commercial transactions. Medical, corporate, and government databases continue to grow. Sensors continue to get cheaper and are increasingly connected, creating an Internet of Things and generating even more data. In every discipline, large, diverse, and rich data sets are emerging, from astrophysics, to the life sciences, to the behavioral sciences, to finance and commerce, to the humanities and to the arts. In every discipline people want to organize, analyze, optimize and understand their data to answer questions and to deepen insights. The science that is transforming this ocean of data into a sea of knowledge is called data science. This lecture will discuss how data science has changed the way in which one of the most visible challenges to public health, the 2014 Ebola outbreak in West Africa, is handled. Comment: Inaugural lecture, Leiden University.

    Mechanism Design for Data Science

    Good economic mechanisms depend on the preferences of participants in the mechanism. For example, the revenue-optimal auction for selling an item is parameterized by a reserve price, and the appropriate reserve price depends on how much the bidders are willing to pay. A mechanism designer can potentially learn about the participants' preferences by observing historical data from the mechanism; the designer could then update the mechanism in response to learned preferences to improve its performance. The challenge of such an approach is that the data correspond to the actions of the participants, not their preferences. Preferences can potentially be inferred from actions, but the degree of inference possible depends on the mechanism. In the optimal auction example, it is impossible to learn anything about the preferences of bidders who are not willing to pay the reserve price. These bidders will not cast bids in the auction and, from historical bid data, the auctioneer could never learn that lowering the reserve price would give a higher revenue (even if it would). To address this impossibility, the auctioneer could sacrifice revenue optimality in the initial auction to obtain better inference properties, so that the auction's parameters can be adapted to changing preferences in the future. This paper develops the theory for optimal mechanism design subject to good inferability.
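
    A minimal simulation sketch, not taken from the paper, of the censoring problem this abstract describes: assuming bidders with uniform [0, 1] values who bid truthfully in a second-price auction (truthful bidding is a dominant strategy there), any value below the reserve never appears in the bid log, so the seller cannot tell from that log how demand behaves below the reserve. All numbers and the uniform-value assumption are illustrative.

    import random

    def run_auctions(reserve, n_auctions=10_000, n_bidders=2, seed=0):
        """Simulate second-price auctions with a reserve price.

        Returns the bids the seller actually observes and the average revenue.
        """
        rng = random.Random(seed)
        observed_bids, total_revenue = [], 0.0
        for _ in range(n_auctions):
            values = [rng.random() for _ in range(n_bidders)]  # private values, U[0, 1]
            bids = [v for v in values if v >= reserve]         # values below the reserve never show up
            observed_bids.extend(bids)
            if bids:                                           # item sells only if someone clears the reserve
                bids.sort(reverse=True)
                total_revenue += bids[1] if len(bids) > 1 else reserve
        return observed_bids, total_revenue / n_auctions

    bids_opt, rev_opt = run_auctions(reserve=0.5)  # 0.5 is the revenue-optimal reserve for U[0, 1] values
    bids_low, rev_low = run_auctions(reserve=0.1)  # deliberately sub-optimal reserve
    print(f"reserve 0.5: avg revenue {rev_opt:.3f}, lowest observed bid {min(bids_opt):.3f}")
    print(f"reserve 0.1: avg revenue {rev_low:.3f}, lowest observed bid {min(bids_low):.3f}")
    # The reserve-0.5 log contains no bids below 0.5, so it cannot reveal whether a lower
    # reserve would help; the reserve-0.1 auction gives up some revenue but produces data
    # covering nearly the whole value distribution.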

    Teaching Stats for Data Science

    “Data science” is a useful catchword for methods and concepts original to the field of statistics, but typically applied to large, multivariate, observational records. Such datasets call for techniques not often part of an introduction to statistics: modeling, consideration of covariates, sophisticated visualization, and causal reasoning. This article re-imagines introductory statistics as an introduction to data science and proposes a sequence of 10 blocks that together compose a suitable course for extracting information from contemporary data. Recent extensions to the mosaic packages for R, together with tools from the “tidyverse”, provide a concise and readable notation for wrangling, visualization, model-building, and model interpretation: the fundamental computational tasks of data science.
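
    The article's toolchain is R (the mosaic packages plus the tidyverse); purely as a rough analogue, and not the article's own code, the same wrangle-summarize-model-interpret cycle might look like the following Python/pandas and statsmodels sketch on made-up data. Every variable name and number here is hypothetical.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    # Made-up observational dataset: exam scores, study time, and a covariate (prior GPA).
    rng = np.random.default_rng(0)
    n = 500
    df = pd.DataFrame({
        "hours_studied": rng.uniform(0, 10, n),
        "prior_gpa": rng.normal(3.0, 0.4, n),
    })
    df["exam_score"] = 50 + 3 * df["hours_studied"] + 8 * (df["prior_gpa"] - 3) + rng.normal(0, 5, n)

    # Wrangling: derive a grouping variable and summarize the response by group.
    df["study_group"] = pd.cut(df["hours_studied"], bins=[0, 3, 7, 10], labels=["low", "mid", "high"])
    print(df.groupby("study_group", observed=True)["exam_score"].agg(["mean", "std", "count"]))

    # Modeling with a covariate: regress exam score on study time, adjusting for prior GPA.
    model = smf.ols("exam_score ~ hours_studied + prior_gpa", data=df).fit()
    print(model.params)  # interpretation: roughly +3 points per extra hour, holding GPA fixed

    # Visualization is one more line, e.g. df.plot.scatter("hours_studied", "exam_score")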

    Data Science and Big Data in Energy Forecasting

    This editorial summarizes the performance of the special issue entitled Data Science and Big Data in Energy Forecasting, which was published in MDPI's journal Energies. The special issue ran in 2017 and accepted a total of 13 papers from 7 different countries. Electrical, solar and wind energy forecasting were the most analyzed topics, introducing new methods with applications of utmost relevance. Funding: Ministerio de Competitividad TIN2014-55894-C2-R; Ministerio de Competitividad TIN2017-88209-C2-

    Using Data in Undergraduate Science Classrooms

    Provides pedagogical insight concerning the skill of using data. The resource being annotated is: http://www.dlese.org/dds/catalog_DATA-CLASS-000-000-000-007.htm

    Managing Research Data in Big Science

    The project which led to this report was funded by JISC in 2010–2011 as part of its 'Managing Research Data' programme, to examine the way in which Big Science data is managed, and produce any recommendations which may be appropriate. Big science data is different: it comes in large volumes, and it is shared and exploited in ways which may differ from those of other disciplines. This project has explored these differences using, as a case study, Gravitational Wave data generated by the LSC (the LIGO Scientific Collaboration), and has produced recommendations intended to be useful variously to JISC, the funding council (STFC) and the LSC community. In Sect. 1 we define what we mean by 'big science' and describe the overall data culture there, laying stress on how it necessarily or contingently differs from other disciplines. In Sect. 2 we discuss the benefits of a formal data-preservation strategy, and the cases for open data and for well-preserved data that follow from that. This leads to our recommendations that, in essence, funders should adopt rather light-touch prescriptions regarding data preservation planning: normal data management practice, in the areas under study, corresponds to notably good practice in most other areas, so that the only change we suggest is to make this planning more formal, which makes it more easily auditable and more amenable to constructive criticism. In Sect. 3 we briefly discuss the LIGO data management plan, and pull together whatever information is available on the estimation of digital preservation costs. The report is informed, throughout, by the OAIS reference model for an open archive.