116,932 research outputs found

    Teaching Data Science

    We describe an introductory data science course, entitled Introduction to Data Science, offered at the University of Illinois at Urbana-Champaign. The course introduced general programming concepts by using the Python programming language with an emphasis on data preparation, processing, and presentation. The course had no prerequisites, and students were not expected to have any programming experience. This introductory course was designed to cover a wide range of topics, from the nature of data, to storage, to visualization, to probability and statistical analysis, to cloud and high performance computing, without becoming overly focused on any one subject. We conclude this article with a discussion of lessons learned and our plans to develop new data science courses. Comment: 10 pages, 4 figures, International Conference on Computational Science (ICCS 2016)

    Data Science and Ebola

    Data Science: Today, everybody and everything produces data. People produce large amounts of data in social networks and in commercial transactions. Medical, corporate, and government databases continue to grow. Sensors continue to get cheaper and are increasingly connected, creating an Internet of Things, and generating even more data. In every discipline, large, diverse, and rich data sets are emerging, from astrophysics, to the life sciences, to the behavioral sciences, to finance and commerce, to the humanities and to the arts. In every discipline people want to organize, analyze, optimize and understand their data to answer questions and to deepen insights. The science that is transforming this ocean of data into a sea of knowledge is called data science. This lecture will discuss how data science has changed the way in which one of the most visible challenges to public health is handled: the 2014 Ebola outbreak in West Africa. Comment: Inaugural lecture, Leiden University

    Indonesia embraces the Data Science

    The information era is a time when information is not only generated in large quantities but also processed extensively in order to extract and generate more information. The complex nature of modern living is represented by various kinds of data. Data can take the form of signals, images, texts, or manifolds resembling the horizon of observation. The task of the emerging data sciences is to extract information from data so that people gain new insights into the complex world. These insights may come from new ways of representing the data, such as visualization or mapping. They may also come from mathematical analysis and/or computational processing that reveals the states of nature represented by the data. Both approaches use methodologies that reduce the dimensionality of the data. The relation between the two functions, representation and analysis, is at the heart of how information in data is transformed mathematically and computationally into new information. The paper discusses some practices, using various data from social life in Indonesia, to gain new insights about Indonesia in the emerging data sciences. Data science in Indonesia has made possible Indonesian Data Cartograms, Indonesian Celebrity Sentiment Mapping, Ethno-Clustering Maps, social media community detection, and more to come. All of these are presented as examples of how data science has become an integral part of the technology that brings data closer to people. Comment: Paper presented at the South East Asian Mathematical Society (SEAMS) 7th Conference, 10 pages, 7 figures

    Teaching Stats for Data Science

    “Data science” is a useful catchword for methods and concepts original to the field of statistics, but typically being applied to large, multivariate, observational records. Such datasets call for techniques not often part of an introduction to statistics: modeling, consideration of covariates, sophisticated visualization, and causal reasoning. This article re-imagines introductory statistics as an introduction to data science and proposes a sequence of 10 blocks that together compose a suitable course for extracting information from contemporary data. Recent extensions to the mosaic packages for R together with tools from the “tidyverse” provide a concise and readable notation for wrangling, visualization, model-building, and model interpretation: the fundamental computational tasks of data science.