
    Defining data literacy: An empirical study of data literacy dimensions

    Data literacy has become a core component of higher education, as it encompasses the range of data skills and knowledge needed to deal with data, which are critical in our social and work lives with the advent of big data. Multiple perspectives on defining data literacy have emerged from multiple disciplines, including information science, computer science, business, and education. Along with this, there have been efforts to develop data literacy competency models to enhance our understanding of the skills that data literacy requires. But each model has a different focus, context, and target audience: for instance, some efforts address the data literacy needs of citizens in today’s society because they see data literacy as a life skill, whereas others define data literacy as one of the essential skills required to perform tasks in a specific career. Although the importance of data literacy is increasingly recognized, there is no consensus about its definition. Further, the constituent dimensions of data literacy remain disputed. This presentation illustrates the preliminary results of a bibliometric analysis of the data literacy literature from the past ten years. Through citation analysis and topic analysis, this study aims to identify the central dimensions of data literacy and develop an integrated model for data literacy.

    Enhancing Decision-making in Smart and Connected Communities with Digital Traces

    The ubiquitous use of information communication technologies (ICTs) enables the generation of digital traces associated with human behaviors at unprecedented breadth, depth, and scale. Large-scale digital traces provide the potential to understand population behaviors automatically, including the characterization of how individuals interact with the physical environment. As a result, the use of digital traces generated by humans might mitigate some of the challenges associated with using surveys to understand human behaviors, such as the high cost of collecting information, the lack of quality real-time information, and the difficulty of capturing behavioral-level information. In this dissertation, I study how to extract information from digital traces to characterize human behavior in the built environment, and how to use such information to enhance decision-making processes in the area of Smart and Connected Communities. Specifically, I present three case studies that aim at using data-driven methods for decision-making in Smart and Connected Communities. First, I discuss data-driven methods for socioeconomic development with a focus on inference of socioeconomic maps with cell phone data. Second, I present data-driven methods for emergency preparedness and response, with a focus on understanding user needs in different communities with geotagged social media data. Third, I describe data-driven methods for migration studies, focusing on characterizing the post-migration behaviors of internal migrants with cell phone data. In these case studies, I present data-driven frameworks that integrate innovative behavior modeling approaches to help solve decision-making questions using digital traces. The explored methods enhance our understanding of how to model and explain population behavior patterns in different physical and socioeconomic contexts. The methods also have practical significance in terms of how decision-making can become cost-effective and efficient with the help of data-driven methods.

    Revealing the Disciplinary Landscape of Data Science Journals

    The discipline, field, and practice of data science rose to its current prominence over the past several decades. New disciplines, fields, and practices often involve definitional and scope challenges. This seems to be the case with data science. The research presented in this poster is part of a broader investigation into the disciplinary or interdisciplinary characteristics of data science. This work-in-progress poster reports the results of analyses of data science journals in different subject areas to answer several questions, including: • What is the population of journals that focus on topics of data science? • What disciplinary landscape of data science is revealed in the aims and scope statements of these journals? The unit of analysis in this research is the journal. Both quantitative and qualitative approaches were used in the analysis of the aims and scope statements. The quantitative approach used computational methods (e.g., part-of-speech tagging, word embedding) to identify keywords representing characteristics of each journal. The qualitative approach used conceptual content analysis to reveal different patterns in the research types and research scope of the journals. Data science research and education are part of many library and information science degree programs. The results of this research have the following benefits: • Researchers can understand the disciplines and research types represented in the journals when selecting a venue for submitting papers. • Educators and students can identify appropriate journal resources to support learning. • Librarians can use the results to assess collection development decisions regarding data science journals.

    Hate Speech and Counter Speech Detection: Conversational Context Does Matter

    Hate speech is plaguing cyberspace along with user-generated content. This paper investigates the role of conversational context in the annotation and detection of online hate and counter speech, where context is defined as the preceding comment in a conversation thread. We created a context-aware dataset for a 3-way classification task on Reddit comments: hate speech, counter speech, or neutral. Our analyses indicate that context is critical to identifying hate and counter speech: human judgments change for most comments depending on whether we show annotators the context. A linguistic analysis draws insights into the language people use to express hate and counter speech. Experimental results show that neural networks obtain significantly better results if context is taken into account. We also present qualitative error analyses shedding light on (a) when and why context is beneficial and (b) the remaining errors made by our best model when context is taken into account. Comment: Accepted by NAACL 202

    Taxonomy of the genus Metolinus Cameron (Coleoptera, Staphylinidae, Staphylininae, Xantholinini) from China with description of three new species

    This paper studies the taxonomy of the genus Metolinus Cameron, 1920 (Coleoptera: Staphylinidae, Staphylininae, Xantholinini) from China and describes three new species: Metolinus xizangensis sp. n. from Xizang (Tibet), M. emarginatus sp. n. from Sichuan, and M. binarius sp. n. from Yunnan. The Chinese fauna of the genus is thus increased to eight species in total. A key to the eight Chinese species is provided. Female genital segments and other important morphological characters are illustrated in line drawings for the new species as well as for M. shanicus Bordoni, 2002 and M. gardneri (Cameron, 1945). Color plates with habitus photographs and a map showing the species’ geographical distribution are also provided. The type specimens of the new species are deposited in the Institute of Zoology, Chinese Academy of Sciences (IZ-CAS).

    Two new species of Xanthophius Motschulsky (Coleoptera: Staphylinidae, Staphylininae, Xantholinini) from China with notes on X. filum (Kraatz)

    Zhou, Yu-Lingzi, Zhou, Hong-Zhang (2013): Two new species of Xanthophius Motschulsky (Coleoptera: Staphylinidae, Staphylininae, Xantholinini) from China with notes on X. filum (Kraatz). Zootaxa 3626 (3): 363-380, DOI: 10.11646/zootaxa.3626.3.

    Topic Models to Infer Socio-Economic Maps

    Socio-economic maps contain important information regarding the population of a country. Computing these maps is critical given that policy makers often make important decisions based upon such information. However, the compilation of socio-economic maps requires extensive resources and becomes highly expensive. On the other hand, the ubiquitous presence of cell phones is generating large amounts of spatio-temporal data that can reveal human behavioral traits related to specific socio-economic characteristics. Traditional inference approaches have taken advantage of these datasets to infer regional socio-economic characteristics. In this paper, we propose a novel approach whereby topic models are used to infer socio-economic levels from large-scale spatio-temporal data. Instead of using a pre-determined set of features, we use Latent Dirichlet Allocation (LDA) to extract latent recurring patterns of co-occurring behaviors across regions, which are then used in the prediction of socio-economic levels. We show that our approach improves state-of-the-art prediction results by 9%.