    Characterizing Geo-located Tweets in Brazilian Megacities

    This work presents a framework for collecting, processing and mining geo-located tweets in order to extract meaningful and actionable knowledge in the context of smart cities. We collected and characterized more than 9M tweets from the two biggest cities in Brazil, Rio de Janeiro and São Paulo. We performed topic modeling using the Latent Dirichlet Allocation model to produce an unsupervised distribution of semantic topics over the stream of geo-located tweets as well as a distribution of words over those topics. We manually labeled and aggregated similar topics obtaining a total of 29 different topics across both cities. Results showed similarities in the majority of topics for both cities, reflecting similar interests and concerns among the population of Rio de Janeiro and São Paulo. Nevertheless, some specific topics are more predominant in one of the cities.
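
The topic-modeling step described above can be illustrated with a minimal sketch of Latent Dirichlet Allocation fitted by collapsed Gibbs sampling. This is not the authors' implementation: the toy token lists, the function name `lda_gibbs`, and the hyperparameter values (`alpha`, `beta`, iteration count) are all assumptions chosen for illustration. It returns the two distributions the abstract mentions: topics per document (`theta`) and words per topic (`phi`).

```python
import random


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to tokenized documents with collapsed Gibbs sampling.

    Returns (vocab, theta, phi): theta[d][k] is the topic distribution of
    document d, phi[k][v] is the word distribution of topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]              # counts per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # counts per (topic, word)
    topic_total = [0] * num_topics                            # tokens per topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    phi = [
        [(c + beta) / (topic_total[k] + vocab_size * beta) for c in topic_word[k]]
        for k in range(num_topics)
    ]
    return vocab, theta, phi


# Hypothetical, already-tokenized tweets (not data from the study).
toy_tweets = [
    ["traffic", "jam", "avenue", "bus"],
    ["bus", "late", "traffic", "subway"],
    ["beach", "sun", "weekend", "friends"],
    ["sun", "beach", "football", "weekend"],
]

vocab, theta, phi = lda_gibbs(toy_tweets, num_topics=2)

for k, row in enumerate(phi):
    top = sorted(range(len(vocab)), key=lambda v: row[v], reverse=True)[:3]
    print(f"topic {k}:", [vocab[v] for v in top])
```

In a setting like the one described, the highest-probability words of each topic would then be inspected by hand to assign a label and to merge similar topics; that manual step is not automated here.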

    Achieving Green and Healthy Homes and Communities in America

    In the fall of 2010, the National Coalition to End Childhood Lead Poisoning contracted with the National Academy to develop and execute an online dialogue that would examine ways to increase the health, safety, and energy efficiency of low- to moderate-income homes. Since 1999, the National Coalition had worked to improve low- to moderate-income housing through the support and execution of home interventions that addressed multiple issues within a home at one time, an approach that often did not align with traditional, single-issue housing assistance programs. By 2010, the National Coalition had taken on the leadership of the Green and Healthy Homes Initiative (GHHI), a public-private partnership focused on integrating funding streams to improve low- to middle-income homes across the country. With plans to expand the GHHI's operations, the National Coalition partnered with the National Academy to conduct the National Dialogue on Green and Healthy Homes, a collaborative online dialogue in which participants were asked to identify challenges to, and innovative practices for, improving the health, safety, and energy efficiency of low- to moderate-income homes. The Dialogue was live from November 4 to November 22, 2010, and collected 100 ideas and 362 comments from 320 registered users. Over the course of its two and a half weeks, the Dialogue received more than 2,500 visits from over 1,100 people in 48 states and territories. Key Findings: By reviewing the feedback received in the Dialogue, the Panel was able to make a number of recommendations on how the green and healthy homes community of practice could increase the health, safety, and energy efficiency of homes across the country.
These recommendations included: conduct an evaluation of current housing standards to determine if they meet the Nation's health, safety, and energy efficiency needs; develop a tiered performance standard for healthy, safe, and energy-efficient homes; group government funding streams to better align programs with the comprehensive intervention approach; develop a long-term funding strategy to support efforts after Recovery Act funding ends; and educate government decision-makers and the public on the importance of developing green and healthy homes and communities, and on the work that supports that development.

    Mapping the Current Landscape of Research Library Engagement with Emerging Technologies in Research and Learning: Final Report

    The generation, dissemination, and analysis of digital information is a significant driver, and consequence, of technological change. As data and information stewards in physical and virtual space, research libraries are thoroughly entangled in the challenges presented by the Fourth Industrial Revolution: a societal shift powered not by steam or electricity, but by data, and characterized by a fusion of the physical and digital worlds. Organizing, structuring, preserving, and providing access to growing volumes of the digital data generated and required by research and industry will become a critically important function. As partners with the community of researchers and scholars, research libraries are also recognizing and adapting to the consequences of technological change in the practices of scholarship and scholarly communication. Technologies that have emerged or become ubiquitous within the last decade have accelerated information production and have catalyzed profound changes in the ways scholars, students, and the general public create and engage with information. The production of an unprecedented volume and diversity of digital artifacts, the proliferation of machine learning (ML) technologies, and the emergence of data as the “world’s most valuable resource,” among other trends, present compelling opportunities for research libraries to contribute in new and significant ways to the research and learning enterprise. Librarians are all too familiar with predictions of the research library’s demise in an era when researchers have so much information at their fingertips. A growing body of evidence provides a resounding counterpoint: that the skills, experience, and values of librarians, and the persistence of libraries as an institution, will become more important than ever as researchers contend with the data deluge and the ephemerality and fragility of much digital content.
This report identifies strategic opportunities for research libraries to adopt and engage with emerging technologies, with a roughly five-year time horizon. It considers the ways in which research library values and professional expertise inform and shape this engagement, the ways library and library worker roles will be reconceptualized, and the implications of a range of technologies for how the library fulfills its mission. The report builds on a literature review covering the last five years of published scholarship, primarily North American information science literature, and on interviews with a dozen library field experts, completed in fall 2019. It begins with a discussion of four cross-cutting opportunities that permeate many or all aspects of research library services. Next, specific opportunities are identified in each of five core research library service areas: facilitating information discovery, stewarding the scholarly and cultural record, advancing digital scholarship, furthering student learning and success, and creating learning and collaboration spaces. Each section identifies key technologies shaping user behaviors and library services, and highlights exemplary initiatives. Underlying much of the discussion in this report is the idea that “digital transformation is increasingly about change management”: adoption of or engagement with emerging technologies must be part of a broader strategy for organizational change, for “moving emerging work from the periphery to the core,” and of a broader shift in conceptualizing the research library and its services. Above all, libraries are benefitting from the ways in which emerging technologies offer opportunities to center users and move from a centralized and often siloed service model to embedded, collaborative engagement with the research and learning enterprise.

    Benefits and Risks of Big Data Analytics in Fragile and Conflict Affected States

    Big Data is an umbrella term for the large amounts of digital data continually generated by the global population. The main sources are data exhaust (largely from the use of mobile phones), online information (e.g. social media), physical sensors (e.g. satellite imagery) and crowdsourced data (from citizens). Big Data for Development refers to sources of Big Data relevant to the policy and planning of development programmes. Such data has the following features: it is digitally generated, passively produced, automatically collected, geographically or temporally trackable, and continuously analysed. Big Data analytics refers to the process of turning raw data into actionable information. The biggest source of Big Data is data exhaust, much of which is held by the private sector. Donor interest in Big Data for Development has increased hugely in recent years. Big Data for Development has potential applications in numerous sectors, notably health, education, financial services and agriculture. It can be especially useful in fragile and conflict-affected states where, for multiple reasons (insecurity, lack of capacity, population movements, etc.), availability of traditional data (e.g. official statistics) tends to be very limited. The speed with which Big Data is generated can reduce the time lag between the start of a trend or development and the point at which governments and other authorities are able to respond; it can also reduce the knowledge gap about how people respond to these trends.

    Data-Based Feedback for Content Production in eHealth Services (Dataperusteinen palaute eTerveyspalveluiden sisällöntuotantoon)

    Web analytics has shown significant potential for continuously improving web-based services and applications. By analyzing interaction data collected from web applications, it is possible to study in detail how the applications are used. This study analyzes whether interaction data collected with the Piwik PRO web analytics platform using JavaScript tagging can provide sufficient detail about user behaviour and interaction in a modern single-page web application. Furthermore, the analysis asks whether the collected data can be refined in a way that helps the content managers of the web application to continuously improve the content and to spot dysfunctional content. The research is based on Omapolku, a Finnish public e-health service providing digital services for personalized healthcare. The analysis focuses on evaluating digital treatment pathways in Omapolku, which provide various types of information and utilities designed for the needs of specific patient groups. The evaluation is based on the graphical user interface of a treatment pathway view and analyzes a sample dataset consisting of actions performed by the users. The data is analyzed with general web analytics metrics and with statistical methods of web usage mining. The results show that the interaction data can provide the detail needed for evaluating general usage metrics and basic usage patterns. However, the data does not provide the information needed to identify most actions performed by the users, which makes it practically impossible to link the data to the front-end components of the user interface. As an outcome of this study, it is recommended that additional identifiers be added to the front-end components of the treatment pathway interface and that the JavaScript tagging script be modified to record the corresponding identifiers and the action context.
In addition, a novel prototype was designed as a solution to the identified challenges and to support the work of the content managers.
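
The recommendation above (record a component identifier and an action context with every tracked event) can be made concrete with a small sketch. Everything here is hypothetical: the event fields `action`, `component_id`, and `context`, the identifier values, and the function `summarize_events` are illustrative assumptions and do not reflect the Piwik PRO data model or the Omapolku front end. The sketch only shows why the identifiers matter: events without one cannot be attributed to any interface component.

```python
from collections import Counter

# Hypothetical interaction events; component_id is None when the tracking
# script recorded the action without identifying the front-end component.
events = [
    {"action": "click", "component_id": "pathway.task-list", "context": "open-task"},
    {"action": "click", "component_id": "pathway.task-list", "context": "close-task"},
    {"action": "click", "component_id": "pathway.info-card", "context": "expand"},
    {"action": "click", "component_id": None, "context": None},
]


def summarize_events(events):
    """Return (share of events linked to a component, event count per component)."""
    linked = [e for e in events if e.get("component_id")]
    per_component = Counter(e["component_id"] for e in linked)
    coverage = len(linked) / len(events) if events else 0.0
    return coverage, per_component


coverage, per_component = summarize_events(events)
print(f"linked to a component: {coverage:.0%}")
for component, count in per_component.most_common():
    print(component, count)
```

A content manager could use a per-component count of this kind to spot content that is rarely or never used, which is only possible for the share of events that carry an identifier.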

    Leveraging analytics to produce compelling and profitable film content

    Producing compelling film content profitably is a top priority for the long-term prosperity of the film industry. Advances in digital technologies, the increasing availability of granular big data, the rapid diffusion of analytic techniques, and intensified competition from user-generated content and from original content produced by Subscription Video on Demand (SVOD) platforms have created unparalleled needs and opportunities for film producers to leverage analytics in content production. Building on theories of value creation and film production, this article proposes a conceptual framework of key analytic techniques that film producers may use throughout the production process, such as script analytics, talent analytics, and audience analytics. The article further synthesizes state-of-the-art research on and applications of these analytics, discusses the prospects for leveraging analytics in film production, and suggests fruitful avenues for future research with important managerial implications.

    Harmed while anonymous: beyond the personal/non-personal distinction in data governance

    Data law and policy assume that harms to individuals can result only from personal data processing. Conversely, generation and use of non-personal data supposedly create new value while presenting no risk to individual interests or fundamental rights. Consequently, the law treats these two categories differently, constraining generation, use, and sharing of the former while incentivizing the latter. This article challenges this assumption. It proposes to divide data-related harms into two high-level categories: unwanted disclosure and detrimental use. It demonstrates how the personal/non-personal data distinction prevents unwanted disclosure but fails to capture, and unintentionally enables, detrimental use of data. As a remedy, the article proposes a new concept, data about humans, and illustrates how it could advance data law and policy.