    Global disease monitoring and forecasting with Wikipedia

    Infectious disease is a leading threat to public health, economic stability, and other key social structures. Efforts to mitigate these impacts depend on accurate and timely monitoring to measure the risk and progress of disease. Traditional, biologically-focused monitoring techniques are accurate but costly and slow; in response, new techniques based on social internet data such as social media and search queries are emerging. These efforts are promising, but important challenges in the areas of scientific peer review, breadth of diseases and countries, and forecasting hamper their operational usefulness. We examine a freely available, open data source for this use: access logs from the online encyclopedia Wikipedia. Using linear models, language as a proxy for location, and a systematic yet simple article selection procedure, we tested 14 location-disease combinations and demonstrate that these data feasibly support an approach that overcomes these challenges. Specifically, our proof-of-concept yields models with $r^2$ up to 0.92, forecasting value up to the 28 days tested, and several pairs of models similar enough to suggest that transferring models from one location to another without re-training is feasible. Based on these preliminary results, we close with a research agenda designed to overcome these challenges and produce a disease monitoring and forecasting system that is significantly more effective, robust, and globally comprehensive than the current state of the art. Comment: 27 pages; 4 figures; 4 tables. Version 2: Cite McIver & Brownstein and adjust novelty claims accordingly; revise title; various revisions for clarit
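
    As a rough illustration of the kind of linear model the abstract describes, the sketch below fits ordinary least squares with a single predictor (article view counts) against case counts shifted by a forecast lag, and reports $r^2$. The function names, the single-predictor form, and the toy numbers are assumptions made for illustration; they are not the authors' model or data.

```python
def lagged_pairs(views, cases, lag):
    """Pair views at time t with cases at time t + lag (lag in time steps)."""
    return views[:len(views) - lag], cases[lag:]

def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def r_squared(x, y, slope, intercept):
    """Coefficient of determination of the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_res / ss_tot

# Synthetic toy series (not real data): cases follow views one step later.
views = [10, 20, 30, 40, 50, 60]
cases = [0, 21, 41, 61, 81, 101]

x, y = lagged_pairs(views, cases, lag=1)
slope, intercept = fit_line(x, y)
r2 = r_squared(x, y, slope, intercept)
```

    Repeating the fit over a range of lags would give one way to check how far ahead the view counts carry forecasting value.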

    Training of Crisis Mappers and Map Production from Multi-sensor Data: Vernazza Case Study (Cinque Terre National Park, Italy)

    The aim of this paper is to present the development of a multidisciplinary project carried out in cooperation between Politecnico di Torino and ITHACA (Information Technology for Humanitarian Assistance, Cooperation and Action). The goal of the project was to train students attending Architecture and Engineering courses in geospatial data acquisition and processing, in order to start up a team of "volunteer mappers". The project aims to document environmental and built heritage subject to disaster; the purpose is to improve the capabilities of the actors involved in activities connected with geospatial data collection, integration and sharing. The proposed area for testing the training activities is the Cinque Terre National Park, registered in the World Heritage List since 1997. The area was affected by a flood on 25 October 2011. In line with other international experiences, the group is expected to be active after emergencies in order to update maps, using data acquired by typical geomatic methods and techniques such as terrestrial and aerial Lidar, close-range and aerial photogrammetry, topographic and GNSS instruments etc.; or by non-conventional systems and instruments such as UAVs, mobile mapping etc. The ultimate goal is to implement a WebGIS platform to share all the data collected with local authorities and the Civil Protectio

    The Internet Ecosystem: The Potential for Discrimination

    Symposium: Rough Consensus and Running Code: Integrating Engineering Principles into Internet Policy Debates, held at the University of Pennsylvania's Center for Technology Innovation and Competition on May 6-7, 2010. This Article explores how the emerging Internet architecture of cloud computing, content distribution networks, private peering and data-center services can simultaneously foster a perception of unfair network access while at the same time enabling significant competition for services, content, and innovation. A key enabler of these changes is the emergence of technologies that lower the barrier for entry in developing and deploying new services. Another is the design of successful Internet applications, which already accommodate the variation in service afforded by the current Internet. Regulators should be aware of the potential for anti-competitive practices in this broader Internet Ecosystem, but should carefully consider the effects of regulation on that ecosystem

    Search engine bias: the structuration of traffic on the World-Wide Web

    Search engines are essential components of the World Wide Web; both commercially and in terms of everyday usage, their importance is hard to overstate. This thesis examines the question of why there is bias in search engine results – bias that invites users to click on links to large websites, commercial websites, websites based in certain countries, and websites written in certain languages. In this thesis, the historical development of the search engine industry is traced. Search engines first emerged as prototypical technological startups emanating from Silicon Valley, followed by the acquisition of search engine companies by major US media corporations and their development into portals. The subsequent development of pay-per-click advertising is central to the current industry structure, an oligarchy of virtually integrated companies managing networks of syndicated advertising and traffic distribution. The study also shows a global landscape in which search production is concentrated in and caters for large global advertising markets, leaving the rest of the world with patchy and uneven search results coverage. The analysis of interviews with senior search engine engineers indicates that issues of quality are addressed in terms of customer service and relevance in their discourse, while the analysis of documents, interviews with search marketers, and participant observation within a search engine marketing firm showed that producers and marketers had complex relationships that combine aspects of collaboration, competition, and indifference. The results of the study offer a basis for the synthesis of insights of the political economy of media and communication and the social studies of technology tradition, emphasising the importance of culture in constructing and maintaining both local structures and wider systems. 
    In the case of search engines, the evidence indicates that the culture of the technological entrepreneur is very effective in creating a new megabusiness, but less successful in encouraging a debate on issues of the public good or public responsibility as they relate to the search engine industry

    Cyber indicators of compromise: a domain ontology for security information and event management

    It has been said that cyber attackers are attacking at wire speed (very fast), while cyber defenders are defending at human speed (very slow). Researchers have been working to improve this asymmetry by automating a greater portion of what has traditionally been very labor-intensive work. This work is involved in both the monitoring of live system events (to detect attacks), and the review of historical system events (to investigate attacks). One technology that is helping to automate this work is Security Information and Event Management (SIEM). In short, SIEM technology works by aggregating log information, and then sifting through this information looking for event correlations that are highly indicative of attack activity. For example: Administrator successful local logon and (concurrently) Administrator successful remote logon. Such correlations are sometimes referred to as indicators of compromise (IOCs). Though IOCs for network-based data (i.e., packet headers and payload) are fairly mature (e.g., Snort's large rule-base), the field of end-device IOCs is still evolving and lacks any well-defined go-to standard accepted by all. This report addresses ontological issues pertaining to end-device IOCs development, including what they are, how they are defined, and what dominant early standards already exist.
    http://archive.org/details/cyberindicatorso1094553041
    Lieutenant, United States Navy
    Approved for public release; distribution is unlimited
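
    The correlation given as an example in this abstract (a local and a remote Administrator logon active at the same time) can be sketched as a simple interval-overlap check over aggregated logon records. The `LogonEvent` record, its fields, and the sample events are hypothetical and chosen only for illustration; they do not reflect any particular SIEM product or the ontology proposed in the report.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LogonEvent:
    """Hypothetical normalized logon record; times are seconds since some epoch."""
    user: str
    kind: str     # "local" or "remote"
    start: float  # session start
    end: float    # session end

def concurrent_local_remote(events, user="Administrator"):
    """Return (local, remote) session pairs for `user` that overlap in time."""
    local = [e for e in events if e.user == user and e.kind == "local"]
    remote = [e for e in events if e.user == user and e.kind == "remote"]
    # Two intervals overlap when each one starts before the other ends.
    return [(l, r) for l in local for r in remote
            if l.start < r.end and r.start < l.end]

# Illustrative sample events (made up for this sketch).
events = [
    LogonEvent("Administrator", "local", 100.0, 500.0),
    LogonEvent("Administrator", "remote", 300.0, 400.0),   # overlaps the local session
    LogonEvent("Administrator", "remote", 600.0, 700.0),   # after the local session ended
    LogonEvent("alice", "remote", 150.0, 250.0),           # different user, ignored
]

hits = concurrent_local_remote(events)
```

    Each pair in `hits` is one candidate indicator of compromise; a real deployment would also need to handle sessions with no recorded logoff and clock skew between log sources.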