1,815 research outputs found

    OSINT Research Studios: A Flexible Crowdsourcing Framework to Scale Up Open Source Intelligence Investigations

    Open Source Intelligence (OSINT) investigations, which rely entirely on publicly available data such as social media, play an increasingly important role in solving crimes and holding governments accountable. The growing volume of data and the complex nature of the tasks, however, mean there is a pressing need to scale and speed up OSINT investigations. Expert-led crowdsourcing approaches show promise but tend either to focus on narrow tasks or domains, or to require resource-intensive, long-term relationships between expert investigators and crowds. We address this gap by providing a flexible framework that enables investigators across domains to enlist crowdsourced support for the discovery and verification of OSINT. We use a design-based research (DBR) approach to develop OSINT Research Studios (ORS), a sociotechnical system in which novice crowds are trained to support professional investigators with complex OSINT investigations. Through our qualitative evaluation, we found that ORS facilitates ethical and effective OSINT investigations across multiple domains. We also discuss broader implications of expert-crowd collaboration and opportunities for future work. Comment: To be published in CSCW 202

    An investigation into the role of crowdsourcing in generating information for flood risk management

    Flooding is a major global hazard whose management relies on an accurate understanding of its risks. Crowdsourcing represents a major opportunity for supporting flood risk management, as members of the public are highly capable of producing useful flood information. This thesis explores a wide range of issues related to flood crowdsourcing using an interdisciplinary approach. Through an examination of 31 different projects, a flood crowdsourcing typology was developed. This identified five key types of flood crowdsourcing: i) Incident Reporting, ii) Media Engagement, iii) Collaborative Mapping, iv) Online Volunteering and v) Passive VGI. These represent a wide range of initiatives with radically different aims, objectives, datasets and relationships with volunteers. Online Volunteering was explored in greater detail using Tomnod as a case study. This is a micro-tasking platform in which volunteers analyse satellite imagery to support disaster response. Volunteer motivations for participating on Tomnod were found to be largely altruistic. Demographics of participants were significant, with retirement, disability or long-term health problems identified as major drivers for participation. Many participants emphasised that effective communication between volunteers and the site owner is strongly linked to their appreciation of the platform. In addition, feedback on the quality and impact of their contributions was found to be crucial in maintaining interest. Through an examination of their contributions, volunteers were found to be able to identify, with a higher degree of accuracy, many features in satellite imagery that supervised image classification struggled to detect. This was more pronounced in poorer-quality imagery, where image classification had a very low accuracy. However, supervised classification was found to be far more systematic and succeeded in identifying impacts in many regions that were missed by volunteers.
    The efficacy of using crowdsourcing for flood risk management was explored further through the iterative development of a Collaborative Mapping web-platform called Floodcrowd. Through interviews and focus groups, stakeholders from the public and private sectors expressed an interest in crowdsourcing as a tool for supporting flood risk management. The types of crowdsourced data that stakeholders are particularly interested in differ between organisations, but they typically include flood depths, photos, timeframes of events and historical background information. Through engagement activities, many citizens were found to be able and motivated to share such observations. Yet motivations were strongly affected by the level of attention their contributions received from authorities. This presents many opportunities as well as challenges for ensuring that the future of flood crowdsourcing improves flood risk management and does not damage stakeholder relationships with participants.

    ImageNet Large Scale Visual Recognition Challenge

    The ImageNet Large Scale Visual Recognition Challenge is a benchmark in object category classification and detection across hundreds of object categories and millions of images. The challenge has been run annually from 2010 to the present, attracting participation from more than fifty institutions. This paper describes the creation of this benchmark dataset and the advances in object recognition that have been possible as a result. We discuss the challenges of collecting large-scale ground-truth annotation, highlight key breakthroughs in categorical object recognition, provide a detailed analysis of the current state of the field of large-scale image classification and object detection, and compare state-of-the-art computer vision accuracy with human accuracy. We conclude with lessons learned in the five years of the challenge, and propose future directions and improvements. Comment: 43 pages, 16 figures. v3 includes additional comparisons with PASCAL VOC (per-category comparisons in Table 3, distribution of localization difficulty in Fig 16), a list of queries used for obtaining object detection images (Appendix C), and some additional references

    Neighborhood Watch 2.0: Private Surveillance and the Internet of Things

    The use of low-cost cameras and internet-connected sensors is sharply increasing among local law enforcement, businesses, and average Americans. While the motives behind adopting these devices may differ, this trend means more data about events on Earth is rapidly being collected and aggregated each day. Current and future products, such as drones and self-driving cars, contain cameras and other embedded sensors used by private individuals in public settings. To function, these devices must passively collect information about other individuals who have not given the express consent that is commonly required when one is actively using an online service, such as email or social media. Generally, courts do not recognize a right to privacy once a person enters public spaces. However, the impending convergence of privately owned sensors gathering information about the surrounding world creates a new frontier in which to consider private liberties, community engagement, and civic duties. This Article will analyze the legal and technological developments surrounding: (1) existing data sources used by local law enforcement; (2) corporate assistance with law enforcement investigations; and (3) the volunteering of personal data to make communities safer. After weighing the relative privacy interests, this Article will explain, under current laws, the utility of private data in making communities safer, while simultaneously advancing the goals of fiscal responsibility, government accountability, and community engagement.

    Technical Guidelines to Extract and Analyze VGI from Different Platforms

    An increasing number of Volunteered Geographic Information (VGI) and social media platforms have been continuously growing in size, providing massive amounts of georeferenced data in many forms, including textual information, photographs, and geoinformation. These georeferenced data have either been actively contributed (e.g., adding data to OpenStreetMap (OSM) or Mapillary) or collected in a more passive fashion by enabling geolocation whilst using an online platform (e.g., Twitter, Instagram, or Flickr). The benefit of scraping and streaming these data in stand-alone applications is evident; however, it is difficult for many users to script and scrape the diverse types of these data. On 14 June 2016, a pre-conference workshop was held at the AGILE 2016 conference in Helsinki, Finland, called “LINK-VGI: LINKing and analyzing VGI across different platforms”. The workshop provided an opportunity for interested researchers to share ideas and findings on cross-platform data contributions. One portion of the workshop was dedicated to a hands-on session, in which the basics of spatial data access through selected Application Programming Interfaces (APIs) and the extraction of summary statistics from the results were illustrated. This paper presents the content of the hands-on session, including the scripts and guidelines for extracting VGI data. Researchers, planners, and interested end-users can benefit from this paper for developing their own applications for any region of the world.
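The workshop's actual scripts are not reproduced in this listing. As an illustration of the kind of API-based VGI extraction the abstract describes, the sketch below queries the public Overpass API for OpenStreetMap nodes in a bounding box; the function names (`build_query`, `parse_elements`, `fetch`) and the default amenity tag are illustrative choices, not taken from the workshop materials.

```python
import json
import urllib.parse
import urllib.request

OVERPASS_URL = "https://overpass-api.de/api/interpreter"  # public Overpass endpoint

def build_query(bbox, amenity="drinking_water"):
    """Build an Overpass QL query for amenity nodes inside bbox = (south, west, north, east)."""
    south, west, north, east = bbox
    return (
        "[out:json][timeout:25];"
        f'node["amenity"="{amenity}"]({south},{west},{north},{east});'
        "out body;"
    )

def parse_elements(response):
    """Extract (id, lat, lon, name) tuples from a decoded Overpass JSON response."""
    rows = []
    for el in response.get("elements", []):
        if el.get("type") == "node":
            name = el.get("tags", {}).get("name", "")
            rows.append((el["id"], el["lat"], el["lon"], name))
    return rows

def fetch(query):
    """POST the query to the Overpass endpoint and decode the JSON reply."""
    data = urllib.parse.urlencode({"data": query}).encode()
    with urllib.request.urlopen(OVERPASS_URL, data=data) as resp:
        return json.loads(resp.read().decode())
```

The same pattern (build a request, call the platform's API, reduce the JSON to tabular rows) carries over to the other platforms mentioned, though each API has its own authentication and rate-limit rules.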

    A General Model for Noisy Labels in Machine Learning

    Machine learning is an ever-growing and increasingly pervasive presence in everyday life; we entrust these models, and systems built on these models, with some of our most sensitive information and security applications. However, for all of the trust that we place in these models, it is essential to recognize that such models are simply reflections of the data and labels on which they are trained. To wit, if the data and labels are suspect, then so too must be the models that we rely on—yet, as larger and more comprehensive datasets become standard in contemporary machine learning, it becomes increasingly difficult to obtain reliable, trustworthy label information. While recent work has begun to investigate mitigating the effect of noisy labels, to date this critical field has been disjointed and disconnected, despite the common goal. In this work, we propose a new model of label noise, which we call “labeler-dependent noise (LDN).” LDN extends and generalizes the canonical instance-dependent noise model to multiple labelers, and unifies every preceding modeling strategy under a single umbrella. Furthermore, studying the LDN model leads us to propose a more general, modular framework for noise-robust learning called “labeler-aware learning (LAL).” Our comprehensive suite of experiments demonstrates that, unlike previous methods that are unable to remain robust under the general LDN model, LAL retains its full learning capabilities under extreme, and even adversarial, conditions of label noise. We believe that LDN and LAL should mark a paradigm shift in how we learn from labeled data, so that we may both discover new insights about machine learning and develop more robust, trustworthy models on which to build our daily lives.

    Computer Vision for Multimedia Geolocation in Human Trafficking Investigation: A Systematic Literature Review

    The task of multimedia geolocation is becoming an increasingly essential component of the digital forensics toolkit for effectively combating human trafficking, child sexual exploitation, and other illegal acts. Typically, metadata-based geolocation information is stripped when multimedia content is shared via instant messaging and social media. The intricacy of geolocating, geotagging, or finding geographical clues in this content is often overly burdensome for investigators. Recent research has shown that contemporary advancements in artificial intelligence, specifically computer vision and deep learning, hold significant promise for expediting the multimedia geolocation task. This systematic literature review thoroughly examines the state of the art in leveraging computer vision techniques for multimedia geolocation and assesses their potential to expedite human trafficking investigations. It provides a comprehensive overview of the application of computer vision-based approaches to multimedia geolocation, identifies their applicability in combating human trafficking, and highlights the potential implications of enhanced multimedia geolocation for prosecuting human trafficking. In total, 123 articles inform this systematic literature review. The findings suggest numerous potential paths for future impactful research on the subject.