
    Development of a national-scale real-time Twitter data mining pipeline for social geodata on the potential impacts of flooding on communities

    Social media, particularly Twitter, is increasingly used to improve resilience during extreme weather events and emergency management situations, including floods: by communicating potential risks and their impacts, and by informing agencies and responders. In this paper, we developed a prototype national-scale Twitter data mining pipeline for improved stakeholder situational awareness during flooding events across Great Britain, by retrieving relevant social geodata grounded in environmental data sources (flood warnings and river levels). With potential users we identified and addressed three research questions to develop this application, whose components constitute a modular architecture for real-time dashboards: first, polling national flood warning and river level Web data sources to obtain at-risk locations; secondly, real-time retrieval of geotagged tweets proximate to at-risk areas; thirdly, filtering flood-relevant tweets with natural language processing and machine learning libraries, using word embeddings of tweets. We demonstrated the national-scale social geodata pipeline using over 420,000 georeferenced tweets obtained between 20 and 29 June 2016.
    Highlights
    • Prototype real-time social geodata pipeline for flood events and demonstration dataset
    • National-scale flood warnings/river levels set 'at-risk areas' in Twitter API queries
    • Monitoring multiple locations (without keywords) retrieved current, geotagged tweets
    • Novel application of word embeddings in flooding context identified relevant tweets
    • Pipeline extracts tweets to visualise using open-source libraries (SciKit Learn/Gensim)
    Keywords: Flood management; Twitter; volunteered geographic information; natural language processing; word embeddings; social geodata.
    Hardware required: Intel i3 or mid-performance PC with multicore processor and SSD main drive; 8 GB memory recommended.
    Software required: Python and library dependencies specified in Appendix A1.2.1, (viii) environment.yml.
    Software availability: All source code can be found at GitHub public repositories
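
    As a minimal sketch of the third pipeline stage described above (filtering flood-relevant tweets via word embeddings with Gensim and scikit-learn), the snippet below averages word vectors per tweet and trains a simple classifier. The toy tweets, labels, and model settings are assumptions for illustration only, not the authors' released code.

        # Illustrative sketch: tweet relevance filtering with averaged word embeddings.
        # Training data and hyperparameters here are invented for demonstration.
        import numpy as np
        from gensim.models import Word2Vec
        from sklearn.linear_model import LogisticRegression

        # Toy labelled tweets: 1 = flood-relevant, 0 = not relevant (assumed examples).
        tweets = [
            ("river levels rising fast near the bridge, road flooded", 1),
            ("flood warning issued for the town centre this evening", 1),
            ("lovely sunny afternoon in the park with the dog", 0),
            ("great match last night, what a goal", 0),
        ]
        tokenised = [text.split() for text, _ in tweets]
        labels = np.array([label for _, label in tweets])

        # Train small word embeddings on the toy corpus (in practice, embeddings
        # trained on a much larger Twitter corpus would be used).
        w2v = Word2Vec(sentences=tokenised, vector_size=50, min_count=1, epochs=50, seed=42)

        def tweet_vector(tokens, model):
            """Average the word vectors of a tweet's tokens (zeros if none are known)."""
            vecs = [model.wv[t] for t in tokens if t in model.wv]
            return np.mean(vecs, axis=0) if vecs else np.zeros(model.vector_size)

        X = np.vstack([tweet_vector(t, w2v) for t in tokenised])

        # Simple classifier over the averaged embeddings.
        clf = LogisticRegression().fit(X, labels)

        new_tweet = "water over the road and still raining hard".split()
        print(clf.predict([tweet_vector(new_tweet, w2v)]))  # 1 means flagged as flood-relevant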

    General pilot model and use case definition

    This report describes the concepts and elements of the General Model of E-ARK pilot site activities

    Group-privacy threats for geodata in the humanitarian context

    The role of geodata technologies in humanitarian action is arguably indispensable in determining when and where aid is needed, and who needs it, before, during, and after a disaster. However, despite the advantages of using geodata technologies in humanitarianism (i.e., fast and efficient aid distribution), several ethical challenges arise, including privacy. The focus has been on individual privacy; in this article, however, we focus on group privacy, a debate that has recently gained attention. We approach privacy through the lens of informational harms that undermine the autonomy of groups and the control of knowledge about them. Using demographically identifiable information (DII) as a definition for groups, we first assess how such information is derived from the geodata types used in humanitarian disaster risk reduction and management (DRRM). Second, we discuss four informational-harm threat models: (i) biases from missing or underrepresented categories; (ii) the mosaic effect, i.e., unintentional discovery of sensitive knowledge by combining disparate datasets; (iii) misuse of data (whether it is shared or not); and (iv) cost–benefit analysis (the cost of protection versus the risk of misuse). Lastly, borrowing from triage in emergency medicine, we propose a geodata triage process as a possible method for practitioners to identify, prioritize, and mitigate these four group-privacy harms
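
    To make the mosaic effect mentioned above concrete, the pandas sketch below joins two individually innocuous, entirely invented tables on a shared grid cell and thereby surfaces demographically identifiable information about a group; the dataset, columns, and thresholds are hypothetical illustrations, not material from the article.

        # Invented example: neither table alone names a vulnerable group, but
        # combining them reveals heavily damaged cells dominated by one group.
        import pandas as pd

        # Published flood-damage assessment per map grid cell (hypothetical).
        damage = pd.DataFrame({
            "grid_cell": ["A1", "A2", "B1"],
            "damaged_buildings": [120, 15, 60],
        })

        # Separately shared household-survey aggregate per grid cell (hypothetical).
        survey = pd.DataFrame({
            "grid_cell": ["A1", "A2", "B1"],
            "minority_share_pct": [82, 5, 40],
        })

        # The combination constitutes demographically identifiable information (DII).
        combined = damage.merge(survey, on="grid_cell")
        at_risk_group_cells = combined[
            (combined["damaged_buildings"] > 100) & (combined["minority_share_pct"] > 50)
        ]
        print(at_risk_group_cells)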

    Innovative approaches to urban data management using emerging technologies

    Many characteristics of smart cities rely on a sufficient quantity and quality of urban data. Local industry and developers can use this data for application development that improves the lives of all citizens. Therefore, the handling and usability of this data is a big challenge for smart cities. In this paper we investigate new approaches to urban data management using emerging technologies and give an insight into further research conducted within the EC-funded smarticipate project.
    Geospatial data cannot be handled well in classical relational database environments. Either it is stored as binary large objects or it has to be broken down into elementary types that the database can handle, in many cases resulting in a slow system, since classical relational databases are optimized for online transaction processing rather than analytic processing and are not tuned for delivering mass data. Document-based databases provide better performance, but still struggle with large binary objects. In addition, the heterogeneity of the data requires a lot of mapping and data cleansing, and in some cases replication cannot be avoided. Another approach is to use Semantic Web technologies to enrich the data and build up relations and connections between entities. However, data formats such as RDF follow a different approach and are not well suited to geospatial data, which limits usability.
    Search engines are a good example of web applications with high usability: users must be able to find the right data and get information about related or close matches, which allows information retrieval in an easy-to-use fashion. The same principles should be applied to geospatial data, which would improve the usability of open data. Combined with data mining and big data technologies, these principles would improve the usability of open geospatial data and even lead to new ways of using it. By helping with the interpretation of data in a certain context, data is transformed into useful information.
    In this paper we analyse key features of open geodata portals, such as linked data and machine learning, in order to show ways of improving the user experience. Based on the smarticipate project, we then show open data and geodata online and their practical application. We also give an outlook on piloting cases in which we want to evaluate how the technologies presented in this paper can be combined into a useful open data portal. In contrast to the previous EC-funded project urbanapi, where participative processes in smart cities were created with urban data, we go one step further with Semantic Web and open data. Thereby we achieve a more general approach to open data portals for spatial data and to improving their usability.
    The envisioned architecture of the smarticipate project relies on file-based storage and a no-copy strategy, which means that data is mostly kept in its original format; a conversion to another format is only done if necessary (e.g. the current format has limitations on domain-specific attributes or the user requests a specific format). A strictly functional approach and architecture is envisioned, which allows massively parallel execution and is therefore well suited to deployment in a cloud environment. The search interface uses a domain-specific vocabulary which can be customised for special purposes or for users, taking their context and expertise into account, and which should abstract from technology-specific peculiarities.
    Application programmers will also benefit from this architecture, as linked data principles will be followed extensively. For example, the JSON and JSON-LD standards will be used, so that web developers can use results from the data store directly without the need for conversion. Links to further information will also be provided within the data, so that a drill-down for more details is possible. The remainder of this paper is structured as follows. After the introduction about open data and data in general, we look at related work and existing open data portals. This leads to the main chapter about the key technology aspects for an easy-to-use open data portal. This is followed by Chapter 5, an introduction to the EC-funded project smarticipate, in which the key technology aspects of Chapter 4 will be included
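
    As a hedged illustration of the JSON-LD access pattern described above, the snippet below shows how a client could consume a result document directly and follow an embedded link for drill-down. The endpoint, vocabulary, and field names are invented for this sketch and are not the smarticipate API.

        # Hypothetical JSON-LD result as a web developer might receive it from an
        # open geodata portal; the context, identifiers, and links are invented.
        import json

        result_document = json.loads("""
        {
          "@context": {
            "name": "http://schema.org/name",
            "geo": "http://schema.org/geo",
            "details": {"@id": "http://schema.org/url", "@type": "@id"}
          },
          "@id": "https://portal.example.org/datasets/tree-cadastre",
          "name": "Tree cadastre",
          "geo": {"latitude": 50.11, "longitude": 8.68},
          "details": "https://portal.example.org/datasets/tree-cadastre/metadata"
        }
        """)

        # The JSON body is usable directly (no conversion step), and the embedded
        # link enables drill-down to further details when needed.
        print(result_document["name"], result_document["geo"])
        print("Drill down via:", result_document["details"])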

    Achieving Sustainability Through Geodata: An Empirical Study of Challenges and Barriers

    Master's thesis in Information Systems (IS501). Research within data management is often based on the elements of the data lifecycle. Organizations and businesses are also becoming more interested in data lifecycle management to leverage their data streams, compounded by an interest in geographical attributes within the data, referred to as geodata. Geodata provides a richer basis for analysis and is increasingly important within urban planning. Furthermore, the pressure to achieve sustainability goals calls for improving the data lifecycle. The challenge remains as to what can be improved within the data lifecycle, with geodata as an important input, to achieve sustainability dimensions. Our main contribution through this study is shedding light on challenges with geodata from an Information Systems (IS) and sustainability perspective. Additionally, the identified challenges also feed back into data management research and the data lifecycle

    E‐ARK Dissemination Information Package (DIP) Final Specification

    The primary aim of this report is to present the final version of the E-ARK Dissemination Information Package (DIP) formats. The secondary aim is to describe the access scenarios in which these DIP formats will be rendered for use

    Opportunities and challenges of geospatial analysis for promoting urban livability in the era of big data and machine learning

    Urban systems involve a multitude of closely intertwined components, which are more measurable than before due to new sensors, data collection, and spatio-temporal analysis methods. Turning these data into knowledge to facilitate planning efforts in addressing current challenges of urban complex systems requires advanced interdisciplinary analysis methods, such as urban informatics or urban data science. Yet, by applying a purely data-driven approach, it is too easy to get lost in the 'forest' of data, and to miss the 'trees' of successful, livable cities that are the ultimate aim of urban planning. This paper assesses how geospatial data and urban analysis, using a mixed-methods approach, can help to better understand urban dynamics and human behavior, and how they can assist planning efforts to improve livability. Based on a review of state-of-the-art research, the paper goes one step further and also addresses the potential as well as the limitations of new data sources in urban analytics, to give a better overview of the whole 'forest' of these new data sources and analysis methods. The main discussion revolves around the reliability of using big data from social media platforms or sensors, and how information can be extracted from massive amounts of data through novel analysis methods, such as machine learning, for better-informed decision making aimed at improving urban livability

    Big Data Management for Cloud-Enabled Geological Information Services
