1,561 research outputs found

    Using hybrid algorithmic-crowdsourcing methods for academic knowledge acquisition

    such as Figures, Tables, Definitions, Algorithms, etc., which are called Knowledge Cells hereafter. An advanced academic search engine that could take advantage of Knowledge Cells and their various relationships to obtain more accurate search results is expected. Further, it is expected to provide a fine-grained search over Knowledge Cells for deep-level information discovery and exploration. Therefore, it is important to identify and extract the Knowledge Cells and their various relationships, which are often intrinsic and implicit in articles. With the exponential growth of scientific publications, discovery and acquisition of such useful academic knowledge impose some practical challenges. For example, existing algorithmic methods can hardly extend to handle diverse layouts of journals, nor scale up to process massive numbers of documents. As crowdsourcing has become a powerful paradigm for large-scale problem-solving, especially for tasks that are difficult for computers but easy for humans, we consider the problem of academic knowledge discovery and acquisition as a crowdsourced database problem and show a hybrid framework to integrate the accuracy of crowdsourcing workers and the speed of automatic algorithms. In this paper, we introduce our current system implementation, a platform for academic knowledge discovery and acquisition (PANDA), as well as some interesting observations and promising future directions.
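One way to picture the hybrid idea in this abstract is a router that accepts automatic extractions when the algorithm is confident and queues the rest for crowd workers. The sketch below is only an illustration under that assumption: the abstract does not describe PANDA's actual mechanism, and `Candidate`, `route`, and the 0.8 threshold are hypothetical names and values.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A Knowledge Cell proposed by an automatic extractor (hypothetical type)."""
    kind: str          # e.g. "Figure", "Table", "Definition", "Algorithm"
    text: str          # extracted content or caption
    confidence: float  # assumed extractor score in [0, 1]


def route(candidates, threshold=0.8):
    """Split candidates into auto-accepted ones and ones sent to crowd workers.

    Assumption: a single confidence threshold decides the split.
    """
    accepted, crowd_queue = [], []
    for cand in candidates:
        if cand.confidence >= threshold:
            accepted.append(cand)      # fast path: trust the algorithm
        else:
            crowd_queue.append(cand)   # slow path: ask human workers to verify
    return accepted, crowd_queue


if __name__ == "__main__":
    cells = [
        Candidate("Figure", "Fig. 1: system overview", 0.95),
        Candidate("Table", "Tab. 2: results", 0.40),
    ]
    auto, crowd = route(cells)
    print(len(auto), "accepted automatically;", len(crowd), "sent to the crowd")
```

A threshold keeps the cheap algorithmic path for easy cases and spends human effort only where the extractor is unsure; the right value would have to be tuned on real data.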

    Theoretical Underpinnings and Practical Challenges of Crowdsourcing as a Mechanism for Academic Study

    Researchers in a variety of fields are increasingly adopting crowdsourcing as a reliable instrument for performing tasks that are complex for either humans or computer algorithms. As a result, new forms of collective intelligence have emerged from the study of massive crowd-machine interactions in scientific work settings, a field for which there is no known theory or model able to explain how it really works. This type of crowd work uses an open participation model that keeps the scientific activity (including datasets, methods, guidelines, and analysis results) widely available and mostly independent from institutions, which distinguishes crowd science from other crowd-assisted types of participation. In this paper, we build on the practical challenges of crowd-AI supported research and propose a conceptual framework for addressing the socio-technical aspects of crowd science from a CSCW viewpoint. Our study confirms a manifest lack of systematic and empirical research on the symbiotic relation of AI with human computation and crowd computing in scientific endeavors.

    Towards an epistemology of data journalism in the devolved nations of the United Kingdom: Changes and continuities in materiality, performativity and reflexivity

    This article outlines a general epistemological framework of data journalism in the devolved nations of the UK. By using an original model based on three conceptual lenses (materiality, performativity and reflexivity), this study examines the development of this form of journalism, the challenges it faces, and its particularities in the context of Scotland, Wales and Northern Ireland. This research therefore offers unique insights from semi-structured interviews with data journalists and data editors based at, or working as freelancers for, the mainstream news organisations of these regions. The results suggest that data journalism in these devolved nations displays a distinctive character just as much as it reinforces the norms and rituals of the legacy organisations that pioneered this practice. Whilst various models of data exploitation are tested, regional data journalists creatively circumvent generalised organisational struggles to lay the groundwork for their trade and professional community.

    Mobility Data Science (Dagstuhl Seminar 22021)

    This report documents the program and the outcomes of Dagstuhl Seminar 22021 "Mobility Data Science". This seminar was held January 9-14, 2022, with 47 participants from industry and academia. The goal of this Dagstuhl Seminar was to create a new research community of mobility data science in which the whole is greater than the sum of its parts by bringing together established leaders as well as promising young researchers from all fields related to mobility data science. Specifically, this report summarizes the main results of the seminar by (1) defining Mobility Data Science as a research domain, (2) sketching its agenda in the coming years, and (3) building a mobility data science community. (1) Mobility data science is defined as spatiotemporal data that additionally captures the behavior of moving entities (human, vehicle, animal, etc.). To understand, explain, and predict behavior, we note that a strong collaboration with research in behavioral and social sciences is needed. (2) Future research directions for mobility data science described in this report include a) mobility data acquisition and privacy, b) mobility data management and analysis, and c) applications of mobility data science. (3) We identify opportunities towards building a mobility data science community, towards collaborations between academia and industry, and towards a mobility data science curriculum.

    On the use of smartphones as novel photogrammetric water gauging instruments: Developing tools for crowdsourcing water levels

    The term global climate change has been omnipresent since the beginning of the last decade. Changes in the global climate are associated with an increase in heavy rainfalls that can cause nearly unpredictable flash floods. Consequently, spatio-temporally high-resolution monitoring of rivers becomes increasingly important. Water gauging stations continuously and precisely measure water levels. However, they are rather expensive to purchase and maintain and are preferably installed at water bodies relevant for water management. Small-scale catchments often remain ungauged. In order to increase the data density of hydrometric monitoring networks and thus to improve the prediction quality of flood events, new, flexible and cost-effective water level measurement technologies are required. They should be oriented towards the accuracy requirements of conventional measurement systems and facilitate the observation of water levels at virtually any time, even at the smallest rivers. A possible solution is the development of a photogrammetric smartphone application (app) for crowdsourcing water levels, which merely requires voluntary users to take pictures of a river section to determine the water level. Today's smartphones integrate high-resolution cameras, a variety of sensors, powerful processors, and mass storage. However, they are designed for the mass market and use low-cost hardware that cannot match the quality of geodetic measurement technology. In order to investigate the potential for mobile measurement applications, research was conducted on the smartphone as a photogrammetric measurement instrument as part of the doctoral project. The studies deal with the geometric stability of smartphone cameras regarding device-internal temperature changes and with the accuracy potential of rotation parameters measured with smartphone sensors.
The results show a high, temperature-related variability of the interior orientation parameters, which is why the camera should be calibrated at the time of measurement. The results of the sensor investigations show considerable inaccuracies when measuring rotation parameters, especially the compass angle (errors up to 90° were observed). The same applies to position parameters measured by global navigation satellite system (GNSS) receivers built into smartphones. According to the literature, positional accuracies of about 5 m are possible in the best conditions. Otherwise, errors of several tens of metres are to be expected. As a result, direct georeferencing of image measurements using current smartphone technology should be discouraged. In consideration of the results, the water gauging app Open Water Levels (OWL) was developed, whose methodological development and implementation constituted the core of the thesis project. OWL enables the flexible measurement of water levels via crowdsourcing without requiring additional equipment or being limited to specific river sections. Data acquisition and processing take place directly in the field, so that the water level information is immediately available. In practice, the user captures a short time-lapse sequence of a river bank with OWL, which is used to calculate a spatio-temporal texture that enables the detection of the water line. In order to translate the image measurement into 3D object space, a synthetic, photo-realistic image of the situation is created from existing 3D data of the river section to be investigated. Necessary approximations of the image orientation parameters are measured by smartphone sensors and GNSS.
The assignment of camera image and synthetic image allows for the determination of the interior and exterior orientation parameters by means of space resection and finally the transfer of the image-measured 2D water line into the 3D object space to derive the prevalent water level in the reference system of the 3D data. In comparison with conventionally measured water levels, OWL reveals an accuracy potential of 2 cm on average, provided that synthetic image and camera image exhibit consistent image contents and that the water line can be reliably detected. In the present dissertation, related geometric and radiometric problems are comprehensively discussed. Furthermore, possible solutions, based on advancing developments in smartphone technology and image processing as well as the increasing availability of 3D reference data, are presented in the synthesis of the work. The app Open Water Levels, which is currently available as a beta version and has been tested on selected devices, provides a basis, which, with continuous further development, aims to achieve a final release for crowdsourcing water levels towards the establishment of new and the expansion of existing monitoring networks.

    Born-reusable scientific knowledge: Concept, implementation, and applications

    The exponentially increasing growth of scientific literature publication presents a significant challenge to effectively read, process, and fully comprehend the wealth of scientific knowledge. The Open Research Knowledge Graph (ORKG) aims to address this challenge by providing infrastructure that aligns with the FAIR principles, to support the creation, curation, and utilization of scientific knowledge. Nevertheless, the current dependence on crowdsourcing and natural language processing (NLP) for post-publication knowledge extraction restricts the scalability and quality of such knowledge bases. In response to these challenges, we present a novel 'born-reusable' approach that seeks to create richly detailed, machine-reusable descriptions of papers directly within the computing environment where the research was conducted, thus placing the onus on authors to ensure their research findings are FAIR prior to publication. With the help of the ORKG R package, salient scientific knowledge is captured from the paper's associated R source code and serialized to a machine-reusable format (JSON-LD) for harvesting by the ORKG by DOI-lookup. By applying this approach to an unpublished soil science manuscript, we demonstrated how authors are best situated to describe their work in a richly detailed machine-reusable format. Furthermore, by applying this approach to two published agroecology papers, we demonstrated its relevance to post-publication, thus suggesting that papers which share source code and data sets could be made machine-reusable retrospectively. Finally, a proof-of-concept meta-analysis was conducted to demonstrate how this approach can help facilitate research synthesis by providing FAIR scientific data. We concluded that the 'born-reusable' approach has promising implications for the reusability of scientific knowledge. However, its broad adoption faces several challenges. Therefore, solutions were explored to improve the approach's interoperability with knowledge graphs, assist authors with its implementation into their workflows, and strengthen cooperation with publishers to provide the necessary infrastructure.
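To make the serialization step in this abstract more concrete, the sketch below writes a few analysis results as a JSON-LD document keyed by a DOI. It is written in Python rather than R and does not reproduce the ORKG R package or the ORKG data model: the `ex:` vocabulary, the `to_jsonld` helper, and the example DOI are all hypothetical placeholders.

```python
import json


def to_jsonld(doi, results):
    """Serialize a mapping of result name -> value as a JSON-LD string.

    Assumption: a made-up vocabulary under https://example.org/vocab#
    stands in for whatever terms a real knowledge graph would require.
    """
    doc = {
        "@context": {"ex": "https://example.org/vocab#"},
        "@id": f"https://doi.org/{doi}",   # the paper is identified by its DOI
        "@type": "ex:Paper",
        "ex:hasResult": [
            {"@type": "ex:Result", "ex:variable": name, "ex:value": value}
            for name, value in results.items()
        ],
    }
    return json.dumps(doc, indent=2)


if __name__ == "__main__":
    # Hypothetical DOI and measurement, for illustration only.
    print(to_jsonld("10.1234/example", {"soil_ph": 6.5}))
```

Producing the description in the same script that computes the results is what lets the values be captured directly from the analysis instead of being extracted from the published text afterwards.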

    Can neural networks do arithmetic? A survey on the elementary numerical skills of state-of-the-art deep learning models

    Creating learning models that can exhibit sophisticated reasoning skills is one of the greatest challenges in deep learning research, and mathematics is rapidly becoming one of the target domains for assessing scientific progress in this direction. In the past few years there has been an explosion of neural network architectures, data sets, and benchmarks specifically designed to tackle mathematical problems, reporting notable success in disparate fields such as automated theorem proving, numerical integration, and discovery of new conjectures or matrix multiplication algorithms. However, despite these impressive achievements it is still unclear whether deep learning models possess an elementary understanding of quantities and symbolic numbers. In this survey we critically examine the recent literature, concluding that even state-of-the-art architectures often fall short when probed with relatively simple tasks designed to test basic numerical and arithmetic knowledge

    Content Creation in the Digital Economy: A Comprehensive Exploration and Investigation of Work Environment and Content Creators' Behaviours

    With the emergence and rapid spread of digital technologies, the world is undergoing a profound transformation. The digital economy that has evolved as a result has fundamentally changed and impacted every aspect of society and business, and it will undoubtedly change and reshape employment and work from various perspectives as well. Flexibility and autonomy have always been the strong attraction that the digital economy provides to workers, but behind this hidden truth is the strict control of platforms and algorithms. This thesis seeks to further deepen the understanding of working in the digital economy through a series of studies ranging from the broad to the specific, especially on the work of a particular group of content creators. This thesis contains four studies. Study 1 is a review paper that attempts to clarify the distinction between different concepts from the digital economy on a macro level. Studies 2-4 turn the perspective to a particular group of workers in the digital economy, the content creators. Study 2 uses two quantitative studies to theorise the characteristics of working on content creative platforms by developing a typology of these platforms. The third study was a systematic review to explore the power imbalance between platform algorithms and creators in content creative platforms. The fourth study employs a quantitative study that explores the impact of the platform work environment on the creators' behaviour from an individual perspective. This series of studies makes important theoretical contributions to the field related to employment relations in the digital economy context, especially content creative platforms, from both macro and micro perspectives. In addition, this series of studies provides practical implications for content creators, platforms and policymakers

    The Challenges of Knowledge Combination in ML-based Crowdsourcing: The ODF Killer Shrimp Challenge using ML and Kaggle

    Organizations are increasingly using digital technologies, such as crowdsourcing platforms and machine learning, to tackle innovation challenges. These technologies often require the combination of heterogeneous technical and domain-specific knowledge from diverse actors to achieve the organization's innovation goals. While research has focused on knowledge combination for relatively simple tasks on crowdsourcing platforms and within ML-based innovation, we know little about how knowledge is combined in emerging innovation approaches incorporating ML and crowdsourcing to solve domain-specific innovation challenges. Thus, this paper investigates the following: What are the challenges to knowledge combination in domain-specific ML-based crowdsourcing? We conducted a case study of an environmental challenge, namely how to use ML to predict the spread of a marine invasive species, led by the Swedish consortium Ocean Data Factory Sweden using the crowdsourcing platform Kaggle. After discussing our results, we end the paper with recommendations on how to integrate crowdsourcing into domain-specific digital innovation processes.