Spatio-Temporal Multiway Data Decomposition Using Principal Tensor Analysis on k-Modes: The R Package PTAk
The purpose of this paper is to describe the R package PTAk and how the spatio-temporal context can be taken into account in the analyses. Essentially, PTAk() is a multiway, multidimensional method to decompose a multi-entry data array, seen mathematically as a tensor of any order. This PTAk-modes method proposes a way of generalizing the SVD (singular value decomposition), as well as some other well-known methods included in the R package, such as PARAFAC or CANDECOMP and the PCAn-modes or Tucker-n model. The example datasets cover different domains with various spatio-temporal characteristics and issues: (i) medical imaging in neuropsychology with a functional MRI (magnetic resonance imaging) study, (ii) pharmaceutical research with a pharmacodynamic study with EEG (electroencephalographic) data for a central nervous system (CNS) drug, and (iii) geographical information systems (GIS) with a climatic dataset that characterizes arid and semi-arid variations. All the methods implemented in the R package PTAk also support non-identity metrics, as well as penalizations during the optimization process. As a result of these flexibilities, together with pre-processing facilities, PTAk constitutes a framework for devising extensions of multidimensional methods such as correspondence analysis, discriminant analysis, and multidimensional scaling, also enabling spatio-temporal constraints.
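To make the generalization concrete: the decompositions named above can be written for an order-3 tensor as follows (a schematic summary using the outer product, not the package's exact notation).

```latex
% SVD of a matrix (order-2 tensor) X:
\[ X = \sum_{r=1}^{R} \sigma_r \, u_r \otimes v_r \]
% PARAFAC/CANDECOMP model for an order-3 tensor T:
\[ T \approx \sum_{r=1}^{R} \sigma_r \, u_r \otimes v_r \otimes w_r \]
% Tucker-3 model, with a full core array (g_{pqr}) in place of a
% diagonal of singular values:
\[ T \approx \sum_{p=1}^{P} \sum_{q=1}^{Q} \sum_{r=1}^{R} g_{pqr} \, u_p \otimes v_q \otimes w_r \]
```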
A flexible framework for assessing the quality of crowdsourced data
Papers, presentations, and posters delivered at the 17th AGILE Conference on Geographic Information Science
"Connecting a Digital Europe through Location and Place", celebrado en la Universitat Jaume I del 3 al 6 de junio de 2014.Crowdsourcing as a means of data collection has produced previously unavailable data assets and enriched existing ones, but its quality can be highly variable. This presents several challenges to potential end users that are concerned with the validation and quality assurance of the data collected. Being able to quantify the uncertainty, define and measure the different quality elements associated with crowdsourced data, and introduce means for dynamically assessing and improving it is the focus of this paper. We argue that the required quality assurance and quality control is dependent on the studied domain, the style of crowdsourcing and the goals of the study. We describe a framework for qualifying geolocated data collected from non-authoritative sources that enables assessment for specific case studies by creating a workflow supported by an ontological description of a range of choices. The top levels of this ontology describe seven pillars of quality checks and assessments that present a range of techniques to qualify, improve or reject data. Our generic operational framework allows for extension of this ontology to specific applied domains. This will facilitate quality assurance in real-time or for post-processing to validate data and produce quality metadata. It enables a system that dynamically optimises the usability value of the data captured. A case study illustrates this framework
Local and global spatio-temporal entropy indices based on distance-ratios and co-occurrences distributions
When it comes to characterizing the distribution of "things" observed spatially and identified by their geometries and attributes, the Shannon entropy has been widely used in different domains such as ecology, regional science, epidemiology, and image analysis. In particular, recent research has taken into account the spatial patterns derived from topological and metric properties in order to propose extensions to the measure of entropy. Based on two different approaches, using either distance-ratios or co-occurrences of observed classes, the research developed in this paper introduces several new indices, explores their extensions to the spatio-temporal domain, and investigates further their application as global and local indices. Using a multiplicative space-time integration approach at either a macro- or micro-level, this leads to a series of spatio-temporal entropy indices, including ones that combine the co-occurrence and distance-ratios approaches. The framework developed is complementary to the spatio-temporal clustering problem, introducing a more spatial and spatio-temporal structuring perspective through several indices that characterize the distribution of several class instances in space and time. The whole approach is first illustrated on simulated data showing the evolution of three classes over seven time stamps. Preliminary results are discussed for a study of conflicting maritime activities in the Bay of Brest, where the objective is to explore the spatio-temporal patterns exhibited by a categorical variable with six classes, each representing a conflict between two maritime activities.
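As a minimal sketch of the co-occurrence flavour of such indices (illustrative only; the paper's actual indices are richer and also cover distance-ratios and the temporal dimension), one can take the Shannon entropy of the distribution of class pairs observed within a distance threshold:

```python
import itertools
import math

def shannon(probs):
    """Shannon entropy of a probability distribution (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def cooccurrence_entropy(points, d):
    """points: list of (x, y, class_label); d: distance threshold.

    Counts unordered class pairs among point pairs lying within
    distance d of each other, then returns the entropy of that
    pair distribution.
    """
    counts = {}
    for (x1, y1, c1), (x2, y2, c2) in itertools.combinations(points, 2):
        if math.hypot(x1 - x2, y1 - y2) <= d:
            pair = tuple(sorted((c1, c2)))  # unordered class pair
            counts[pair] = counts.get(pair, 0) + 1
    total = sum(counts.values())
    return shannon(n / total for n in counts.values()) if total else 0.0

pts = [(0, 0, "A"), (1, 0, "B"), (0, 1, "A"), (5, 5, "C")]
print(cooccurrence_entropy(pts, d=2.0))  # ~0.64 for pairs {(A,B): 2, (A,A): 1}
```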
Creative collaboration in citizen science and the evolution of ThinkCamps
This chapter discusses how to harness the potential of creative collaboration through ThinkCamp events: "unconference"-style events with an open and creative environment designed to foster co-creation, co-design, and collaborative thinking at key points in the citizen science research cycle. It draws on the authors' experiences of running (and participating in) creative collaborative events and explores their potential to support inclusive, co-creational approaches to citizen science. Finally, it makes specific recommendations for project initiators, event organisers, and policymakers.
Earth observation for citizen science validation, or citizen science for earth observation validation? The role of quality assurance of volunteered observations
Environmental policy involving citizen science (CS) is of growing interest. In support of this open stream of information, validating or assessing the quality of geolocated CS data for appropriate use in evidence-based policy making requires a flexible, easily adaptable, and transparent data curation process. Addressing these needs, this paper describes an approach for automatic quality assurance as proposed by the Citizen OBservatory WEB (COBWEB) FP7 project. This approach is based upon a workflow composition that combines different quality controls, each belonging to one of seven categories or "pillars". Each pillar focuses on a specific dimension in the types of reasoning algorithms for CS data qualification. These pillars attribute values to a range of quality elements belonging to three complementary quality models. Additional data from various sources, such as Earth Observation (EO) data, are often included as inputs to the quality controls within the pillars. However, qualified CS data can also contribute to the validation of EO data; the two validation questions can therefore be considered "two sides of the same coin". Based on an invasive-species CS study concerning Fallopia japonica (Japanese knotweed), the paper discusses the flexibility and usefulness of qualifying CS data, either when using an EO data product for validation within the quality assurance process, or when validating an EO data product that describes the risk of occurrence of the plant. Both validation paths are found to be improved by quality assurance of the CS data. Addressing the reliability of CS open data, issues and limitations of the role of quality assurance for validation, due to the quality of secondary data used within the automatic workflow (e.g., error propagation), are described, paving the route to improvements in the approach.
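The "two sides of the same coin" point can be sketched as follows (a hypothetical toy example; the raster, points, and threshold rule are invented and do not reflect COBWEB's actual quality controls):

```python
import numpy as np

# An EO-derived risk-of-occurrence raster and quality-assured citizen-science
# (CS) points can each be used to validate the other.
risk = np.array([[0.9, 0.1],
                 [0.2, 0.8]])          # EO product: P(occurrence) per cell

# Quality-assured CS observations: (row, col, observed_presence)
cs_points = [(0, 0, True), (0, 1, False), (1, 1, True)]

# Side 1: use the EO product inside CS quality assurance, e.g. flag
# observations that contradict the modelled risk.
for r, c, present in cs_points:
    plausible = (risk[r, c] >= 0.5) == present
    print(f"cell ({r},{c}): observation {'plausible' if plausible else 'suspect'}")

# Side 2: use the qualified CS data to validate the EO product, e.g. the
# agreement rate between thresholded risk and observed presence.
agreement = np.mean([(risk[r, c] >= 0.5) == present for r, c, present in cs_points])
print(f"EO product agreement with CS data: {agreement:.2f}")
```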
Citizen OBservatory WEB (COBWEB): A Generic Infrastructure Platform to Facilitate the Collection of Citizen Science Data for Environmental Monitoring
COBWEB has used the UNESCO World Network of Biosphere Reserves as a testbed for researching and developing a generic crowdsourcing infrastructure platform for environmental monitoring. A major challenge is dealing with what is necessarily a complex problem requiring sophisticated solutions, balanced with the need to present sometimes unsophisticated users with comprehensible and usable software. The components of the COBWEB platform are at different Technology Readiness Levels. This short paper outlines the overall solution and points to quality assurance, standardisation, and semantic interoperability as key areas requiring further attention.
Citizen OBservatory WEB (COBWEB): A Generic Infrastructure Platform to Facilitate the Collection of Citizen Science data for Environmental Monitoring
The mass uptake of internet-connected, GPS-enabled mobile devices has resulted in a surge of citizens actively making a huge variety of environmental observations. The use and reuse potential of these data is significant but currently compromised by a lack of interoperability. Usable standards either don't exist, are neglected or poorly understood, or tooling is unavailable. Large volumes of data are being created but exist in silos. This is a complex problem requiring sophisticated solutions, balanced with the need to present sometimes unsophisticated users with comprehensible and usable software. COBWEB has addressed this challenge by using the UNESCO World Network of Biosphere Reserves as a testbed for researching and developing a generic crowdsourcing infrastructure platform for environmental monitoring. The solution arrived at provides tools for the creation of mobile applications that generate data compliant with open interoperability standards and facilitate integration with Spatial Data Infrastructures. COBWEB is a research project, and the components of the COBWEB platform are at different Technology Readiness Levels. This paper outlines how the overall solution was arrived at, describes the main components developed, and points to quality assurance, integration of sensors, interoperability, and associated standardisation as key areas requiring further attention.
On Integrating Size and Shape Distributions into a Spatio-Temporal Information Entropy Framework
Understanding the structuration of spatio-temporal information is a common endeavour in many disciplines and application domains, e.g., geography, ecology, urban planning, and epidemiology. Revealing the processes involved, in relation to one or more phenomena, is often the first step before elaborating spatial functioning theories and specific planning actions, e.g., epidemiological modelling, urban planning. To do so, the spatio-temporal distributions of variables that are meaningful from a decision-making viewpoint can be explored and analysed, separately or jointly, from an information viewpoint. Metrics based on the measure of entropy have long been used in these domains to quantify how uniform such distributions are. However, the spatio-temporal dimension is often only minimally embedded in the metrics used. This paper borrows the concept of patch size distribution from landscape ecology and the approach of permutation entropy used in biomedical signal processing to derive a spatio-temporal entropy analysis framework for categorical variables. The framework is based on a spatio-temporal structuration of the information, allowing the use of a decomposition of the Shannon entropy that can also embrace some existing spatial or temporal entropy indices to reinforce the spatio-temporal structuration. Multiway correspondence analysis is coupled with this entropy decomposition to propose further decomposition and entropy quantification of the spatio-temporal structuring information. The flexibility afforded by these different choices, including geographic scales, allows a range of domains to take the specifics of their data into account; some of these choices are explored on a dataset linked to climate change and the evolution of land cover types in Nordic areas.
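The kind of entropy decomposition such a framework relies on can be illustrated with the standard chain rule of Shannon entropy (a schematic illustration, not the paper's exact notation): for a categorical variable C observed over spatial units S and time periods T,

```latex
% Chain rule: total entropy splits into a spatial term, a
% temporal-within-spatial term, and a class-within-space-time term.
\[ H(C, S, T) = H(S) + H(T \mid S) + H(C \mid S, T) \]
% where the conditional term averages the within-cell class entropies:
\[ H(C \mid S, T) = \sum_{s,t} p(s, t) \, H(C \mid S = s, T = t) \]
```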