2,178 research outputs found
Data mining as a tool for environmental scientists
Over recent years a huge library of data mining algorithms has been developed to tackle a variety of problems in fields such as medical imaging and network traffic analysis. Many of these techniques are far more flexible than more classical modelling approaches and could be usefully applied to data-rich environmental problems. Certain techniques such as Artificial Neural Networks, Clustering, Case-Based Reasoning and more recently Bayesian Decision Networks have found application in environmental modelling while other methods, for example classification and association rule extraction, have not yet been taken up on any wide scale. We propose that these and other data mining techniques could be usefully applied to difficult problems in the field. This paper introduces several data mining concepts and briefly discusses their application to environmental modelling, where data may be sparse, incomplete, or heterogeneous.
Spatial-Temporal Data Mining for Ocean Science: Data, Methodologies, and Opportunities
With the increasing amount of spatial-temporal (ST) ocean data, numerous
spatial-temporal data mining (STDM) studies have been conducted to address
various oceanic issues, e.g., climate forecasting and disaster warning.
Compared with typical ST data (e.g., traffic data), ST ocean data is more
complicated with some unique characteristics, e.g., diverse regionality and
high sparsity. These characteristics make it difficult to design and train STDM
models. Unfortunately, an overview of these studies is still missing, hindering
computer scientists from identifying research issues in the ocean domain while discouraging
researchers in ocean science from applying advanced STDM techniques. To remedy
this situation, we provide a comprehensive survey to summarize existing STDM
studies in ocean. Concretely, we first summarize the widely-used ST ocean
datasets and identify their unique characteristics. Then, typical ST ocean data
quality enhancement techniques are discussed. Next, we classify existing STDM
studies for ocean into four types of tasks, i.e., prediction, event detection,
pattern mining, and anomaly detection, and elaborate the techniques for these
tasks. Finally, promising research opportunities are highlighted. This survey
will help scientists from the fields of both computer science and ocean science
have a better understanding of the fundamental concepts, key techniques, and
open challenges of STDM in ocean science.
Enhancing Exploratory Analysis across Multiple Levels of Detail of Spatiotemporal Events
Crimes, forest fires, accidents, infectious diseases, and human interactions with mobile devices (e.g., tweets) are being logged as spatiotemporal events. For each event, its spatial location, time and related attributes are known with high levels of detail (LoDs). The LoD of analysis plays a crucial role in the user’s perception of phenomena. From one LoD to another, some patterns can be easily perceived or different patterns may be detected, thus requiring modeling phenomena at different LoDs, as there is no single LoD at which to study them.
Granular computing emerged as a paradigm of knowledge representation and processing, where granules are basic ingredients of information. These can be arranged in a hierarchy-like structure, allowing the same phenomenon to be perceived at different LoDs. This PhD thesis introduces a formal Theory of Granularities (ToG) so that granules can be defined over any domain and reasoned over. This approach is more general than those in the related literature, which appear as particular cases of the proposed ToG. Based on this theory, we propose a granular computing approach to model spatiotemporal phenomena at multiple LoDs, which we call a granularities-based model.
This approach stands out from the related literature because it models a phenomenon
through statements rather than just using granules to model abstract real-world entities.
Furthermore, it formalizes the concept of LoD and follows an automated approach to
generalize a phenomenon from one LoD to a coarser one.
Present-day practices work on a single LoD driven by the users despite the fact that
the identification of the suitable LoDs is a key issue for them. This PhD Thesis presents a framework for SUmmarizIng spatioTemporal Events (SUITE) across multiple LoDs. The SUITE framework makes no assumptions about the phenomenon and the analytical task.
A Visual Analytics approach implementing the SUITE framework is presented, which
allows users to inspect a phenomenon across multiple LoDs simultaneously, thus helping them understand at which LoDs the perception of the phenomenon differs or at which LoDs patterns emerge.
Self-organizing map algorithm for assessing spatial and temporal patterns of pollutants in environmental compartments: A review
The evaluation of the spatial and temporal distribution of pollutants is a crucial issue to assess the anthropogenic burden on the environment. Numerous chemometric approaches are available for data exploration and they have been
applied for environmental health assessment purposes. Among the unsupervised methods, Self-Organizing Map
(SOM) is an artificial neural network able to handle non-linear problems that can be used for exploratory data analysis,
pattern recognition, and variable relationship assessment. Interpretability improves considerably when the SOM-based model is combined with clustering algorithms. This review comprises: (i) a description of the algorithm operation
principle with a focus on the key parameters used for the SOM initialization; (ii) a description of the SOM output features and how they can be used for data mining; (iii) a list of available software tools for performing calculations; (iv)
an overview of the SOM application for obtaining spatial and temporal pollution patterns in the environmental compartments with focus on model training and result visualization; (v) advice on reporting SOM model details in a paper.
Integrating Data Science and Earth Science
This open access book presents the results of three years of collaboration between earth scientists and data scientists in developing and applying data science methods for scientific discovery. The book will be highly beneficial for other researchers at senior and graduate level who are interested in applying visual data exploration, computational approaches and scientific workflows.
Geospatial big data and cartography: research challenges and opportunities for making maps that matter
Geospatial big data present a new set of challenges and opportunities for cartographic researchers in technical, methodological, and artistic realms. New computational and technical paradigms for cartography are accompanying the rise of geospatial big data. Additionally, the art and science of cartography needs to focus its contemporary efforts on work that connects to outside disciplines and is grounded in problems that are important to humankind and its sustainability. Following the development of position papers and a collaborative workshop to craft consensus around key topics, this article presents a new cartographic research agenda focused on making maps that matter using geospatial big data. This agenda provides both long-term challenges that require significant attention as well as short-term opportunities that we believe could be addressed in more concentrated studies.