2,178 research outputs found

    Data mining as a tool for environmental scientists

    Over recent years a huge library of data mining algorithms has been developed to tackle a variety of problems in fields such as medical imaging and network traffic analysis. Many of these techniques are far more flexible than classical modelling approaches and could be usefully applied to data-rich environmental problems. Certain techniques such as Artificial Neural Networks, Clustering, Case-Based Reasoning and, more recently, Bayesian Decision Networks have found application in environmental modelling, while other methods, for example classification and association rule extraction, have not yet been taken up on any wide scale. We propose that these and other data mining techniques could be usefully applied to difficult problems in the field. This paper introduces several data mining concepts and briefly discusses their application to environmental modelling, where data may be sparse, incomplete, or heterogeneous.

    Spatial-Temporal Data Mining for Ocean Science: Data, Methodologies, and Opportunities

    With the increasing amount of spatial-temporal (ST) ocean data, numerous spatial-temporal data mining (STDM) studies have been conducted to address various oceanic issues, e.g., climate forecasting and disaster warning. Compared with typical ST data (e.g., traffic data), ST ocean data is more complicated and has some unique characteristics, e.g., diverse regionality and high sparsity. These characteristics make it difficult to design and train STDM models. Unfortunately, an overview of these studies is still missing, which hinders computer scientists from identifying the research issues in ocean science and discourages researchers in ocean science from applying advanced STDM techniques. To remedy this situation, we provide a comprehensive survey of existing STDM studies in ocean science. Concretely, we first summarize the widely used ST ocean datasets and identify their unique characteristics. Then, typical ST ocean data quality enhancement techniques are discussed. Next, we classify existing STDM studies for ocean science into four types of tasks, i.e., prediction, event detection, pattern mining, and anomaly detection, and elaborate on the techniques for these tasks. Finally, promising research opportunities are highlighted. This survey will help scientists from both computer science and ocean science gain a better understanding of the fundamental concepts, key techniques, and open challenges of STDM in ocean science.

    Enhancing Exploratory Analysis across Multiple Levels of Detail of Spatiotemporal Events

    Crimes, forest fires, accidents, infectious diseases, and human interactions with mobile devices (e.g., tweets) are being logged as spatiotemporal events. For each event, its spatial location, time and related attributes are known with high levels of detail (LoDs). The LoD of analysis plays a crucial role in the user’s perception of phenomena. From one LoD to another, some patterns can be easily perceived or different patterns may be detected, so phenomena need to be modeled at different LoDs, as there is no single LoD at which to study them. Granular computing emerged as a paradigm of knowledge representation and processing, where granules are basic ingredients of information. These can be arranged in a hierarchy-like structure, allowing the same phenomenon to be perceived at different LoDs. This PhD Thesis introduces a formal Theory of Granularities (ToG) in order to have granules defined over any domain and to reason over them. This approach is more general than those in the related literature, which appear as particular cases of the proposed ToG. Based on this theory we propose a granular computing approach to model spatiotemporal phenomena at multiple LoDs, which we call a granularities-based model. This approach stands out from the related literature because it models a phenomenon through statements rather than just using granules to model abstract real-world entities. Furthermore, it formalizes the concept of LoD and follows an automated approach to generalize a phenomenon from one LoD to a coarser one. Present-day practices work on a single LoD chosen by the users, despite the fact that identifying suitable LoDs is a key issue for them. This PhD Thesis presents a framework for SUmmarizIng spatioTemporal Events (SUITE) across multiple LoDs. The SUITE framework makes no assumptions about the phenomenon or the analytical task. A Visual Analytics approach implementing the SUITE framework is presented, which allows users to inspect a phenomenon across multiple LoDs simultaneously, thus helping them understand at which LoDs the perception of the phenomenon differs or at which LoDs patterns emerge.

    Self-organizing map algorithm for assessing spatial and temporal patterns of pollutants in environmental compartments: A review

    The evaluation of the spatial and temporal distribution of pollutants is crucial for assessing the anthropogenic burden on the environment. Numerous chemometric approaches are available for data exploration, and they have been applied for environmental health assessment purposes. Among the unsupervised methods, the Self-Organizing Map (SOM) is an artificial neural network able to handle non-linear problems that can be used for exploratory data analysis, pattern recognition, and variable relationship assessment. Interpretability improves considerably when the SOM-based model is combined with clustering algorithms. This review comprises: (i) a description of the algorithm operation principle with a focus on the key parameters used for the SOM initialization; (ii) a description of the SOM output features and how they can be used for data mining; (iii) a list of available software tools for performing calculations; (iv) an overview of the SOM application for obtaining spatial and temporal pollution patterns in the environmental compartments, with a focus on model training and result visualization; (v) advice on reporting SOM model details in a paper.

    Integrating Data Science and Earth Science

    This open access book presents the results of three years of collaboration between earth scientists and data scientists in developing and applying data science methods for scientific discovery. The book will be highly beneficial for other researchers at senior and graduate level who are interested in applying visual data exploration, computational approaches and scientific workflows.

    Geospatial big data and cartography : research challenges and opportunities for making maps that matter

    Geospatial big data present a new set of challenges and opportunities for cartographic researchers in technical, methodological, and artistic realms. New computational and technical paradigms for cartography are accompanying the rise of geospatial big data. Additionally, the art and science of cartography needs to focus its contemporary efforts on work that connects to outside disciplines and is grounded in problems that are important to humankind and its sustainability. Following the development of position papers and a collaborative workshop to craft consensus around key topics, this article presents a new cartographic research agenda focused on making maps that matter using geospatial big data. This agenda provides both long-term challenges that require significant attention as well as short-term opportunities that we believe could be addressed in more concentrated studies.