472 research outputs found
Multi-source imagery fusion using deep learning in a cloud computing platform
Given the high availability of data collected by different remote sensing
instruments, the data fusion of multi-spectral and hyperspectral images (HSI)
is an important topic in remote sensing. In particular, super-resolution as a
data fusion application using spatial and spectral domains is highly
investigated because its fused images is used to improve the classification and
tracking objects accuracy. On the other hand, the huge amount of data obtained
by remote sensing instruments represent a key concern in terms of data storage,
management and pre-processing. This paper proposes a Big Data Cloud platform
using Hadoop and Spark to store, manages, and process remote sensing data.
Also, a study over the parameter \textit{chunk size} is presented to suggest
the appropriate value for this parameter to download imagery data from Hadoop
into a Spark application, based on the format of our data. We also developed an
alternative approach based on Long Short Term Memory trained with different
patch sizes for super-resolution image. This approach fuse hyperspectral and
multispectral images. As a result, we obtain images with high-spatial and
high-spectral resolution. The experimental results show that for a chunk size
of 64k, an average of 3.5s was required to download data from Hadoop into a
Spark application. The proposed model for super-resolution provides a
structural similarity index of 0.98 and 0.907 for the used dataset
Distributed processing of large remote sensing images using MapReduce - A case of Edge Detection
Dissertation submitted in partial fulfillment of the requirements for the Degree of Master of Science in Geospatial Technologies.Advances in sensor technology and their ever increasing repositories of the collected data are revolutionizing the mechanisms remotely sensed data are collected, stored and processed. This exponential growth of data archives and the increasing user’s demand for real-and near-real time remote sensing data products has pressurized remote sensing service providers to deliver the required services. The remote sensing community has recognized the challenge in processing large and complex satellite datasets to derive customized products. To address this high demand in computational resources, several efforts have been made in the past few years towards incorporation of high-performance computing models in remote sensing data collection, management and analysis. This study adds an impetus to these efforts by introducing the recent advancements in distributed computing technologies, MapReduce programming paradigm, to the area of remote sensing. The MapReduce model which is developed by Google Inc. encapsulates the efforts of distributed computing in a highly simplified single library. This simple but powerful programming model can provide us distributed environment without having deep knowledge of parallel programming. This thesis presents a MapReduce based processing of large satellite images a use case scenario of edge detection methods. Deriving from the conceptual massive remote sensing image processing applications, a prototype of edge detection methods was implemented on MapReduce framework using its open-source implementation, the Apache Hadoop environment. The experiences of the implementation of the MapReduce model of Sobel, Laplacian, and Canny edge detection methods are presented. This thesis also presents the results of the evaluation the effect of parallelization using MapReduce on the quality of the output and the execution time performance tests conducted based on various performance metrics. The MapReduce algorithms were executed on a test environment on heterogeneous cluster that supports the Apache Hadoop open-source software. The successful implementation of the MapReduce algorithms on a distributed environment demonstrates that MapReduce has a great potential for scaling large-scale remotely sensed images processing and perform more complex geospatial problems
High-throughput visual knowledge analysis and retrieval in big data ecosystems
Visual knowledge plays an important role in many highly skilled applications, such as medical diagnosis, geospatial image analysis and pathology diagnosis. Medical practitioners are able to interpret and reason about diagnostic images based on not only primitive-level image features such as color, texture, and spatial distribution but also their experience and tacit knowledge which are seldom articulated explicitly. This reasoning process is dynamic and closely related to real-time human cognition. Due to a lack of visual knowledge management and sharing tools, it is difficult to capture and transfer such tacit and hard-won expertise to novices. Moreover, many mission-critical applications require the ability to process such tacit visual knowledge in real time. Precisely how to index this visual knowledge computationally and systematically still poses a challenge to the computing community. My dissertation research results in novel computational approaches for high-throughput visual knowledge analysis and retrieval from large-scale databases using latest technologies in big data ecosystems. To provide a better understanding of visual reasoning, human gaze patterns are qualitatively measured spatially and temporally to model observers' cognitive process. These gaze patterns are then indexed in a NoSQL distributed database as a visual knowledge repository, which is accessed using various unique retrieval methods developed through this dissertation work. To provide meaningful retrievals in real time, deep-learning methods for automatic annotation of visual activities and streaming similarity comparisons are developed under a gaze-streaming framework using Apache Spark. This research has several potential applications that offer a broader impact among the scientific community and in the practical world. First, the proposed framework can be adapted for different domains, such as fine arts, life sciences, etc. with minimal effort to capture human reasoning processes. Second, with its real-time visual knowledge search function, this framework can be used for training novices in the interpretation of domain images, by helping them learn experts' reasoning processes. Third, by helping researchers to understand human visual reasoning, it may shed light on human semantics modeling. Finally, integrating reasoning process with multimedia data, future retrieval of media could embed human perceptual reasoning for database search beyond traditional content-based media retrievals
Marble Slabs Classification System Based on Image Processing (Ark Marble Mine in Birjand)
Marble is one of the semi-precious stones that has been used in decorating building façade and making decorative things. This stone is present in the nature in the form of rock or layered stone. Examining the kind of stone, extent of impurity and different streaks in white marble is a widely confronted subject by those who are involved in this industry. Obtaining the extent of impurity of white marble using methods of detecting and analyzing material is expensive and time-consuming. In this research carried out on while marbles of Arc Mine in Birjand, it has been attempted to present very fast method using Image Processing Techniques so that while preserving identity and appearance of stone and without any damage to it, we compute the impurity level and different streaks on white marble surface. The proposed method includes two stages; in the first stage applying image processing functions, it is attempted to segment the present impurities and streaks on marble surface from the stone background and in the second stage, the area of these impurities and streaks is computed. Results obtained in this paper (97.8%) in comparison with other researches and experimental methods indicate acceptability of this algorithm
Geospatial Data Management Research: Progress and Future Directions
Without geospatial data management, today´s challenges in big data applications such as earth observation, geographic information system/building information modeling (GIS/BIM) integration, and 3D/4D city planning cannot be solved. Furthermore, geospatial data management plays a connecting role between data acquisition, data modelling, data visualization, and data analysis. It enables the continuous availability of geospatial data and the replicability of geospatial data analysis. In the first part of this article, five milestones of geospatial data management research are presented that were achieved during the last decade. The first one reflects advancements in BIM/GIS integration at data, process, and application levels. The second milestone presents theoretical progress by introducing topology as a key concept of geospatial data management. In the third milestone, 3D/4D geospatial data management is described as a key concept for city modelling, including subsurface models. Progress in modelling and visualization of massive geospatial features on web platforms is the fourth milestone which includes discrete global grid systems as an alternative geospatial reference framework. The intensive use of geosensor data sources is the fifth milestone which opens the way to parallel data storage platforms supporting data analysis on geosensors. In the second part of this article, five future directions of geospatial data management research are presented that have the potential to become key research fields of geospatial data management in the next decade. Geo-data science will have the task to extract knowledge from unstructured and structured geospatial data and to bridge the gap between modern information technology concepts and the geo-related sciences. Topology is presented as a powerful and general concept to analyze GIS and BIM data structures and spatial relations that will be of great importance in emerging applications such as smart cities and digital twins. Data-streaming libraries and “in-situ” geo-computing on objects executed directly on the sensors will revolutionize geo-information science and bridge geo-computing with geospatial data management. Advanced geospatial data visualization on web platforms will enable the representation of dynamically changing geospatial features or moving objects’ trajectories. Finally, geospatial data management will support big geospatial data analysis, and graph databases are expected to experience a revival on top of parallel and distributed data stores supporting big geospatial data analysis
Recommended from our members
High performance latent dirichlet allocation for text mining
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.Latent Dirichlet Allocation (LDA), a total probability generative model, is a three-tier Bayesian model. LDA computes the latent topic structure of the data and obtains the significant information of documents. However, traditional LDA has several limitations in practical applications. LDA cannot be directly used in classification because it is a non-supervised learning model. It needs to be embedded into appropriate classification algorithms. LDA is a generative model as it normally generates the latent topics in the categories where the target documents do not belong to, producing the deviation in computation and reducing the classification accuracy. The number of topics in LDA influences the learning process of model parameters greatly. Noise samples in the training data also affect the final text classification result. And, the quality of LDA based classifiers depends on the quality of the training samples to a great extent. Although parallel LDA algorithms are proposed to deal with huge amounts of data, balancing computing loads in a computer cluster poses another challenge. This thesis presents a text classification method which combines the LDA model and Support Vector Machine (SVM) classification algorithm for an improved accuracy in classification when reducing the dimension of datasets. Based on Density-Based Spatial Clustering of Applications with Noise (DBSCAN), the algorithm automatically optimizes the number of topics to be selected which reduces the number of iterations in computation. Furthermore, this thesis presents a noise data reduction scheme to process noise data. When the noise ratio is large in the training data set, the noise reduction scheme can always produce a high level of accuracy in classification. Finally, the thesis parallelizes LDA using the MapReduce model which is the de facto computing standard in supporting data intensive applications. A genetic algorithm based load balancing algorithm is designed to balance the workloads among computers in a heterogeneous MapReduce cluster where the computers have a variety of computing resources in terms of CPU speed, memory space and hard disk space
New Methods to Improve Large-Scale Microscopy Image Analysis with Prior Knowledge and Uncertainty
Multidimensional imaging techniques provide powerful ways to examine various kinds of scientific questions. The routinely produced data sets in the terabyte-range, however, can hardly be analyzed manually and require an extensive use of automated image analysis. The present work introduces a new concept for the estimation and propagation of uncertainty involved in image analysis operators and new segmentation algorithms that are suitable for terabyte-scale analyses of 3D+t microscopy images
Geospatial Information Research: State of the Art, Case Studies and Future Perspectives
Geospatial information science (GI science) is concerned with the development and application of geodetic and information science methods for modeling, acquiring, sharing, managing, exploring, analyzing, synthesizing, visualizing, and evaluating data on spatio-temporal phenomena related to the Earth. As an interdisciplinary scientific discipline, it focuses on developing and adapting information technologies to understand processes on the Earth and human-place interactions, to detect and predict trends and patterns in the observed data, and to support decision making. The authors – members of DGK, the Geoinformatics division, as part of the Committee on Geodesy of the Bavarian Academy of Sciences and Humanities, representing geodetic research and university teaching in Germany – have prepared this paper as a means to point out future research questions and directions in geospatial information science. For the different facets of geospatial information science, the state of art is presented and underlined with mostly own case studies. The paper thus illustrates which contributions the German GI community makes and which research perspectives arise in geospatial information science. The paper further demonstrates that GI science, with its expertise in data acquisition and interpretation, information modeling and management, integration, decision support, visualization, and dissemination, can help solve many of the grand challenges facing society today and in the future
- …