37 research outputs found

    A HADOOP-BASED ALGORITHM OF GENERATING DEM GRID FROM POINT CLOUD DATA

    Blending big data analytics : review on challenges and a recent study

    With the collection of massive amounts of data every day, big data analytics has emerged as an important trend for many organizations. These collected data can contain important information that may be key to solving wide-ranging problems, such as cyber security, marketing, healthcare, and fraud. To analyze their large volumes of data for business analyses and decisions, large companies, such as Facebook and Google, adopt analytics. Such analyses and decisions impact existing and future technology. In this paper, we explore how big data analytics is utilized as a technique for solving problems of complex and unstructured data using such technologies as Hadoop, Spark, and MapReduce. We also discuss the data challenges introduced by big data according to the literature, including its six V's. Moreover, we investigate case studies of big data analytics on various techniques of such analytics, namely, text, voice, video, and network analytics. We conclude that big data analytics can bring positive changes in many fields, such as education, military, healthcare, politics, business, agriculture, banking, and marketing, in the future. © 2013 IEEE

    Geospatial Semantics

    Geospatial semantics is a broad field that involves a variety of research areas. The term semantics refers to the meaning of things, and is in contrast with the term syntactics. Accordingly, studies on geospatial semantics usually focus on understanding the meaning of geographic entities as well as their counterparts in the cognitive and digital world, such as cognitive geographic concepts and digital gazetteers. Geospatial semantics can also facilitate the design of geographic information systems (GIS) by enhancing the interoperability of distributed systems and developing more intelligent interfaces for user interactions. In recent years, a lot of research has been conducted, approaching geospatial semantics from different perspectives, using a variety of methods, and targeting different problems. Meanwhile, the arrival of big geo data, especially the large amount of unstructured text data on the Web, and the fast development of natural language processing methods enable new research directions in geospatial semantics. This chapter, therefore, provides a systematic review of the existing geospatial semantic research. Six major research areas are identified and discussed, including semantic interoperability, digital gazetteers, geographic information retrieval, geospatial Semantic Web, place semantics, and cognitive geographic concepts.
    Comment: Yingjie Hu (2017). Geospatial Semantics. In Bo Huang, Thomas J. Cova, and Ming-Hsiang Tsou et al. (Eds): Comprehensive Geographic Information Systems, Elsevier. Oxford, U

    Citizen science characterization of meanings of toponyms of Kenya: a shared heritage

    This paper examines the toponymic heritage used in Kenya’s Authoritative Geographic Information (AGI) toponyms database of 26,600 gazetteer records through documentation and characterization of the meanings of place names in topographic mapping. A comparison was carried out between AGI and GeoNames and between AGI and OpenStreetMap (OSM) volunteered records. A total of 15,000 toponymic matches were found. Out of these, 1567 toponyms were then extracted for further scrutiny using AGI data in the historical records and responses from respondents on the toponyms’ meanings. Experts in toponymy assisted in verifying these data. In the questionnaire responses, 235 names occurred in more than one place, while the AGI data had 284. The elements used to characterize the toponyms included historical perceptions of heritage evident in toponyms in their localities, and ethnographic, toponymic, and morphological studies of Kenya's dialects. There was no significant relationship established between the same place name usages among dialects, as indicated by a weak positive correlation, r(438) = 0.166, p < 0.001, based on the effect of using the related places and the distance between related places. The weak correlation implies that the one name one place principle does not apply, owing to diverse language boundaries, strong bonds associated with historical toponyms in the form of heritage, and significant variations in how names resist changes to preserve their heritage.
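The reported statistic has the form r(df) = value. As a minimal sketch, assuming the abstract refers to a Pearson correlation (the abstract does not name the test), the coefficient and its degrees of freedom (n - 2) can be computed as follows. The function name `pearson_r` and the sample values are hypothetical and are not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Return (r, degrees_of_freedom) for two equal-length numeric samples.

    Assumption: a plain Pearson product-moment correlation; the degrees of
    freedom reported alongside r are n - 2.
    """
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need two samples of equal length, at least 3 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy), n - 2


# Hypothetical illustration: distance between related places vs. a usage score.
distances_km = [1.0, 2.0, 3.0, 4.0]
usage_score = [2.0, 4.0, 6.0, 8.0]
r, df = pearson_r(distances_km, usage_score)
print(f"r({df}) = {r:.3f}")
```

Under this reading, r(438) corresponds to 440 paired observations.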

    Reflecting Human Knowledge of Place and Route-Choice Behavior Using Big Data

    Exploring human knowledge of geographical space and related behavior not only helps in understanding human-environment interactions and dynamic geographic processes, but also advances Geographic Information Systems (GIS) toward a human-centric paradigm to make daily life more efficient. Today’s relatively easy acquisition of various big data provides an unprecedented opportunity for geographers to answer research questions that previously could not be adequately addressed. However, new challenges also arise regarding data quality and bias as well as change in methodology for dealing with big data that are different from traditional data types. Representing people’s perception of place and studying driver’s route-choice behavior are two of the many applications of big data in answering research questions about human knowledge and behavior in the fields of GIS and transportation. Incorporating three papers, this dissertation focuses on these two different applications to achieve the following objectives: 1) examine the degree to which a geographic place’s spatial extent can be estimated from human-generated geotagged photos; 2) address the challenge of geotagged photos’ uneven spatial distribution in place estimation and explore an approach that can better derive a place’s spatial extent; 3) develop a method that can properly estimate the spatial extent of a place that has multiple disjoint regions while considering geotagged photos’ uneven distribution; 4) explore useful spatiotemporal patterns of taxi drivers’ route-choice behavior in a dynamic urban environment. 
    This dissertation makes three major contributions to big data applications’ systematic theory: 1) proposes an effective approach to handling the uneven spatial distribution problem of geotagged photos as a type of volunteered geographic data by modeling their representativeness; 2) develops methods that can properly derive the vague spatial extent of a place with or without disjoint regions; and 3) explores taxi drivers’ route-choice patterns in different situations that can inform future transportation decisions and policy-making processes.

    A COMPARATIVE ANALYSIS OF CONVENTIONAL HADOOP WITH PROPOSED CLOUD ENABLED HADOOP FRAMEWORK FOR SPATIAL BIG DATA PROCESSING

    The emergence of new tools and technologies for gathering information creates the problem of processing spatial big data. Solving this problem requires new research, techniques, innovation, and development. Spatial big data is characterized by the five V’s: volume, velocity, veracity, variety, and value. Hadoop is a widely used framework that addresses these problems, but it requires high-performance computing resources to store and process such large volumes of data. The emergence of cloud computing has given users on-demand, elastic, scalable, pay-per-use computing resources with which to build their own computing environments. The main objective of this paper is to develop a cloud-enabled Hadoop framework that combines cloud technology and high-performance computing resources with the conventional Hadoop framework to support spatial big data solutions. The paper also compares the conventional Hadoop framework with the proposed cloud-enabled Hadoop framework. It is observed that the proposed cloud-enabled Hadoop framework is much more efficient for spatial big data processing than currently available solutions.

    A Study of Colloquial Place Names through Geotagged Social Media Data

    Place is a rich but vague geographic concept. Much work has been done to explore the collective understanding and perceived location of place. The last few decades have seen rapid expansion in the use of online social media and data sharing services, which provide a large amount of valuable data for research of colloquial place names. This study explored how geotagged social media data can be used to understand geographic place names, and delimit the perceived geographic extent of a place. The author proposes a probabilistic method to map the perceived geographic extent of a place using Kernel Density Estimation (KDE) based on the geotagged data uploaded by users. The author also used spatio-temporal analysis methods in GIS to explore characteristics, hidden patterns, and trends of the places. Flickr, a popular online social networking service that features image hosting and sharing, was selected as the main data source for this project. The results show that outcomes of KDE with different functions and parameters differ from each other; therefore, it is crucial to select the proper KDE bandwidth in order to obtain appropriate geographic extents. Official boundaries and reference boundaries can be used to assess the geographic extents. Google Maps Street View is another useful source to examine the visual characteristics of places. Spatio-temporal analysis of the geographic extents over time reveals significant location changes of the places composed of man-made structures. Besides names and variations of place names, related colloquial terms, like Cades Cove of the Great Smoky Mountains National Park, are also useful sources when delimiting a place. Several examples are analyzed and discussed. Studies like this research can improve our understanding of geotagged Online Social Network (OSN) data in the study of colloquial place names as well as provide a temporal perspective to the analysis of their perceived geographic extents
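The dependence of the mapped extent on the KDE bandwidth can be illustrated with a minimal sketch. It assumes an isotropic Gaussian kernel, planar coordinates, and an extent defined as the grid cells whose density is at least a chosen fraction of the peak. None of these choices come from the study itself, and the function names and sample points are hypothetical.

```python
import math


def gaussian_kde_2d(points, x, y, bandwidth):
    """Density estimate at (x, y) using an isotropic 2D Gaussian kernel."""
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2 * len(points))
    total = 0.0
    for px, py in points:
        squared_distance = (x - px) ** 2 + (y - py) ** 2
        total += math.exp(-squared_distance / (2.0 * bandwidth ** 2))
    return norm * total


def perceived_extent(points, bandwidth, fraction, grid_min, grid_max, step):
    """Return grid cells whose density is at least `fraction` of the peak.

    Assumption: a square grid over [grid_min, grid_max] in both axes stands
    in for the study area; the relative threshold is an illustrative choice.
    """
    steps = int(round((grid_max - grid_min) / step))
    coords = [grid_min + i * step for i in range(steps + 1)]
    densities = {
        (x, y): gaussian_kde_2d(points, x, y, bandwidth)
        for x in coords
        for y in coords
    }
    peak = max(densities.values())
    return [cell for cell, density in densities.items() if density >= fraction * peak]


# Hypothetical geotagged photo locations in projected coordinates (km).
photos = [(0.0, 0.0), (0.2, 0.1), (-0.3, 0.2), (0.1, -0.4)]
for bandwidth in (0.5, 1.0):
    cells = perceived_extent(photos, bandwidth, 0.5, -3.0, 3.0, 0.5)
    print(f"bandwidth={bandwidth}: {len(cells)} grid cells in the extent")
```

A larger bandwidth smooths the surface and spreads the extent over more cells, which is why the bandwidth has to be chosen with care before comparing the result against official or reference boundaries.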

    Performance-Aware High-Performance Computing for Remote Sensing Big Data Analytics

    The incredible increase in the volume of data emerging along with recent technological developments has made analysis with traditional approaches more difficult for many organizations. In particular, applications that require timely processing of big data, such as satellite imagery, sensor data, bank operations, web servers, and social networks, need efficient mechanisms for collecting, storing, processing, and analyzing these data. At this point, big data analytics, which encompasses data mining, machine learning, statistics, and similar techniques, helps organizations manage the data end to end. In this chapter, we introduce a novel high-performance computing system on a geo-distributed private cloud for remote sensing applications, which takes advantage of network topology, exploits the utilization and workloads of CPU, storage, and memory resources in a distributed fashion, and optimizes resource allocation to realize big data analytics efficiently.

    Automatic Scaling Hadoop in the Cloud for Efficient Process of Big Geospatial Data

    Efficient processing of big geospatial data is crucial for tackling global and regional challenges such as climate change and natural disasters, but it is challenging not only due to the massive data volume but also due to the intrinsic complexity and high dimensions of the geospatial datasets. While traditional computing infrastructure does not scale well with the rapidly increasing data volume, Hadoop has attracted increasing attention in geoscience communities for handling big geospatial data. Recently, many studies were carried out to investigate adopting Hadoop for processing big geospatial data, but how to adjust the computing resources to efficiently handle the dynamic geoprocessing workload was barely explored. To bridge this gap, we propose a novel framework to automatically scale the Hadoop cluster in the cloud environment to allocate the right amount of computing resources based on the dynamic geoprocessing workload. The framework and auto-scaling algorithms are introduced, and a prototype system was developed to demonstrate the feasibility and efficiency of the proposed scaling mechanism using Digital Elevation Model (DEM) interpolation as an example. Experimental results show that this auto-scaling framework could (1) significantly reduce the computing resource utilization (by 80% in our example) while delivering similar performance as a full-powered cluster; and (2) effectively handle the spike processing workload by automatically increasing the computing resources to ensure the processing is finished within an acceptable time. Such an auto-scaling approach provides a valuable reference to optimize the performance of geospatial applications to address data- and computational-intensity challenges in GIScience in a more cost-efficient manner
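The abstract does not describe the auto-scaling algorithms themselves. As a rough sketch of the general idea only, a workload-driven rule might size the cluster from the number of pending tasks and clamp the result between a minimum and a maximum. The function name, parameters, and numbers below are hypothetical and do not represent the authors' method or any Hadoop API.

```python
import math


def target_cluster_size(pending_tasks, tasks_per_node, min_nodes, max_nodes):
    """Return the node count a simple workload-driven rule would request.

    Assumption: each node handles `tasks_per_node` tasks concurrently, and
    the cluster is kept between `min_nodes` and `max_nodes`.
    """
    if tasks_per_node <= 0:
        raise ValueError("tasks_per_node must be positive")
    if min_nodes > max_nodes:
        raise ValueError("min_nodes must not exceed max_nodes")
    needed = math.ceil(pending_tasks / tasks_per_node)
    # Clamp so the cluster never drops below the baseline or exceeds the cap.
    return max(min_nodes, min(max_nodes, needed))


# Hypothetical workload trace: an idle period, a moderate load, then a spike.
for pending in (0, 50, 1000):
    nodes = target_cluster_size(pending, tasks_per_node=8, min_nodes=2, max_nodes=20)
    print(f"{pending} pending tasks -> {nodes} nodes")
```

A rule of this shape keeps a small baseline cluster during quiet periods and grows it during a spike, which is the trade-off between resource use and completion time that the abstract reports.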