Search CORE

171 research outputs found

A Probabilistic Embedding Clustering Method for Urban Structure Detection

Author: H. Li
L. Gao
L. Zhao
M. Deng
X. Lin
Y. Zhang
Publication date: 12/07/2017

Urban structure detection is a basic task in urban geography. Clustering is a core technique for detecting patterns of urban spatial structure, urban functional regions, and so on. In the big data era, diverse urban sensing datasets that record information such as human behaviour and human social activity suffer from the complexity of high dimensionality and high noise. Unfortunately, state-of-the-art clustering methods do not handle high dimensionality and high noise concurrently. In this paper, a probabilistic embedding clustering method is proposed. First, we propose a Probabilistic Embedding Model (PEM) that finds latent features in high-dimensional urban sensing data by learning a probabilistic model. The latent features capture the essential patterns hidden in high-dimensional data, and the probabilistic model reduces the uncertainty caused by high noise. Second, by tuning the parameters, our model can discover two kinds of urban structure, homophily and structural equivalence, that is, communities with intensive interaction and communities that play the same role in the urban structure. We evaluated the model in experiments on real-world data from Shanghai (China), which showed that our method can discover both kinds of urban structure. Comment: 6 pages, 7 figures, ICSDM201

arXiv.org e-Print Archive

Directory of Open Access Journals

Revealing intra-urban spatial structure through an exploratory analysis by combining road network abstraction model and taxi trajectory data

Author: Gao Song
Hu Sheng
Li Tianqi
Luo Wei
Wu Liang
Xu Yongyang
Zhang Ziwei
Publication date: 21/11/2022

The unprecedented urbanization in China has dramatically changed the urban spatial structure of cities. With the proliferation of individual-level geospatial big data, previous studies have widely used the network abstraction model to reveal the underlying urban spatial structure. However, the construction of network abstraction models primarily focuses on the topology of the road network without considering individual travel flows along the road network. Individual travel flows reflect urban dynamics, which can further help in understanding the underlying spatial structure. This study therefore aims to reveal the intra-urban spatial structure by integrating the road network abstraction model and individual travel flows. To achieve this goal, we 1) quantify the spatial interaction relatedness of road segments based on the Word2Vec model using large volumes of taxi trip data, then 2) characterize the road abstraction network model according to the identified spatial interaction relatedness, and 3) implement a community detection algorithm to reveal sub-regions of a city. Our results reveal three levels of hierarchical spatial structure in the Wuhan metropolitan area. This study provides a data-driven approach to investigating urban spatial structure by identifying traffic interaction patterns on the road network, offering insights for urban planning practice and transportation management.

arXiv.org e-Print Archive

Conflating point of interest (POI) data: A systematic review of matching methods

Author: Hu Yingjie
Ma Yue
Sun Kai
Zhou Ryan Zhenqi
Zhu Yunqiang
Publication date: 23/10/2023

Point of interest (POI) data provide digital representations of places in the real world, and have been increasingly used to understand human-place interactions, support urban management, and build smart cities. Many POI datasets have been developed, which often differ in geographic coverage, attribute focus, and data quality. From time to time, researchers may need to conflate two or more POI datasets in order to build a better representation of the places in their study areas. While various POI conflation methods have been developed, a systematic review is lacking, and consequently it is difficult for researchers new to POI conflation to quickly grasp and use these existing methods. This paper fills that gap. Following the protocol of Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA), we conduct a systematic review by searching three bibliographic databases using reproducible syntax to identify related studies. We then focus on a main step of POI conflation, i.e., POI matching, and systematically summarize and categorize the identified methods. Current limitations and future opportunities are discussed afterwards. We hope that this review can provide some guidance for researchers interested in conflating POI datasets for their research.

arXiv.org e-Print Archive


Crowdsourced Data Mining for Urban Activity: A Review of Data Sources, Applications and Methods

Author: Niu Haifeng
Silva Elisabete
Publication venue: Journal of Urban Planning and Development
Publication date: 01/01/2020

The penetration of devices integrated with location-based services and internet services has generated massive data about the everyday life of citizens and tracked their activities in cities. Crowdsourced data generated by the crowd, such as social media data, POI data and collaborative websites, has become fine-grained proxy data for urban activity and is widely used in urban studies research. However, because crowdsourced data types are heterogeneous and previous studies have mainly focused on a specific application, a systematic review of crowdsourced data mining for urban activity is still lacking. To fill this gap, this paper conducts a literature search in the Web of Science database, selecting 226 highly related papers published between 2013 and 2019. Based on those papers, the review first conducts a bibliometric analysis identifying the underpinning domains, pivotal scholars and papers around this topic. The review also synthesises previous research into three parts: main applications of different data sources and data fusion; application of spatial analysis in mobility patterns, functional areas and event detection; and application of socio-demographic and perception analysis in city attractiveness, demographic characteristics and sentiment analysis. The challenges of this type of data are discussed at the end. This study provides a systematic and current review for both researchers and practitioners interested in the applications of crowdsourced data mining for urban activity. This research is funded by a scholarship from the China Scholarship Counci

Apollo (Cambridge)

SENSING URBAN LAND-USE PATTERNS BY INTEGRATING GOOGLE TENSORFLOW AND SCENE-CLASSIFICATION MODELS

Publication venue: Copernicus GmbH

Crossref

A Data-driven, High-performance and Intelligent CyberInfrastructure to Advance Spatial Sciences

Publication date: 01/01/2018

In the field of Geographic Information Science (GIScience), we have witnessed an unprecedented data deluge brought about by the rapid advancement of high-resolution data observing technologies. For example, with the advancement of Earth Observation (EO) technologies, a massive amount of EO data, including remote sensing data and other sensor observation data about earthquakes, climate, oceans, hydrology, volcanoes, glaciers, etc., is being collected on a daily basis by a wide range of organizations. In addition to the observation data, human-generated data including microblogs, photos, consumption records, evaluations, unstructured webpages and other Volunteered Geographical Information (VGI) are incessantly generated and shared on the Internet. Meanwhile, the emerging cyberinfrastructure rapidly increases our capacity for handling such massive data with regard to data collection and management, data integration and interoperability, data transmission and visualization, high-performance computing, etc. Cyberinfrastructure (CI) consists of computing systems, data storage systems, advanced instruments and data repositories, visualization environments, and people, all linked together by software and high-performance networks to improve research productivity and enable breakthroughs that are not otherwise possible. The Geospatial CI (GCI, or CyberGIS), as the synthesis of CI and GIScience, has inherent advantages in enabling computationally intensive spatial analysis and modeling (SAM) and collaborative geospatial problem solving and decision making. This dissertation is dedicated to addressing several critical issues and improving the performance of existing methodologies and systems in the field of CyberGIS.
My dissertation will include three parts. The first part focuses on developing methodologies to help public researchers efficiently and effectively find appropriate open geospatial datasets among millions of records provided by thousands of organizations scattered around the world. Machine learning and semantic search methods will be utilized in this research. The second part develops an interoperable and replicable geoprocessing service by synthesizing the high-performance computing (HPC) environment, the core spatial statistics and analysis algorithms from the widely adopted open-source Python package Python Spatial Analysis Library (PySAL), and the rich datasets acquired in the first part. The third part is dedicated to studying optimization strategies for feature data transmission and visualization. It is intended to solve the performance issue in transmitting large feature data over the Internet and visualizing it on the client (browser) side. Taken together, the three parts constitute an endeavor towards the methodological improvement and implementation practice of a data-driven, high-performance and intelligent CI to advance spatial sciences. Dissertation/Thesis: Doctoral Dissertation, Geography 201

ASU Digital Repository

Content-location relationships: a framework to explore correlations between space-based and place-based user-generated content

Author: Painho Marco
Tang Vicente
Publication date: 03/08/2023

Tang, V., & Painho, M. (2023). Content-location relationships: a framework to explore correlations between space-based and place-based user-generated content. International Journal of Geographical Information Science, 37(8), 1840–1871. https://doi.org/10.1080/13658816.2023.2213869

The authors acknowledge the funding from the Portuguese national funding agency for science, research and technology (Fundação para a Ciência e a Tecnologia, FCT) through the CityMe project (EXPL/GES-URB/1429/2021; https://cityme.novaims.unl.pt/) and the project UIDB/04152/2020 - Centro de Investigação em Gestão de Informação (MagIC)/NOVA IMS.

The use of social media and location-based networks through GPS-enabled devices provides geospatial data for a plethora of applications in urban studies. However, the extent to which information found in geo-tagged social media activity corresponds to the spatial context is still a topic of debate. In this article, we developed a framework aimed at retrieving the thematic and spatial relationships between content originating from space-based (Twitter) and place-based (Google Places and OSM) sources of geographic user-generated content, based on topics identified by the embedding-based BERTopic model. The contribution of the framework lies in the combination of methods that were selected to improve on previous works focused on content-location relationships. Using the city of Lisbon (Portugal) to test our methodology, we first applied the embedding-based topic model to aggregated textual data coming from each source. Results of the analysis showed the complexity of content-location relationships, which are mostly based on thematic profiles. Nonetheless, the framework can be employed in other cities and extended with other metrics to enrich research aimed at exploring the correlation between online discourse and geography.

Repositório da Universidade Nova de Lisboa