
740 research outputs found

Recommended from our members

Towards Multimodal Spatiotemporal Data Analysis: Heterogeneity and Fusion

Author: Zhang Yawen
Publication venue: University of Colorado Boulder
Publication date: 09/06/2022
Field of study

Spatiotemporal data are data with space and/or time dimensions. With the widespread use of Internet-of-Things (IoT) technology and the deployment of different types of sensors, either mobile or static, large volumes of spatiotemporal data are continuously collected and processed, and have become ubiquitous in the real world. Various services and insights can be delivered via spatiotemporal data, ranging from earthquake early warning and air pollution monitoring and forecasting to automatic identification of fish schools. A key challenge in deriving information and knowledge from spatiotemporal data lies in handling data heterogeneity. There are various types of heterogeneity associated with spatiotemporal data. Data streams of the same type may differ in quality, and multi-source spatiotemporal data often differ in structure, dimensionality, and spatial and temporal resolution. For spatiotemporal applications, understanding and handling data heterogeneity is of significant importance. When leveraging multi-source or multimodal data for a specific application, a key challenge lies in designing appropriate data fusion techniques to effectively integrate information from spatiotemporal data. To address these challenges, this dissertation investigates the heterogeneity problem in spatiotemporal data and proposes fusion methods to integrate information from multimodal spatiotemporal data. We conduct experiments in three applications: (1) analyzing sensing heterogeneity in a global smartphone-based seismic network, (2) fusing multi-source heterogeneous data for air pollution hotspot identification and pollution level prediction, and (3) integrating spatial and temporal information for fish school identification.
In application (1), we systematically analyze accelerometer-based sensing quality using millions of acceleration records from MyShake devices, and investigate various factors that may impact waveform data quality. In application (2), we propose a two-step approach to detect hotspots from mobile sensing data, and leverage cross-domain urban data for hotspot inference. We also propose multi-group Encoder-Decoder networks (MGED-Net) to effectively fuse multi-source data for next-day air quality prediction, which outperforms multiple baseline models. In application (3), we leverage a multi-view learning technique called co-training to integrate contextual information into the classification model, which effectively improves the accuracy of Atlantic herring school identification. We also propose a superpixel-based spatio-temporal contrastive learning method (SSTC) to generate effective representations for fish schools in an unsupervised manner, which outperforms multiple baseline models in two different classification tasks.

CU Scholar Institutional Repository

Macau através dos guias turísticos

Author: Yawen Zhang
Publication venue: Universidade de Aveiro
Publication date: 01/01/2017
Field of study

Master's in Languages, Literatures and Cultures. This thesis is based on a textual analysis of Macau's tourist guidebooks, identifying the focus of each period and reconstructing the representations of the city through their discourse. As documentary sources, the guidebooks reflect the cultural, social and urban changes of the 20th and 21st centuries. The study applies the quantitative analysis methodology of Eduardo Brito Henriques (1996) to identify the focal points of the guides. In addition, the evolution of the representations of the city is reconstructed from the opinions conveyed in the guidebooks.

Repositório Institucional da Universidade de Aveiro

Molecular dynamics simulation study of lipid membranes using coarse-grained models

Author: Zhang Yawen
Publication venue: Chemistry, Imperial College London
Publication date: 01/02/2015
Field of study

In this work we use coarse-grained molecular dynamics simulations to investigate how lipid composition affects the phase transition of phospholipid bilayers. We consider a fully hydrated membrane consisting of saturated 1,2-dipalmitoyl-sn-glycero-3-phosphocholine (DPPC) and cholesterol or unsaturated 1,2-dioleoyl-sn-glycero-3-phosphocholine (DOPC). We report the structural and dynamic changes occurring in the model bilayer mixtures with varying temperature and composition. First, we study the effect of cholesterol on the properties of a DPPC bilayer. We combine computations of area per lipid, radial distribution function, chain order parameter and Voronoi construction to quantify the phase transitions, and the coarse-grained (CG) model is found to quantitatively reproduce most of the experimental observations. Based on the changes in the structural and dynamic properties, a temperature-composition phase diagram of DPPC/cholesterol is proposed and compared with experiments. 'Thread-like' cholesterol clusters are observed in the bilayer at high cholesterol concentrations, and the origin of this specific lateral organisation is discussed. To explore the role of the CG bead size, a series of simulations varying the cholesterol cross-sectional area was performed. Parameters obtained from simulations of the different cholesterol isomorphs provide important insight into the microscopic degrees of freedom determining the cholesterol arrangement in the bilayer. The results for the modified cholesterols are further discussed in relation to naturally occurring sterols. Finally, the effect of a mono-unsaturated phospholipid (DOPC) on the main melting phase transition is investigated. This analysis is performed by simulating bilayer systems constructed by combining a gel-phase DPPC bilayer and a fluid-phase DOPC bilayer. Visual observation of the bilayers shows that the gel and fluid phases coexist within a wide range of temperature and composition.
A temperature-composition phase diagram with phase coexistences is proposed using the information extracted from structural and local composition analysis. Open Access

Spiral - Imperial College Digital Repository

WRHT: Efficient All-reduce for Distributed DNN Training in Optical Interconnect System

Authors: Chen Yawen, Dai Fei, Huang Zhiyi, Zhang Fangfang, Zhang Haibo
Publication venue
Publication date: 22/07/2022
Field of study

Communication efficiency plays an important role in accelerating the distributed training of Deep Neural Networks (DNNs). All-reduce is the key communication primitive for reducing model parameters in distributed DNN training. Most existing all-reduce algorithms are designed for traditional electrical interconnect systems, which cannot meet the communication requirements of distributed training of large DNNs. One promising alternative to electrical interconnect is optical interconnect, which can provide high bandwidth, low transmission delay, and low power cost. We propose an efficient scheme called WRHT (Wavelength Reused Hierarchical Tree) for implementing the all-reduce operation in optical interconnect systems, which takes advantage of WDM (Wavelength Division Multiplexing) to reduce the communication time of distributed data-parallel DNN training. We further derive the minimum number of communication steps and the communication time needed to realize all-reduce using WRHT. Simulation results show that WRHT reduces communication time by 75.59%, 49.25%, and 70.1%, respectively, compared with three traditional all-reduce algorithms simulated in an optical interconnect system. Simulation results also show that WRHT reduces the communication time of the all-reduce operation by 86.69% and 84.71% compared with two existing all-reduce algorithms in an electrical interconnect system. Comment: This paper is under the submission of GLOBECOM 202
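For readers unfamiliar with the all-reduce primitive named in this abstract, the following is a minimal sketch of the classic ring all-reduce (one of the traditional baselines such work is typically compared against), simulated in plain Python. It is not WRHT and does not model optical links or WDM; the function name `ring_all_reduce` and the list-of-lists representation of workers are assumptions made purely for illustration.

```python
def ring_all_reduce(vectors):
    """Simulate a sum all-reduce over a ring of workers.

    `vectors` holds one equal-length list of numbers per worker.
    Returns (per-worker results, number of communication steps).
    """
    n = len(vectors)
    length = len(vectors[0])
    # Chunk c covers indices [bounds[c], bounds[c + 1]).
    bounds = [(c * length) // n for c in range(n + 1)]
    data = [list(v) for v in vectors]
    steps = 0

    # Phase 1, reduce-scatter: at step s, worker i sends chunk (i - s) mod n
    # to its right neighbour, which adds it to its own copy.
    for s in range(n - 1):
        sends = []
        for i in range(n):
            c = (i - s) % n
            lo, hi = bounds[c], bounds[c + 1]
            sends.append(((i + 1) % n, lo, data[i][lo:hi]))
        for dst, lo, payload in sends:
            for k, x in enumerate(payload):
                data[dst][lo + k] += x
        steps += 1

    # Phase 2, all-gather: worker i now owns the fully reduced chunk
    # (i + 1) mod n and forwards completed chunks around the ring.
    for s in range(n - 1):
        sends = []
        for i in range(n):
            c = (i + 1 - s) % n
            lo, hi = bounds[c], bounds[c + 1]
            sends.append(((i + 1) % n, lo, data[i][lo:hi]))
        for dst, lo, payload in sends:
            data[dst][lo:lo + len(payload)] = payload
        steps += 1

    return data, steps


if __name__ == "__main__":
    grads = [[1, 2, 3, 4], [10, 20, 30, 40], [100, 200, 300, 400]]
    reduced, steps = ring_all_reduce(grads)
    print(reduced[0], steps)
```

The ring variant needs 2(n - 1) communication steps for n workers, which is the kind of step count that tree-based schemes aim to lower.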

arXiv.org e-Print Archive

OpTree: An Efficient Algorithm for All-gather Operation in Optical Interconnect Systems

Authors: Chen Yawen, Dai Fei, Huang Zhiyi, Zhang Haibo
Publication venue
Publication date: 28/11/2022
Field of study

All-gather collective communication is one of the most important communication primitives in parallel and distributed computation, and plays an essential role in many HPC applications such as distributed Deep Learning (DL) with model and hybrid parallelism. To relieve the communication bottleneck of All-gather, optical interconnection networks can provide unprecedentedly high bandwidth and reliability for data transfer among distributed nodes. However, most traditional All-gather algorithms are designed for electrical interconnection and do not fit optical interconnect systems well, resulting in poor performance. This paper proposes an efficient scheme, called OpTree, for the All-gather operation on optical interconnect systems. OpTree derives an optimal m-ary tree corresponding to the optimal number of communication stages, achieving minimum communication time. We further analyze and compare the communication steps of OpTree with those of existing All-gather algorithms. Theoretical results show that OpTree requires far fewer communication steps than existing All-gather algorithms on optical interconnect systems. Simulation results show that OpTree can reduce communication time by 72.21%, 94.30%, and 88.58%, respectively, compared with three existing All-gather schemes: WRHT, Ring, and NE. Comment: This paper is under review at a conference
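To make the relationship between the branching factor m and the number of stages concrete, here is a small sketch that computes the depth of a complete m-ary tree with n leaves, that is, the smallest d with m^d >= n. This is only a generic property of m-ary trees used as a rough proxy for stage count; it is not the OpTree derivation of the optimal m, and the function name `tree_depth` is hypothetical.

```python
def tree_depth(n, m):
    """Return the smallest d such that m**d >= n (depth of a complete m-ary tree).

    Uses integer arithmetic to avoid floating-point errors in log().
    """
    if n < 1 or m < 2:
        raise ValueError("need n >= 1 and m >= 2")
    depth, capacity = 0, 1
    while capacity < n:
        capacity *= m
        depth += 1
    return depth


if __name__ == "__main__":
    for m in (2, 4, 8, 32):
        print(m, tree_depth(1024, m))
```

A larger m lowers the depth, but each stage then involves more simultaneous transfers per node, which is the trade-off an optimal choice of m has to balance.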

arXiv.org e-Print Archive

Accelerating Fully Connected Neural Network on Optical Network-on-Chip (ONoC)

Authors: Chen Yawen, Dai Fei, Huang Zhiyi, Zhang Haibo
Publication venue
Publication date: 30/09/2021
Field of study

The Fully Connected Neural Network (FCNN) is a class of Artificial Neural Networks widely used in computer science and engineering, but its training can take a long time with large datasets on existing many-core systems. Optical Network-on-Chip (ONoC), an emerging chip-scale optical interconnection technology, has great potential to accelerate FCNN training with low transmission delay, low power consumption, and high throughput. However, existing methods based on Electrical Network-on-Chip (ENoC) do not carry over to ONoC because of its unique properties. In this paper, we propose a fine-grained parallel computing model for accelerating FCNN training on ONoC and derive the optimal number of cores for each execution stage with the objective of minimizing the total time to complete one epoch of FCNN training. To allocate the optimal number of cores for each execution stage, we present three mapping strategies and compare their advantages and disadvantages in terms of hotspot level, memory requirement, and state transitions. Simulation results show that the average prediction error for the optimal number of cores in NN benchmarks is within 2.3%. We further carry out extensive simulations which demonstrate that FCNN training time can be reduced by 22.28% and 4.91% on average using our proposed scheme, compared with traditional parallel computing methods that either allocate a fixed number of cores or allocate as many cores as possible, respectively. Compared with ENoC, simulation results show that under batch sizes of 64 and 128, ONoC reduces training time by 21.02% and 12.95% on average and saves 47.85% and 39.27% of energy, respectively. Comment: 14 pages, 10 figures. This paper is under the second review of IEEE Transactions of Computer

arXiv.org e-Print Archive