173 research outputs found

Multidimensional Cultural Perception and Spatial Differentiation in the Yellow River Basin of China

Author: Han Quan
Li Xiaomeng
Petrick James
Qin Jing
Publication venue: ScholarWorks@UMass Amherst
Publication date: 15/06/2023

Based on the travel blogs of 69 prefecture-level cities along the Yellow River basin, this study constructed a six-dimensional cultural perception system from the perspective of tourists, analyzed the similarities and differences in cultural perception across the 6 dimensions among the 69 prefecture-level cities using a deep learning textual thematic recognition method, and further proposed 10 cultural tourism regions with different cultural themes through the combination of a network-oriented geographic regionalization method and a text mining method. Lastly, the comparison of tourists’ cultural perceptions and emotional evaluations across the 10 cultural tourism regions led to the conclusion that it is necessary to promote the Yellow River basin as an integrated and compound destination. This study contributed to the growing literature on cultural perception in travel and tourism research and proposed an approach to develop regional tourism for destination marketing organizations in the Yellow River basin.

ScholarWorks@UMass Amherst

HiLM-D: Towards High-Resolution Understanding in Multimodal Large Language Models for Autonomous Driving

Author: Ding Xinpeng
Han Jianhua
Li Xiaomeng
Xu Hang
Zhang Wei
Publication date: 10/09/2023

Autonomous driving systems generally employ separate models for different tasks, resulting in intricate designs. For the first time, we leverage singular multimodal large language models (MLLMs) to consolidate multiple autonomous driving tasks from videos, i.e., the Risk Object Localization and Intention and Suggestion Prediction (ROLISP) task. ROLISP uses natural language to simultaneously identify and interpret risk objects, understand ego-vehicle intentions, and provide motion suggestions, eliminating the necessity for task-specific architectures. However, lacking high-resolution (HR) information, existing MLLMs often miss small objects (e.g., traffic cones) and overly focus on salient ones (e.g., large trucks) when applied to ROLISP. We propose HiLM-D (Towards High-Resolution Understanding in MLLMs for Autonomous Driving), an efficient method to incorporate HR information into MLLMs for the ROLISP task. Specifically, HiLM-D integrates two branches: (i) the low-resolution reasoning branch, which can be any MLLM, processes low-resolution videos to caption risk objects and discern ego-vehicle intentions/suggestions; (ii) the high-resolution perception branch (HR-PB), prominent to HiLM-D, ingests HR images to enhance detection by capturing vision-specific HR feature maps and prioritizing all potential risks over merely salient objects. Our HR-PB serves as a plug-and-play module, seamlessly fitting into current MLLMs. Experiments on the ROLISP benchmark reveal HiLM-D's notable advantage over leading MLLMs, with improvements of 4.8% in BLEU-4 for captioning and 17.2% in mIoU for detection.

arXiv.org e-Print Archive

Angle-Aware and Tone-Aware Luminosity Analysis for Paper Model Surface

Author: Guangyuan Wu
Jingjing Liu
Mingming Cui
Xiaomeng Han
Xiaozhou Li*
Publication venue: 'Mechanical Engineering Faculty in Slavonski Brod'
Publication date: 01/01/2021

Luminosity contributes to the paper model surface perception. It has a significant impact on the perception of colour and details. The main purpose of this paper is to study the reflection luminosity of a paper model surface, which can be a complex or difficult-shaped surface. The final perception quality of a product, whether it is plain or 3D or of another shape, depends on the surface luminosity perceived by the receptor, such as eyes or measurement instruments. However, the number of parameters and limits of the paper model surface is enormous. It is time-consuming to select every parameter by a trial-and-error procedure. For a paper surface under a fixed lighting environment, the most important factors determining the performance of perception are commonly viewing angle and surface tone. Therefore, these two related terms, perception angle and surface tone, were chosen for the analysis process. The final analysis, based on the initial conditions, made it possible to predict the perception of the paper model surface and to set the optimal perceived angles and tones. It also proposed the next step: modelling the perception of paper model surfaces of different shapes in a relatively short period.

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

Luminance Prediction of Paper Model Surface Based on Non-Contact Measurement

Author: Guangyuan Wu
Jingjing Liu
Mingming Cui
Xiaomeng Han
Xiaozhou Li*
Xuelin Li
Publication venue: 'Mechanical Engineering Faculty in Slavonski Brod'
Publication date: 01/01/2021

The overall appearance perception is mostly affected by the accuracy and efficiency of luminance perception. The surface luminance prediction, correlated with surface angle and surface tone value, was performed by measuring and modeling the paper model surface luminance. First, we used a rotating bracket designed to facilitate setting the paper surface angle. Then, we set the surface angle from 5° to 85° at intervals of 5° using the designed rotating bracket. Additionally, the four primary color scales, cyan, magenta, yellow, and black, were printed and set at the designed angle. The angle-aware and tone-aware luminance was measured using a CS-2000 spectroradiometer. Finally, we proposed and evaluated a mathematical model to reveal the relationship between luminance and surface angle and surface tone using the least squares method. The results indicated that the surface luminance of the paper model could be predicted quickly and accurately for any surface angle and surface tone value by the proposed prediction model.

Directory of Open Access Journals

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

OSM-Net: One-to-Many One-shot Talking Head Generation with Spontaneous Head Motions

Author: Chai Yesheng
Dai Jiao
Fu Xiaomeng
Han Jizhong
Liu Jin
Wang Xi
Yu Cai
Publication date: 27/09/2023

One-shot talking head generation has no explicit head movement reference, so it is difficult to generate talking heads with head motions. Some existing works only edit the mouth area and generate still talking heads, leading to unrealistic talking head performance. Other works construct a one-to-one mapping between audio signals and head motion sequences, introducing ambiguous correspondences into the mapping, since people can behave differently in head motions when speaking the same content. This unreasonable mapping form fails to model the diversity and produces either nearly static or even exaggerated head motions, which are unnatural and strange. Therefore, the one-shot talking head generation task is actually a one-to-many ill-posed problem, and people present diverse head motions when speaking. Based on the above observation, we propose OSM-Net, a one-to-many one-shot talking head generation network with natural head motions. OSM-Net constructs a motion space that contains rich and various clip-level head motion features. Each basis of the space represents a feature of meaningful head motion in a clip rather than just a frame, thus providing more coherent and natural motion changes in talking heads. The driving audio is mapped into the motion space, around which various motion features can be sampled within a reasonable range to achieve the one-to-many mapping. In addition, the landmark constraint and time window feature input improve the accuracy of expression feature extraction and video generation. Extensive experiments show that OSM-Net generates more natural and realistic head motions under a reasonable one-to-many mapping paradigm compared with other methods. Comment: Paper Under Review

arXiv.org e-Print Archive

MFR-Net: Multi-faceted Responsive Listening Head Generation via Denoising Diffusion Model

Author: Chai Yesheng
Dai Jiao
Fu Xiaomeng
Han Jizhong
Liu Jin
Wang Xi
Yu Cai
Publication date: 31/08/2023

Face-to-face communication is a common scenario involving the roles of speaker and listener. Most existing research methods focus on producing speaker videos, while the generation of listener heads remains largely overlooked. Responsive listening head generation is an important task that aims to model face-to-face communication scenarios by generating a listener head video given a speaker video and a listener head image. An ideal generated responsive listening video should respond to the speaker by expressing attitude or viewpoint while maintaining diversity in interaction patterns and accuracy in listener identity information. To achieve this goal, we propose the Multi-Faceted Responsive Listening Head Generation Network (MFR-Net). Specifically, MFR-Net employs the probabilistic denoising diffusion model to predict diverse head pose and expression features. In order to perform multi-faceted response to the speaker video while maintaining accurate listener identity preservation, we design the Feature Aggregation Module to boost listener identity features and fuse them with other speaker-related features. Finally, a renderer finetuned with identity consistency loss produces the final listening head videos. Our extensive experiments demonstrate that MFR-Net not only achieves multi-faceted responses in diversity and speaker identity information but also in attitude and viewpoint expression. Comment: Accepted by ACM MM 202

arXiv.org e-Print Archive

FONT: Flow-guided One-shot Talking Head Generation with Natural Head Motions

Author: Chai Yesheng
Dai Jiao
Fu Xiaomeng
Han Jizhong
Liu Jin
Wang Xi
Yu Cai
Publication date: 30/03/2023

One-shot talking head generation has received growing attention in recent years, with various creative and practical applications. An ideal natural and vivid generated talking head video should contain natural head pose changes. However, it is challenging to map head pose sequences from driving audio since there exists a natural gap between the audio and visual modalities. In this work, we propose a Flow-guided One-shot model that achieves NaTural head motions (FONT) over generated talking heads. Specifically, the head pose prediction module is designed to generate head pose sequences from the source face and driving audio. We add a random sampling operation and a structural similarity constraint to model the diversity in the one-to-many mapping between the audio and visual modalities, thus predicting natural head poses. Then we develop a keypoint predictor that produces unsupervised keypoints from the source face, driving audio and pose sequences to describe the facial structure information. Finally, a flow-guided occlusion-aware generator is employed to produce photo-realistic talking head videos from the estimated keypoints and source face. Extensive experimental results show that FONT generates talking heads with natural head poses and synchronized mouth shapes, outperforming other compared methods. Comment: Accepted by ICME202

arXiv.org e-Print Archive

OPT: One-shot Pose-Controllable Talking Head Generation

Author: Chai Yesheng
Dai Jiao
Fu Xiaomeng
Han Jizhong
Liu Jin
Wang Xi
Yu Cai
Publication date: 16/02/2023

One-shot talking head generation produces lip-synced talking heads based on arbitrary audio and one source face. To guarantee naturalness and realism, recent methods propose to achieve free pose control instead of simply editing mouth areas. However, existing methods do not preserve the accurate identity of the source face when generating head motions. To solve the identity mismatch problem and achieve high-quality free pose control, we present the One-shot Pose-controllable Talking head generation network (OPT). Specifically, the Audio Feature Disentanglement Module separates content features from audio, eliminating the influence of speaker-specific information contained in arbitrary driving audio. Next, the mouth expression feature is extracted from the content feature and source face, during which the landmark loss is designed to enhance the accuracy of facial structure and the quality of identity preservation. Finally, to achieve free pose control, controllable head pose features from reference videos are fed into the Video Generator along with the expression feature and source face to generate new talking heads. Extensive quantitative and qualitative experimental results verify that OPT generates high-quality pose-controllable talking heads with no identity mismatch problem, outperforming previous SOTA methods. Comment: Accepted by ICASSP202

arXiv.org e-Print Archive

Preparation and Characterization of Chitosan/β-Glycerophosphate Thermal-Sensitive Hydrogel Reinforced by Graphene Oxide

Author: Han Qin
Jian Wang
Qianbing Wan
Tong Wang
Xiaomeng Gao
Xibo Pei
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2018

Thermal-sensitive hydrogel based on chitosan (CS) and β-glycerophosphate (GP) has shown good biocompatibility and biodegradability. However, the application of such hydrogels is limited by their poor mechanical properties. Recently, graphene oxide (GO) has been widely used as a reinforcement agent to prepare nanocomposites with different polymers for improving the properties of the materials. In this study, CS/GP-based hydrogels with different weight ratios of GO/CS (0.5, 1, 2%) were fabricated. The gelation time of the hydrogels at body temperature was evaluated by the tube inverting method. The gelation process during heating was monitored by rheological measurement. The morphology, porosity, chemical structure, and swelling properties of the lyophilized hydrogels were investigated by scanning electron microscopy, the liquid displacement method, Fourier transform infrared spectroscopy, and the gravimetric method, respectively. The mechanical properties of the hydrogels were analyzed by rheological measurement and unconfined compression tests. The MC3T3-E1 mouse pre-osteoblast cell line was used to assess the biological properties of the hydrogels. The results obtained from those assessments revealed that the addition of GO into CS/GP improved the properties of the prepared hydrogels without changing the highly porous and interconnected microstructure and swelling ability of the hydrogels. The gelation time at body temperature was significantly reduced by nearly 20% with the addition of a small amount of GO (0.5% weight ratio of CS). The mechanical properties of the hydrogels containing GO were improved significantly over those of CS/GP. The storage (G′)/loss (G″) moduli of the hydrogels with GO were 1.12 to 1.69 times those of CS/GP at the gelling temperature. The Young's modulus of the 0.5%GO/CS/GP hydrogel is 1.76 times that of CS/GP. Moreover, the 0.5%GO/CS/GP hydrogel revealed remarkable biological affinity such as cellular attachment, viability and proliferation. All of these results suggest that the 0.5%GO/CS/GP hydrogel has great potential for practical application in the biomedical field.

Directory of Open Access Journals

Frontiers - Publisher Connector

Tunable hysteresis effect for perovskite solar cells

Author: Anyi Mei
Daiyu Li
Hongwei Han
Huawei Liu
Juan Bisquert
Mi Xu
Qifei Wang
Sandheep Ravishankar
Xiaomeng Hou
Yaoguang Rong
Yue Hu
Yusong Sheng
Publication venue: 'Royal Society of Chemistry (RSC)'
Publication date: 01/01/2017

Perovskite solar cells (PSCs) usually suffer from a hysteresis effect in current–voltage measurements, which leads to an inaccurate estimation of the device efficiency. Although ion migration, charge trapping/detrapping, and accumulation have been proposed as a basis for the hysteresis, the origin of the hysteresis has not been clearly unraveled. Herein we report a tunable hysteresis effect based uniquely on open-circuit voltage variations in printable mesoscopic PSCs with a simplified triple-layer TiO2/ZrO2/carbon architecture. The electrons are collected by the compact TiO2/mesoporous TiO2 (c-TiO2/mp-TiO2) bilayer, and the holes are collected by the carbon layer. By adjusting the spray deposition cycles for the c-TiO2 layer and UV-ozone treatment, we achieved hysteresis-normal, hysteresis-free, and hysteresis-inverted PSCs. Such unique trends of tunable hysteresis are analyzed by considering the polarization of the TiO2/perovskite interface, which can accumulate positive charges reversibly. Successful tuning of the hysteresis effect clarifies the critical importance of the c-TiO2/perovskite interface in controlling the hysteretic trends observed, providing important insights towards the understanding of this rapidly developing photovoltaic technology.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

Repositori Institucional de la Universitat Jaume I