
721 research outputs found

The geography of city liveliness and consumption: evidence from location-based big data

Author: Li Chengyu
Wang Jianghao
Wang Mark
Wu Wenjie
Publication venue: Spatial Economics Research Centre (SERC), London School of Economics and Political Science
Publication date: 01/11/2016

Understanding the complexity in the connection between city liveliness and spatial configurations for consumptive amenities has been an important but understudied research field in fast-urbanising countries like China. This paper presents the first step towards filling this gap through location-based big data perspectives. City liveliness is measured by aggregated space-time human activity intensities using mobile phone positioning data. Consumptive amenities are identified by point-of-interest data from the Chinese Yelp website (dian ping). The results provide insights into the geographic contextual uncertainties of consumptive amenities in shaping the rise and fall in the vibrancy of city liveliness.

Heriot Watt Pure

LSE Research Online

Estimating Target Heights Based on the Earth Curvature Model and Micromultipath Effect in Skywave OTH Radar

Author: Chen Jiawei
Hou Chengyu
Wang Yuxin
Publication venue: Hindawi Limited
Publication date: 01/01/2014

Skywave over-the-horizon (OTH) radar systems have important long-range strategic warning value. They exploit skywave propagation, the reflection of high-frequency signals from the ionosphere, which provides ultra-long-range surveillance capabilities to detect and track maneuvering targets. Current OTH radar systems are capable of localizing targets in range and azimuth but are unable to achieve reliable instantaneous altitude estimation. Most existing height measurement methods for skywave OTH radar systems take advantage of the micromultipath effect and have been formulated under the flat earth model. However, the flat earth model is not appropriate, since large errors are inevitable when the detection range exceeds one thousand kilometers. To avoid the error caused by the flat earth model, this paper introduces an earth curvature model into OTH radar altimetry methods. The simulation results show that applying the earth curvature model can effectively reduce the estimation error.

Crossref

Directory of Open Access Journals

Debiasing Made State-of-the-art: Revisiting the Simple Seed-based Weak Supervision for Text Classification

Author: Dong Chengyu
Shang Jingbo
Wang Zihan
Publication date: 24/05/2023

Recent advances in weakly supervised text classification mostly focus on designing sophisticated methods to turn high-level human heuristics into quality pseudo-labels. In this paper, we revisit the seed matching-based method, which is arguably the simplest way to generate pseudo-labels, and show that its power was greatly underestimated. We show that the limited performance of seed matching is largely due to the label bias injected by the simple seed-match rule, which prevents the classifier from learning reliable confidence for selecting high-quality pseudo-labels. Interestingly, simply deleting the seed words present in the matched input texts can mitigate the label bias and help learn better confidence. Subsequently, the performance achieved by seed matching can be improved significantly, making it on par with or even better than the state-of-the-art. Furthermore, to handle the case when the seed words are not made known, we propose to simply delete the word tokens in the input text randomly with a high deletion ratio. Remarkably, seed matching equipped with this random deletion method can often achieve even better performance than that with seed deletion.
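The seed-matching idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the seed lists, the tie-breaking rule, the tokenizer, and the 0.9 deletion ratio are all assumptions made for illustration (the abstract only says the ratio is "high"), and the helper names are hypothetical.

```python
import random
import re

# Hypothetical seed words per class, for illustration only (not from the paper).
SEEDS = {
    "sports": {"football", "tennis"},
    "tech": {"software", "laptop"},
}
ALL_SEEDS = set().union(*SEEDS.values())


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def seed_match(tokens):
    """Return the class with the most seed-word hits, or None on no hit or a tie."""
    counts = {label: sum(tok in seeds for tok in tokens)
              for label, seeds in SEEDS.items()}
    best = max(counts.values())
    winners = [label for label, count in counts.items() if count == best]
    return winners[0] if best > 0 and len(winners) == 1 else None


def delete_seeds(tokens):
    """Seed deletion: drop every seed word so the classifier cannot rely on it."""
    return [tok for tok in tokens if tok not in ALL_SEEDS]


def random_delete(tokens, ratio=0.9, rng=None):
    """Random deletion: drop each token with probability `ratio` (assumed value)."""
    rng = rng or random.Random(0)
    return [tok for tok in tokens if rng.random() >= ratio]


def pseudo_label(texts, corrupt=delete_seeds):
    """Build (tokens, label) training pairs from texts that match a seed word."""
    examples = []
    for text in texts:
        tokens = tokenize(text)
        label = seed_match(tokens)
        if label is not None:
            examples.append((corrupt(tokens), label))
    return examples
```

Passing `corrupt=random_delete` to `pseudo_label` switches from seed deletion to random deletion; the resulting pairs would then be used to train an ordinary text classifier, which this sketch leaves out.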

arXiv.org e-Print Archive

Variational Estimation for Multidimensional Generalized Partial Credit Model

Author: Cui Chengyu
Wang Chun
Xu Gongjun
Publication date: 23/01/2024

Multidimensional item response theory (MIRT) models have generated increasing interest in the psychometrics literature. Efficient approaches for estimating MIRT models with dichotomous responses have been developed, but constructing an equally efficient and robust algorithm for polytomous models has received limited attention. To address this gap, this paper presents a novel Gaussian variational estimation algorithm for the multidimensional generalized partial credit model (MGPCM). The proposed algorithm is both fast and accurate, as illustrated through a series of simulation studies and two real data analyses.

arXiv.org e-Print Archive

ConaCLIP: Exploring Distillation of Fully-Connected Knowledge Interaction Graph for Lightweight Text-Image Retrieval

Author: Huang Jun
Jin Lianwen
Wang Chengyu
Wang Jiapeng
Wang Xiaodan
Publication date: 28/05/2023

Large-scale pre-trained text-image models with dual-encoder architectures (such as CLIP) are typically adopted for various vision-language applications, including text-image retrieval. However, these models are still less practical on edge devices or in real-time situations, due to the substantial indexing and inference time and the large consumption of computational resources. Although knowledge distillation techniques have been widely used for uni-modal model compression, how to extend them to the situation where the numbers of modalities and teachers/students are doubled has rarely been studied. In this paper, we conduct comprehensive experiments on this topic and propose the fully-Connected knowledge interaction graph (Cona) technique for cross-modal pre-training distillation. Based on our findings, the resulting ConaCLIP achieves SOTA performance on the widely used Flickr30K and MSCOCO benchmarks under the lightweight setting. An industry application of our method on an e-commerce platform further demonstrates the significant effectiveness of ConaCLIP. Comment: ACL 2023 Industry Trac

arXiv.org e-Print Archive

Distances to the Supernova Remnants in the Inner Disk

Author: Chen Bingqiu
Chen Xiaodian
Gao Jian
Jiang Biwei
Liu Jifeng
Wang Shu
Zhang Chengyu
Zhao He
Publication venue: EDP Sciences
Publication date: 17/05/2020

Distance measurements of supernova remnants (SNRs) are essential. Accurate estimates of physical size, dust mass, and some other properties of SNRs depend critically on accurate distance measurements. However, the determination of SNR distances is still a tough task. Red clump stars (RCs) have long been used as standard candles. In this work, we take RCs as tracers to determine the distances to a large group of SNRs in the inner disk. We first select RC stars based on the near-infrared (IR) color-magnitude diagram (CMD). Then, the distance to and extinction of RC stars are calculated. To extend the measurable range of distance, we combine near-IR photometric data from the 2MASS survey with the deeper UKIDSS and VVV surveys. With the help of the Gaia parallaxes, we also remove contaminants including dwarfs and giants. Because an SN explosion compresses the surrounding interstellar medium, the SNR region would become denser and exhibit higher extinction than the surroundings. The distance of an SNR is then recognized by the position where the extinction and its gradient are higher than those of the ambient medium. The distances to a total of 63 SNRs in the Galactic inner disk are determined and divided into three levels, A, B, and C, with decreasing reliability. The distances to 43 SNRs are well determined with reliability A or B. The diameters and dust masses of SNRs are estimated with the obtained distance and extinction. Comment: 31 pages, 25 figures, 2 tables, accepted for publication in A&
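The standard-candle step mentioned in this abstract rests on the distance modulus relation, m - M = 5 log10(d / 10 pc) + A. The Python sketch below only illustrates that textbook relation; it is not the authors' pipeline. The function names are hypothetical, and the absolute magnitude, intrinsic colour, and extinction coefficient are left as caller-supplied inputs because the abstract does not give the values used.

```python
import math


def extinction_from_colour(observed_colour, intrinsic_colour, coeff):
    """Extinction A = coeff * E, where E is the colour excess (observed - intrinsic).

    `coeff` depends on the adopted extinction law and is an input here,
    not a value taken from the paper.
    """
    return coeff * (observed_colour - intrinsic_colour)


def distance_pc(apparent_mag, absolute_mag, extinction):
    """Invert m - M = 5 * log10(d / 10 pc) + A to get the distance d in parsecs."""
    return 10 ** ((apparent_mag - absolute_mag - extinction) / 5 + 1)
```

For example, a star with apparent magnitude 10, absolute magnitude 0, and no extinction sits at `distance_pc(10.0, 0.0, 0.0)`, which is 1000 pc; any extinction along the line of sight lowers the inferred distance for the same apparent magnitude.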

arXiv.org e-Print Archive

EDP Sciences OAI-PMH repository (1.2.0)

Boosting In-Context Learning with Factual Knowledge

Author: Gao Ming
Huang Jun
Tan Chuanqi
Wang Chengyu
Wang Jianing
Publication date: 26/09/2023

In-Context Learning (ICL) over large language models (LLMs) aims at solving previously unseen tasks by conditioning on a few training examples, eliminating the need for parameter updates and achieving competitive performance. In this paper, we demonstrate that factual knowledge is imperative for the performance of ICL in three core facets, i.e., the inherent knowledge learned in LLMs, the factual knowledge derived from the selected in-context examples, and the knowledge biases in LLMs for output generation. To unleash the power of LLMs in few-shot learning scenarios, we introduce a novel Knowledgeable In-Context Tuning (KICT) framework to further improve the performance of ICL: 1) injecting factual knowledge into LLMs during continual self-supervised pre-training, 2) judiciously selecting the examples with high knowledge relevance, and 3) calibrating the prediction results based on prior knowledge. We evaluate the proposed approaches on auto-regressive LLMs (e.g., GPT-style models) over multiple text classification and question answering tasks. Experimental results demonstrate that KICT substantially outperforms strong baselines, improving accuracy by more than 13% and 7% on text classification and question answering tasks, respectively.

arXiv.org e-Print Archive