65 research outputs found
MAPS-KB: A Million-scale Probabilistic Simile Knowledge Base
The ability to understand and generate similes is an imperative step to
realize human-level AI. However, there is still a considerable gap between
machine intelligence and human cognition in similes, since deep models based on
statistical distribution tend to favour high-frequency similes. Hence, a
large-scale symbolic knowledge base of similes is required, as it contributes
to the modeling of diverse yet unpopular similes while facilitating additional
evaluation and reasoning. To bridge the gap, we propose a novel framework for
large-scale simile knowledge base construction, as well as two probabilistic
metrics which enable an improved understanding of simile phenomena in natural
language. Overall, we construct MAPS-KB, a million-scale probabilistic simile
knowledge base, covering 4.3 million triplets over 0.4 million terms from 70 GB
corpora. We conduct sufficient experiments to justify the effectiveness and
necessity of the methods of our framework. We also apply MAPS-KB on three
downstream tasks to achieve state-of-the-art performance, further demonstrating
the value of MAPS-KB.
Comment: Accepted to AAAI 202
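To illustrate what probabilistic scores over simile triplets might look like, here is a minimal sketch. The triplets and the two conditional-probability metrics below are invented for illustration and are not the paper's actual data or definitions.

```python
from collections import Counter

# Hypothetical simile triplets (topic, property, vehicle); illustrative
# only -- not taken from MAPS-KB.
triples = [
    ("cheeks", "red", "apple"),
    ("cheeks", "red", "rose"),
    ("face",   "red", "apple"),
    ("heart",  "cold", "ice"),
]

def conditional_probs(triples):
    """Two probabilistic views of a (topic, vehicle) pair: how often the
    topic evokes this vehicle, and how often the vehicle serves this topic."""
    pair_counts = Counter((t, v) for t, _, v in triples)
    topic_counts = Counter(t for t, _, v in triples)
    vehicle_counts = Counter(v for _, _, v in triples)
    p_vehicle_given_topic = {k: c / topic_counts[k[0]] for k, c in pair_counts.items()}
    p_topic_given_vehicle = {k: c / vehicle_counts[k[1]] for k, c in pair_counts.items()}
    return p_vehicle_given_topic, p_topic_given_vehicle

fwd, rev = conditional_probs(triples)
print(fwd[("cheeks", "apple")])  # 0.5: "cheeks" pairs with "apple" in 1 of its 2 similes
```

Scores of this kind make rare but valid pairings visible even when their raw corpus frequency is low, which is the motivation the abstract gives for a probabilistic rather than purely frequency-ranked resource.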
Language Models as Knowledge Embeddings
Knowledge embeddings (KE) represent a knowledge graph (KG) by embedding
entities and relations into continuous vector spaces. Existing methods are
mainly structure-based or description-based. Structure-based methods learn
representations that preserve the inherent structure of KGs. They cannot well
represent abundant long-tail entities in real-world KGs with limited structural
information. Description-based methods leverage textual information and
language models. Prior approaches in this direction barely outperform
structure-based ones, and suffer from problems like expensive negative sampling
and restrictive description demand. In this paper, we propose LMKE, which
adopts Language Models to derive Knowledge Embeddings, aiming at both enriching
representations of long-tail entities and solving problems of prior
description-based methods. We formulate description-based KE learning with a
contrastive learning framework to improve efficiency in training and
evaluation. Experimental results show that LMKE achieves state-of-the-art
performance on KE benchmarks of link prediction and triple classification,
especially for long-tail entities.
Comment: This revision corrects some texts after fixing a data leakage issue
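The contrastive formulation described above can be sketched with an in-batch InfoNCE loss, where each (head, relation) description is matched against its tail entity's description and the other batch members serve as negatives. The toy hash-based encoder below only stands in for the pretrained language model, and the example descriptions are hypothetical.

```python
import hashlib
import numpy as np

def encode(text: str, dim: int = 64) -> np.ndarray:
    """Toy deterministic text encoder standing in for a pretrained LM:
    a seeded random unit vector per string (assumption for this sketch)."""
    seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:4], "big")
    v = np.random.default_rng(seed).normal(size=dim)
    return v / np.linalg.norm(v)

def info_nce_loss(anchors: np.ndarray, positives: np.ndarray,
                  temperature: float = 0.05) -> float:
    """In-batch contrastive loss: each anchor's positive is the
    same-index row; all other rows serve as negatives."""
    logits = anchors @ positives.T / temperature   # (B, B) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

# A triple (h, r, t) is scored by matching the (h, r) description pair
# against the tail entity's description (names here are illustrative).
heads = np.stack([encode("aspirin [SEP] treats"), encode("paris [SEP] capital_of")])
tails = np.stack([encode("headache"), encode("france")])
loss = info_nce_loss(heads, tails)
print(round(loss, 4))
```

Because every batch element acts as a negative for every other, this setup avoids the expensive explicit negative sampling the abstract criticizes in prior description-based methods.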
An Efficient Robust Eye Localization by Learning the Convolution Distribution Using Eye Template
Eye localization is a fundamental process in many facial analyses. In practical use, it is often challenged by illumination, head pose, facial expression, occlusion, and other factors. It remains difficult to achieve high accuracy with short prediction time and low training cost at the same time. This paper presents a novel eye localization approach which explores only a one-layer convolution map produced by an eye template and refined with a BP network. Results showed that the proposed method is robust in many difficult situations. In experiments, accuracies of 98% and 96% on the BioID and LFPW test sets, respectively, were achieved at a 10 fps prediction rate with only 15 minutes of training. In comparison with other robust models, the proposed method obtains similar best results with greatly reduced training time and high prediction speed.
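The template-matching stage behind the "one-layer convolution map" can be sketched as a normalized cross-correlation of the image with an eye template, taking the peak as the candidate location. The normalization and the synthetic data below are assumptions; the paper additionally refines this map with a BP network, which is omitted here.

```python
import numpy as np

def correlate2d(image: np.ndarray, template: np.ndarray) -> np.ndarray:
    """Valid-mode normalized cross-correlation map of image vs. template."""
    th, tw = template.shape
    t = (template - template.mean()) / (template.std() + 1e-8)
    out = np.empty((image.shape[0] - th + 1, image.shape[1] - tw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + th, j:j + tw]
            p = (patch - patch.mean()) / (patch.std() + 1e-8)
            out[i, j] = (p * t).mean()
    return out

def localize_eye(image: np.ndarray, template: np.ndarray) -> tuple:
    """Return (row, col) of the best template match in the correlation map."""
    score = correlate2d(image, template)
    return np.unravel_index(np.argmax(score), score.shape)

# Synthetic demo: plant the template inside a noisy image.
rng = np.random.default_rng(0)
template = rng.normal(size=(5, 5))
image = rng.normal(scale=0.1, size=(40, 40))
image[12:17, 20:25] += template
print(localize_eye(image, template))   # best match near (12, 20)
```

A single correlation map like this is cheap to compute, which is consistent with the low training cost and high prediction speed the abstract reports.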
Can Large Language Models Understand Real-World Complex Instructions?
Large language models (LLMs) can understand human instructions, showing their
potential for pragmatic applications beyond traditional NLP tasks. However,
they still struggle with complex instructions, which can be either complex task
descriptions that require multiple tasks and constraints, or complex input that
contains long context, noise, heterogeneous information and multi-turn format.
Due to these features, LLMs often ignore semantic constraints from task
descriptions, generate incorrect formats, violate length or sample count
constraints, and produce output unfaithful to the input text. Existing benchmarks are
insufficient to assess LLMs' ability to understand complex instructions, as
they are closed-ended and simple. To bridge this gap, we propose CELLO, a
benchmark for evaluating LLMs' ability to follow complex instructions
systematically. We design eight features for complex instructions and construct
a comprehensive evaluation dataset from real-world scenarios. We also establish
four criteria and develop corresponding metrics, as current ones are
inadequate, biased or too strict and coarse-grained. We compare the performance
of representative Chinese-oriented and English-oriented models in following
complex instructions through extensive experiments. Resources of CELLO are
publicly available at https://github.com/Abbey4799/CELLO.
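Rule-based checks in the spirit of the constraint types listed above (format, length, sample count) can be sketched as follows; these are illustrative checks, not CELLO's actual criteria or metrics.

```python
import json

def check_constraints(output: str, max_len: int, n_items: int) -> dict:
    """Toy checks on a model response that was asked to return a JSON
    list of n_items entries within max_len characters (hypothetical task)."""
    results = {
        "valid_json": False,                 # format constraint
        "length_ok": len(output) <= max_len, # length constraint
        "count_ok": False,                   # sample-count constraint
    }
    try:
        data = json.loads(output)
        results["valid_json"] = isinstance(data, list)
        results["count_ok"] = isinstance(data, list) and len(data) == n_items
    except json.JSONDecodeError:
        pass
    return results

print(check_constraints('["a", "b", "c"]', max_len=50, n_items=3))
```

Even a checker this simple separates the failure modes the abstract names: a response can be well-formed JSON yet violate the count constraint, or faithful in content yet overrun the length budget.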
Genetic Properties of a Nested Association Mapping Population Constructed With Semi-Winter and Spring Oilseed Rapes
Nested association mapping (NAM) populations have been widely applied to dissect the genetic basis of complex quantitative traits in a variety of crops. In this study, we developed a Brassica napus NAM (BN-NAM) population consisting of 15 recombination inbred line (RIL) families with 2,425 immortal genotypes. Fifteen high-density genetic linkage maps were constructed by genotyping by sequencing (GBS) based on all RIL families, with further integration into a joint linkage map (JLM) having 30,209 unique markers in common with multiple linkage maps. Furthermore, an ultra-density whole-genome variation map was constructed by projecting 4,444,309 high-quality variants onto the JLM. The NAM population captured a total of 88,542 recombination events (REs). The uneven distribution of recombination rate along chromosomes is positively correlated with the densities of genes and markers, but negatively correlated with the density of transposable elements and linkage disequilibrium (LD). Analyses of population structure and principal components revealed that the BN-NAM population could be divided into three groups with weak stratification. The LD decay distance across the genome varied between 170 and 2,400 Kb, with LD decay more rapid in the A than in the C sub-genome. The pericentromeric regions contained large LD blocks, especially in the C sub-genome. This NAM population provides a valuable resource for dissecting the genetic basis of important traits in rapeseed, especially in semi-winter oilseed rape.
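The LD decay curve behind figures like the 170-2,400 Kb range above is typically computed as mean pairwise r^2 binned by physical distance. The sketch below shows one common way to do this on a tiny synthetic genotype matrix; the data and bin choices are illustrative, not the study's pipeline.

```python
import numpy as np

def ld_r2(g1: np.ndarray, g2: np.ndarray) -> float:
    """Squared Pearson correlation (r^2) between two markers coded 0/1/2,
    a standard genotype-based linkage-disequilibrium statistic."""
    c = np.corrcoef(g1, g2)[0, 1]
    return c * c

def ld_decay(genotypes, positions, max_dist=2_400_000, bin_size=100_000):
    """Mean r^2 per physical-distance bin; the distance at which this
    curve falls to half its initial value is one common definition of
    the LD decay distance."""
    n_markers = genotypes.shape[1]
    bins = [[] for _ in range(max_dist // bin_size)]
    for i in range(n_markers):
        for j in range(i + 1, n_markers):
            d = abs(positions[j] - positions[i])
            if d < max_dist:
                bins[d // bin_size].append(ld_r2(genotypes[:, i], genotypes[:, j]))
    return [float(np.mean(b)) if b else float("nan") for b in bins]

# Tiny synthetic example: 5 individuals, 3 markers; markers 1 and 2 are
# identical (perfect LD), marker 3 sits 1 Mb away and is only loosely linked.
g = np.array([[0, 0, 1],
              [1, 1, 2],
              [2, 2, 0],
              [0, 0, 2],
              [1, 1, 1]])
pos = [0, 50_000, 1_000_000]
curve = ld_decay(g, pos)
print(round(curve[0], 3))   # r^2 = 1.0 for the identical marker pair
```

Comparing such curves between the A and C sub-genome marker sets is what supports statements like "LD decay more rapid in the A than in the C sub-genome".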
Assouad Dimensions and Lower Dimensions of Some Moran Sets
We prove that the lower dimensions of a class of Moran sets coincide with their Hausdorff dimensions and obtain a formula for the lower dimensions. Subsequently, we consider some homogeneous Cantor sets, which belong to the class of Moran sets, and give counterexamples in which the Assouad dimension is not equal to the upper box dimension and the packing dimension when the condition that the smallest compression ratio c_* > 0 is not satisfied.
Multifractal Structure of the Divergence Points of Some Homogeneous Moran Measures
A point x for which the limit lim_{r→0} log μ(B(x, r)) / log r does not exist is called a divergence point. Recently, the multifractal structure of the divergence points of self-similar measures has been investigated by many authors. This paper is devoted to the study of some Moran measures supported on homogeneous Moran fractals associated with sequences for which the frequency of each letter exists; the Moran measures associated with this kind of structure are neither Gibbs nor self-similar, and are therefore more complex. Such measures possess singular features because of the existence of so-called divergence points. By the box-counting principle, we analyze the multifractal structure of the divergence points of some homogeneous Moran measures and show that the Hausdorff dimension of the set of divergence points is the same as the dimension of the whole Moran set.
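In standard multifractal notation (not quoted from the paper), the divergence points described above are the points where the lower and upper local dimensions of the measure μ disagree:

```latex
\[
  \underline{d}_\mu(x)=\liminf_{r\to 0}\frac{\log\mu(B(x,r))}{\log r},
  \qquad
  \overline{d}_\mu(x)=\limsup_{r\to 0}\frac{\log\mu(B(x,r))}{\log r},
\]
\[
  D(\mu)=\Bigl\{x\in\operatorname{supp}\mu \;:\;
    \underline{d}_\mu(x)<\overline{d}_\mu(x)\Bigr\}.
\]
```

The abstract's result can then be read as dim_H D(μ) equaling the Hausdorff dimension of the whole Moran set, i.e. the divergence set is dimensionally as large as the support itself.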