77 research outputs found

MARIO: Model Agnostic Recipe for Improving OOD Generalization of Graph Contrastive Learning

Author: Shi Haizhou
Tang Siliang
Zhang Zhenshuo
Zhu Yun
Publication date: 24/07/2023

In this work, we investigate the problem of out-of-distribution (OOD) generalization for unsupervised learning methods on graph data. This scenario is particularly challenging because graph neural networks (GNNs) have been shown to be sensitive to distributional shifts, even when labels are available. To address this challenge, we propose a Model-Agnostic Recipe for Improving OOD generalizability of unsupervised graph contrastive learning methods, which we refer to as MARIO. MARIO introduces two principles aimed at developing distributional-shift-robust graph contrastive methods to overcome the limitations of existing frameworks: (i) Information Bottleneck (IB) principle for achieving generalizable representations and (ii) Invariant principle that incorporates adversarial data augmentation to obtain invariant representations. To the best of our knowledge, this is the first work that investigates the OOD generalization problem of graph contrastive learning, with a specific focus on node-level tasks. Through extensive experiments, we demonstrate that our method achieves state-of-the-art performance on the OOD test set, while maintaining comparable performance on the in-distribution test set when compared to existing approaches. The source code for our method can be found at: https://github.com/ZhuYun97/MARIO

Comment: 20 pages, 15 figures

arXiv.org e-Print Archive

xDial-Eval: A Multilingual Open-Domain Dialogue Evaluation Benchmark

Author: D'Haro Luis Fernando
Li Haizhou
Shi Ke
Tang Chengguang
Tang Guohua
Zhang Chen
Publication date: 13/10/2023

Recent advancements in reference-free learned metrics for open-domain dialogue evaluation have been driven by the progress in pre-trained language models and the availability of dialogue data with high-quality human annotations. However, current studies predominantly concentrate on English dialogues, and the generalization of these metrics to other languages has not been fully examined. This is largely due to the absence of a multilingual dialogue evaluation benchmark. To address the issue, we introduce xDial-Eval, built on top of open-source English dialogue evaluation datasets. xDial-Eval includes 12 turn-level and 6 dialogue-level English datasets, comprising 14930 annotated turns and 8691 annotated dialogues respectively. The English dialogue data are extended to nine other languages with commercial machine translation systems. On xDial-Eval, we conduct comprehensive analyses of previous BERT-based metrics and the recently emerged large language models. Lastly, we establish strong self-supervised and multilingual baselines. In terms of average Pearson correlations over all datasets and languages, the best baseline outperforms OpenAI's ChatGPT by absolute improvements of 6.5% and 4.6% at the turn and dialogue levels respectively, albeit with far fewer parameters. The data and code are publicly available at https://github.com/e0397123/xDial-Eval

Comment: Accepted to EMNLP-2023 Findings

arXiv.org e-Print Archive

The Spatial and Temporal Dynamics of Rabies in China

Author: Adams James
Fang Wei
Guo Zhenyang
Han Na
Li Hao
Liang Guodong
Liu Haizhou
Rayner Simon
Tang Qing
Tao Xiaoyan
Wang Shumei
Yu Jinning
Publication venue: Public Library of Science
Publication date: 01/05/2012

Rabies is a major problem in developing countries and is responsible for more than 55,000 deaths annually. More than half of the cases occur in Asia, and China has the second highest incidence of rabies after India. Human rabies cases in China decreased during the early 1990s, but the virus began to re-emerge in the latter half of the decade and spread rapidly across the country with a corresponding increase in cases. To learn more about the epidemic, in 2006 the government implemented a trial surveillance program to sample and screen canine populations in locations where human cases were reported. In this work we selected a subset of samples (representative of the entire epidemic region) for sequencing, investigated the history and origin of the virus in China, and examined the variation from a geographical perspective. Our results indicate that the epidemic is primarily composed of a younger strain with a geographical dispersion that was consistent with the recorded spread of the virus and a second older strain that corresponds to a previous epidemic. This second group exhibits a different geographical pattern, and it appears that this strain remained at low levels throughout the country and was able to re-emerge as the epidemic took hold.

CiteSeerX

Public Library of Science (PLOS)

Directory of Open Access Journals

PubMed Central

Comparative genomics of Toll-like receptor signalling in five species

Author: Ait-ali Tahar
Anderson Susan I
Archibald Alan L
Cockett Noelle E
Corrales Nestor Lopez
Glass Elizabeth J
Jann Oliver C
Jensen Kirsty
King Annemarie
Tang Haizhou
Wu Chunhua
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Over the last decade, several studies have identified quantitative trait loci (QTL) affecting variation of immune related traits in mammals. Recent studies in humans and mice suggest that part of this variation may be caused by polymorphisms in genes involved in Toll-like receptor (TLR) signalling. In this project, we used a comparative approach to investigate the importance of TLR-related genes in comparison with other immunologically relevant genes for resistance traits in five species by associating their genomic location with previously published immune-related QTL regions. Results: We report the genomic localisation of TLR1-10 and ten associated signalling molecules in sheep and pig using in-silico and/or radiation hybrid (RH) mapping techniques and compare their positions with their annotated homologues in the human, cattle and mouse whole genome sequences. We also report medium-density RH maps for porcine chromosomes 8 and 13. A comparative analysis of the positions of previously published relevant QTLs allowed the identification of homologous regions that are associated with similar health traits in several species and which contain TLR related and other immunologically relevant genes. Additional evidence was gathered by examining relevant gene expression and association studies. Conclusion: This comparative genomic approach identified eight genes as potentially causative genes for variations of health related traits. These include susceptibility to clinical mastitis in dairy cattle, general disease resistance in sheep, cattle, humans and mice, and tolerance to protozoan infection in cattle and mice. Four TLR-related genes (TLR1, 6, MyD88, IRF3) appear to be the most likely candidate genes underlying QTL regions which control the resistance to the same or similar pathogens in several species. Further studies are required to investigate the potential role of polymorphisms within these genes.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Edinburgh Research Explorer

Academica-e

I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences

The I4U consortium was established to facilitate a joint entry to NIST speaker recognition evaluations (SRE). The latest edition of such a joint submission was in SRE 2018, in which the I4U submission was among the best-performing systems. SRE'18 also marks the 10-year anniversary of the I4U consortium's participation in the NIST SRE series of evaluations. The primary objective of the current paper is to summarize the results and lessons learned based on the twelve sub-systems and their fusion submitted to SRE'18. It is also our intention to present a shared view on the advancements, progress, and major paradigm shifts that we have witnessed as an SRE participant in the past decade from SRE'08 to SRE'18. In this regard, we have seen, among others, a paradigm shift from supervector representation to deep speaker embedding, and a switch of research challenge from channel compensation to domain adaptation.

Comment: 5 pages

arXiv.org e-Print Archive

HAL AMU

INRIA a CCSD electronic archive server

Hal-Diderot

A genetic variation map for chicken with 2.8 million single-nucleotide polymorphisms

Author: Aerts Andrea
Andersson Björn
Andersson Leif
Bartley Neil
Boardman Paul E
Bovenhuis Henk
Brandström Mikael
Bumstead Nat
Burt David W
Chen Chen
Chen Jie
Cheng Hans H
Consortium International Chicken Polymorphism Map
Crooijmans Richard P M A
Dai Mingtao
de Koning Dirk-Jan
Dong Le
Dong Wei
Ellegren Hans
Glavina Tijana
Gordon Laurie
Groenen Martien A M
Gunnarsson Ulrika
Hao Bailin
He Dandan
He Ximiao
Hillier Ladeana W
Hocking Paul M
Hu Songnian
Huang Xiangang
Huang Yanqing
Hubbard Simon J
Hunt Henry
Kaiser Pete
Kaufman Jim
Kindlund Ellen
Lamont Susan J
Lan Fengdi
Law Andy
Li Dawei
Li Guangyuan
Li Guoqing
Li Heng
Li Jun
Li Ning
Li Ruiqiang
Li Shengting
Li Songgang
Li Wenjie
Li Yuanzhe
Lin Wei
Liu Bin
Lucas Susan
Meng Qingshun
Morrice David
Ni Peixiang
Ovcharenko Ivan
Overton Ian M
Ponting Chris
Qi Qiuhui
Ran Longhua
Rogers Sally
Rothwell Lisa
Ruan Jue
Shi Jianping
Stubbs Lisa
Sun Yongqiao
Tammi Martti T
Tang Haizhou
Tong Wei
van der Poel Jan J
van Hateren Andy
Wahlberg Per
Walker Brian A
Wang Jian
Wang Jianjun
Wang Jing
Wang Jun
Wang Miaoheng
Wang Pei
Wang Xiaoling
Warren Wesley C
Webber Caleb
Wei Dong Qing
Wei Ning
Wilson Richard K
Wilson Stuart A
Wong Gane Ka-Shu
Xi Yan
Xie Fei
Yang Huanming
Yang Ning
Yang Shiaw-Pyng
Yang Xu
Yang Zheng
Ye Chen
Ye Jia
Young John R
Yu Jun
Yu Yingpu
Zeng Changqing
Zhang Jianguo
Zhang Jingjing
Zhang Xiaowei
Zhang Yunze
Zhang Zengjin
Zhang Zhenpeng
Zhang Zhi-Yong
Zhao Wenming
Zhao Yiqiang
Zheng Hongkun
Zheng Weimou
Zhou Huaijun
Zhou Jun
Zhou Yan
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2004

We describe a genetic variation map for the chicken genome containing 2.8 million single-nucleotide polymorphisms (SNPs). This map is based on a comparison of the sequences of three domestic chicken breeds (a broiler, a layer and a Chinese silkie) with that of their wild ancestor, red jungle fowl. Subsequent experiments indicate that at least 90% of the variant sites are true SNPs, and at least 70% are common SNPs that segregate in many domestic breeds. Mean nucleotide diversity is about five SNPs per kilobase for almost every possible comparison between red jungle fowl and domestic lines, between two different domestic lines, and within domestic lines, in contrast to the notion that domestic animals are highly inbred relative to their wild ancestors. In fact, most of the SNPs originated before domestication, and there is little evidence of selective sweeps for adaptive alleles on length scales greater than 100 kilobases.

Queen's University Belfast Research Portal

Edinburgh Research Explorer

Wageningen University & Research Publications

University of Gloucestershire Research Repository

The University of Manchester - Institutional Repository

University of Queensland eSpace

Digital Repository @ Iowa State University (ISU)

Online Research @ Cardiff

Oxford University Research Archive