239 research outputs found

AiM: Taking Answers in Mind to Correct Chinese Cloze Tests in Educational Applications

Author: Cao Yunbo
Li Chao
Li Zhongli
Liu Hongzhi
Liu Ziyi
Ma Mina
Zhang Yusen
Zhou Qingyu
Publication date: 16/09/2022

To automatically correct handwritten assignments, the traditional approach is to use an OCR model to recognize characters and compare them to answers. The OCR model is easily confused when recognizing handwritten Chinese characters, and the textual information of the answers is missing during model inference. However, teachers always have these answers in mind when they review and correct assignments. In this paper, we focus on correcting Chinese cloze tests and propose a multimodal approach (named AiM). The encoded representations of answers interact with the visual information of students' handwriting. Instead of predicting 'right' or 'wrong', we perform sequence labeling on the answer text to infer, in a fine-grained way, which answer characters differ from the handwritten content. We take samples of OCR datasets as the positive samples for this task, and develop a negative sample augmentation method to scale up the training data. Experimental results show that AiM outperforms OCR-based methods by a large margin. Extensive studies demonstrate the effectiveness of our multimodal approach. Comment: Accepted to COLING 202

arXiv.org e-Print Archive

Rapid detection of structural variation in a human genome using nanochannel-based genome mapping technology

Author: Anantharaman Thomas
Andrews Warren
Cao Dandan
Cao Han
Cao Hongzhi
Chan Saki
Hastie Alex R.
Huang Haodong
Huang Shujia
Krogh Anders
Lam Ernest T.
Lin Liya
Liu Xiao
Requa Michael
Sun Yuhui
Tong Xin
Xu Xun
Yang Huanming
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

BACKGROUND: Structural variants (SVs) are less common than single nucleotide polymorphisms and indels in the population, but collectively account for a significant fraction of genetic polymorphism and diseases. Base pair differences arising from SVs are on a much higher order (>100 fold) than point mutations; however, none of the current detection methods are comprehensive, and currently available methodologies are incapable of providing sufficient resolution and unambiguous information across complex regions in the human genome. To address these challenges, we applied a high-throughput, cost-effective genome mapping technology to comprehensively discover genome-wide SVs and characterize complex regions of the YH genome using long single molecules (>150 kb) in a global fashion. RESULTS: Utilizing nanochannel-based genome mapping technology, we obtained 708 insertions/deletions and 17 inversions larger than 1 kb. Excluding the 59 SVs (54 insertions/deletions, 5 inversions) that overlap with N-base gaps in the reference assembly hg19, 666 non-gap SVs remained, and 396 of them (60%) were verified by paired-end data from whole-genome sequencing-based re-sequencing or de novo assembly sequence from fosmid data. Of the remaining 270 SVs, 260 are insertions and 213 overlap known SVs in the Database of Genomic Variants. Overall, 609 out of 666 (90%) variants were supported by experimental orthogonal methods or historical evidence in public databases. At the same time, genome mapping also provides valuable information for complex regions with haplotypes in a straightforward fashion. In addition, with long single-molecule labeling patterns, exogenous viral sequences were mapped on a whole-genome scale, and sample heterogeneity was analyzed at a new level. 
CONCLUSION: Our study highlights genome mapping technology as a comprehensive and cost-effective method for detecting structural variation and studying complex regions in the human genome, as well as deciphering viral integration into the host genome. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/2047-217X-3-34) contains supplementary material, which is available to authorized users

Springer - Publisher Connector

Copenhagen University Research Information System

PubMed Central

Case report of a Li-Fraumeni syndrome-like phenotype with a de novo mutation in CHEK2

Author: Cao Hongzhi
Chen Jianghao
He Chenyang
Huang Chen
Li Weiyang
Li Xinyang
Li Yongping
Lin Liya
Liu Jiayun
Liu Shuang
Liu Xiao
Lv Yonggang
Wang Ling
Wang Ting
Wang Zhen
Xu Xun
Ye Rui
Zhang Juliang
Zhuang Xuehan
Publication venue: 'Ovid Technologies (Wolters Kluwer Health)'
Publication date: 01/01/2016

BACKGROUND: Cases of multiple tumors are rarely reported in China. In our study, a 57-year-old female patient had concurrent squamous cell carcinoma, mucoepidermoid carcinoma, brain cancer, bone cancer, and thyroid cancer, a combination that has rarely been reported to date. METHODS: To determine the relationship among these multiple cancers, available DNA samples from the thyroid, lung, and skin tumors and from normal thyroid tissue were sequenced using whole exome sequencing. RESULTS: The notable discrepancies in somatic mutations among the 3 tumor tissues indicated that they arose independently, rather than metastasizing from 1 tumor. A novel deleterious germline mutation (chr22:29091846, G->A, p.H371Y) was identified in CHEK2, a Li–Fraumeni syndrome causal gene. Examining the status of this novel mutation in the patient's healthy siblings revealed its de novo origin. CONCLUSION: Our study reports the first Li–Fraumeni syndrome-like case in Chinese patients and demonstrates the important contribution of de novo mutations in this type of rare disease

Copenhagen University Research Information System

PubMed Central

The Genome of the Netherlands: design, and project goals

Author: Abdellaoui Abdel
Beekman Marian
Boomsma Dorret I.
Byelas Heorhiy
Cao Hongzhi
Cao Sujie
Chen Ruoyan
de Bakker Paul I. W.
de Craen Anton J. M.
de Knijff Peter
Deelen Patrick
den Dunnen Johan T.
Dijkstra Martijn
Du Yuanping
Elbers Clara C.
Estrada Karol
Francioli Laurent C.
Guryev Victor
Hehir-Kwa Jayne Y.
Hofman Albert
Hottenga Jouke Jan
Houwing-Duistermaat Jeanine
Kanterakis Alexandros
Karssen Lennart C.
Kattenberg Mathijs
Koval Vyacheslav
Laros Jeroen F. J.
Li Ning
Li Qibin
Li Yingrui
Mai Hailiang
Menelaou Androniki
Neerincx Pieter B. T.
Oostra Ben
Pulit Sara L.
Rivadeneira Fernando
Slagboom Eline P.
Suchiman Eka H. D.
Swertz Morris A.
Uitterlinden Andre G.
van Dijk Freerk
van Duijn Cornelia M.
van Enckevort David
van Leeuwen Elisabeth M.
van Ommen Gert-Jan
van Setten Jessica
Vermaat Martijn
Wang Jun
Wijmenga Cisca
Willemsen Gonneke
Wolffenbuttel Bruce H.
Ye Kai
Publication date: 29/05/2013

Within the Netherlands a national network of biobanks has been established (Biobanking and Biomolecular Research Infrastructure-Netherlands (BBMRI-NL)) as a national node of the European BBMRI. One of the aims of BBMRI-NL is to enrich biobanks with different types of molecular and phenotype data. Here, we describe the Genome of the Netherlands (GoNL), one of the projects within BBMRI-NL. GoNL is a whole-genome-sequencing project in a representative sample consisting of 250 trio-families from all provinces in the Netherlands, which aims to characterize DNA sequence variation in the Dutch population. The parent-offspring trios include adult individuals ranging in age from 19 to 87 years (mean = 53 years; SD = 16 years) from birth cohorts 1910-1994. Sequencing was done on blood-derived DNA from uncultured cells and accomplished coverage was 14-15x. The family-based design represents a unique resource to assess the frequency of regional variants, accurately reconstruct haplotypes by family-based phasing, characterize short indels and complex structural variants, and establish the rate of de novo mutational events. GoNL will also serve as a reference panel for imputation in the available genome-wide association studies in Dutch and other cohorts to refine association signals and uncover population-specific variants. GoNL will create a catalog of human genetic variation in this sample that is uniquely characterized with respect to micro-geographic location and a wide range of phenotypes. The resource will be made available to the research and medical community to guide the interpretation of sequencing projects. The present paper summarizes the global characteristics of the project.

Crossref

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

PubMed Central

Copenhagen University Research Information System

Dissertations of the University of Groningen

Discovery, genotyping and characterization of structural variation and novel sequence at single nucleotide resolution from de novo genome assemblies on a population scale

Author: Albrechtsen Anders
Andreas Sibbesen Jonas
Belling Kirstine González-Izarzugaza
Besenbacher Søren
Bolund Lars
Bork-Jensen Jette
Brunak Søren
Børglum Anders D.
Cao Hongzhi
Chang Yuqi
Cheng Xiaofang
Dam-Als Thomas
Demontis Ditte
Eiberg Hans
Friborg Rune M.
Gonzalez-Izarzugaza Jose Maria
Grove Jakob
Guo Xiaosen
Gupta Ramneek
Hansen Torben
Huang Shujia
Jiang Hui
Kristiansen Karsten
Krogh Anders
Lescai Francesco
Li Ning
Li Shengting
Liu Hao
Liu Siyang
Lund Ole
Mailund Thomas
Maretty Lasse
Nørgaard Flindt Esben
Pedersen Christian N. S.
Pedersen Oluf
Rao Junhua
Rasmussen Simon
Schierup Mikkel H.
Sun Jihua
Sørensen Thorkild I. A.
Theil Have Christian
Villesen Palle
Wang Jun
Wang Ou
Xu Ruiqi
Xu Xun
Yadav Rachita
Ye Chen
Ye Weijian
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

Online Research Database In Technology

The Genome of the Netherlands: Design, and project goals

Author: Abdellaoui A. (Abdel)
Bakker P.I.W. (Paul) de
Beekman M. (Marian)
Boomsma D.I. (Dorret)
Byelas H. (Heorhiy)
Cao H. (Hongzhi)
Cao S. (Sherry)
Chen R. (Ruoyan)
Craen A.J. (Anton) de
Deelen P. (Patrick)
Dijk F. (Freerk) van
Dijkstra M. (Martijn)
Du Y. (Yangchun)
Duijn C.M. (Cornelia) van
Dunnen J.T. (Johan) den
Elbers C.C. (Clara)
Enckevort D. (David) van
Estrada Gil K. (Karol)
Francioli L.C. (Laurent)
Guryev V. (Victor)
Hehir-Kwa J. (Jayne)
Hofman A. (Albert)
Hottenga J.J. (Jouke Jan)
Houwing-Duistermaat J.J. (Jeanine)
Kanterakis A. (Alexandros)
Karssen L.C. (Lennart)
Kattenberg V.M. (Mathijs)
Knijff P. (Peter) de
Koval V. (Vyacheslav)
Laros J.F.J. (Jeroen F.)
Leeuwen E.M. (Elisa) van
Li N. (Ning)
Li Q. (Qibin)
Li Y. (Yingrui)
Mai H. (Hailiang)
Menelaou A. (Androniki)
Neerincx P.B.T. (Pieter B T)
Ommen G.J. (Gert) van
Oostra B.A. (Ben)
Pulit S.L. (Sara)
Rivadeneira Ramirez F. (Fernando)
Setten J. (Jessica) van
Slagboom P.E. (Eline)
Suchiman H.E.D. (Eka)
Swertz M. (Morris)
Uitterlinden A.G. (André)
Vermaat J.S. (Joost)
Wang J. (Jinxia)
Wijmenga C. (Cisca)
Willemsen G.A.H.M. (Gonneke)
Wolffenbuttel B.H.R. (Bruce)
Ye K. (Kai)
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/02/2014

Within the Netherlands a national network of biobanks has been established (Biobanking and Biomolecular Research Infrastructure-Netherlands (BBMRI-NL)) as a national node of the European BBMRI. One of the aims of BBMRI-NL is to enrich biobanks with different types of molecular and phenotype data. Here, we describe the Genome of the Netherlands (GoNL), one of the projects within BBMRI-NL. GoNL is a whole-genome-sequencing project in a representative sample consisting of 250 trio-families from all provinces in the Netherlands, which aims to characterize DNA sequence variation in the Dutch population. The parent-offspring trios include adult individuals ranging in age from 19 to 87 years (mean=53 years; SD=16 years) from birth cohorts 1910-1994. Sequencing was done on blood-derived DNA from uncultured cells and accomplished coverage was 14-15x. The family-based design represents a unique resource to assess the frequency of regional variants, accurately reconstruct haplotypes by family-based phasing, characterize short indels and complex structural variants, and establish the rate of de novo mutational events. GoNL will also serve as a reference panel for imputation in the available genome-wide association studies in Dutch and other cohorts to refine association signals and uncover population-specific variants. GoNL will create a catalog of human genetic variation in this sample that is uniquely characterized with respect to micro-geographic location and a wide range of phenotypes. The resource will be made available to the research and medical community to guide the interpretation of sequencing projects. The present paper summarizes the global characteristics of the project

Erasmus University Digital Repository

Novel loci and pathways significantly associated with longevity

Author: Bae Harold
Bolund Lars
Cao Hongzhi
Chen Huashuai
Chi Li-Qing
Christensen Kaare
Christiansen Lene
Deelen Joris
Dong Jie
Franceschi Claudio
Gottschalk William
Gregory Simon
Gu Jun
Hauser Elizabeth
Land Kenneth C.
Li Jianxin
Li Mengmeng
Li Yang
Li Zhaochun
Lin Li
Liu Xiao
Liu Xiaomin
Lu Jiehua
Lutz Michael W.
Min Junxia
Ni Ting
Nie Chao
Nygaard Marianne
Perls Thomas
Qi Ming
Qi Yanwei
Qian Feng
Sebastiani Paola
Slagboom Eline
Song Chun
Tan Qihua
Tao Wei
Tian Xiao-Li
Vaupel James W.
Wang Han-Ming
Wang Jian
Wang Jun
Wang Mingbang
Wang Yan
Wang Yinghao
Xu Hanshi
Xu Huji
Xu Xun
Yan Han
Yang Huanming
Yang Ze
Yashin Anatoliy
Zeng Yi
Zhang Jin-Pei
Zhang Lijuan
Zheng Gu-Yan
Zhou Yufeng
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Only two genome-wide significant loci associated with longevity have been identified so far, probably because of insufficient sample sizes of centenarians, whose genomes may harbor genetic variants associated with health and longevity. Here we report a genome-wide association study (GWAS) of Han Chinese with a sample size 2.7 times the largest previously published GWAS on centenarians. We identified 11 independent loci associated with longevity replicated in Southern-Northern regions of China, including two novel loci (rs2069837-IL6; rs2440012-ANKRD20A9P) with genome-wide significance and the rest with suggestive significance (P < 3.65 × 10⁻⁵). Eight independent SNPs overlapped across Han Chinese, European and U.S. populations, and APOE and 5q33.3 were replicated as longevity loci. Integrated analysis indicates four pathways (starch, sucrose and xenobiotic metabolism; immune response and inflammation; MAPK; calcium signaling) highly associated with longevity (P ≤ 0.006) in Han Chinese. The association with longevity of three of these four pathways (MAPK; immunity; calcium signaling) is supported by findings in other human cohorts. Our novel finding on the association of starch, sucrose and xenobiotic metabolism pathway with longevity is consistent with the previous results from Drosophila. This study suggests protective mechanisms including immunity and nutrient metabolism and their interactions with environmental stress play key roles in human longevity

Copenhagen University Research Information System

PubMed Central

University of Southern Denmark Research Output

Analysis of 62 hybrid assembled human Y chromosomes exposes rapid structural changes and high rates of gene conversion

Author: Als Thomas D.
Andreas Sibbesen Jonas
Belling Kirstine González-Izarzugaza
Besenbacher Søren
Bolund Lars
Bork-Jensen Jette
Brunak Søren
Børglum Anders D.
Cao Hongzhi
Chang Yuqi
Eiberg Hans
Espeseth Thomas
Flindt Esben
Friborg Rune M.
Gonzalez-Izarzugaza Jose Maria
Grosjean Marie
Grove Jakob
Guo Xiaosen
Gupta Ramneek
Halager Anders E.
Hansen Torben
Huang Shujia
Hultman Christina M.
Jensen Jacob Malte
Kristiansen Karsten
Krogh Anders
Le Hellard Stephanie
Lescai Francesco
Li Ning
Li Shengting
Liu Siyang
Lund Ole
Løngren Peter
Mailund Thomas
Maretty Lasse
Matey-Hernandez María Luisa
Mors Ole
Pedersen Christian N. S.
Pedersen Oluf
Petersen Bent
Qaswar Ali Shah Syed
Rao Junhua
Rasmussen Simon
Schierup Mikkel Heide
Sicheritz-Pontén Thomas
Skov Laurits
Sullivan Patrick F.
Sun Jihua
Sørensen Thorkild I. A.
Theil Have Christian
van Beusekom Johan
Villesen Palle
Wang Jun
Westergaard David
Xu Ruiqi
Xu Xun
Yadav Rachita
Ye Chen
Ye Weijian
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2017

Online Research Database In Technology

Benchmarking the HLA typing performance of Polysolver and Optitype in 50 Danish parental trios

Author: Ali Syed
Als Thomas D.
Andreas Sibbesen Jonas
Belling Kirstine
Besenbacher Søren
Beusekom Johan v.
Bolund Lars
Bork-Jensen Jette
Brunak Søren
Børglum Anders D.
Cao Hongzhi
Chang Yuqi
Eiberg Hans
Espeseth Thomas
Flindt Esben N.
Friborg Rune M.
Gonzalez-Izarzugaza Jose Maria
Grosjean Marie
Grove Jakob
Guo Xiaosen
Gupta Ramneek
Halager Anders Egerup
Hansen Torben
Huang Shujia
Hultman Christina M.
Izarzugaza Jose M. G.
Jensen Jacob Malte
Kristiansen Karsten
Krogh Anders
Le Hellard Stephanie
Lescai Francesco
Li Ning
Li Shengting
Liu Siyang
Lund Ole
Løngren Peter
Mailund Thomas
Maretty Lasse
Matey-Hernandez María Luisa
Mors Ole
Pedersen Christian N. S.
Pedersen Oluf
Petersen Bent
Rao Junhua
Schierup Mikkel Heide
Sicheritz-Pontén Thomas
Skov Laurits
Sullivan Patrick F.
Sun Jihua
Sørensen Thorkild I. A.
Theil Have Christian
Villesen Palle
Wang Jun
Westergaard David
Xu Ruiqi
Xu Xun
Yadav Rachita
Ye Chen
Ye Weijian
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

Online Research Database In Technology

A high-quality human reference panel reveals the complexity and distribution of genomic structural variants

Author: Abdellaoui A. (Abdel)
Amin N. (Najaf)
Baaijens J.A. (Jasmijn)
Bakker P.I.W. (Paul) de
Beekman M. (Marian)
Boomsma D.I. (Dorret)
Bot J. (Jan)
Bovenberg J.A. (Jasper)
Byelas G. (George)
Cao H. (Hongzhi)
Cao J.S. (Jeremy Sujie)
Cao R. (Rui)
Chen R. (Ruoyan)
Coe B.P. (Bradley)
Craen A.J.M. (Anton) de
Deelen P. (Patrick)
Dijk F. (Freerk) van
Dijkstra L.J. (Louis)
Dijkstra M. (Martijn)
Du Y. (Yuanping)
Duijn C.M. (Cornelia) van
Dunnen J.T. (Johan) den
Eichler E.E. (Evan)
Enckevort D. (David) van
Estrada K. (Karol)
Francioli L.C. (Laurent)
Guryev V. (Victor)
Handsaker R.E. (Robert)
Hehir-Kwa J.Y. (Jayne)
Hofman A. (Albert)
Hormozdiari F. (Fereydoun)
Hottenga J.-J. (Jouke-Jan)
Kanterakis A. (Alexandros)
Karssen L.C. (Lennart)
Kattenberg V.M. (Mathijs)
Kloosterman W.P. (Wigard)
Knijff P. (Peter) de
Ko A. (Arthur)
Koval V. (Vyacheslav)
Lameijer E.-W. (Eric-Wubbo)
Laros J.F.J. (Jeroen)
Ligt J. (Joep) de
Marschall T. (Tobias)
McCarroll S.A. (Steven)
Mei H. (Hailiang)
Neerincx P.B.T. (Pieter)
Nijman I.J. (Isaac)
Ommen G.-J.B. (Gert-Jan) van
Platteel M. (Mathieu)
Renkens I. (Ivo)
Rivadeneira F. (Fernando)
Santcroos M. (Mark)
Schaik B.D.C. (Barbera) van
Schönhuth A. (Alexander)
Slagboom P.E. (Eline)
Sudmant P. (Peter)
Sun Y. (Yushen)
Swertz M.A. (Morris)
Thung D.T. (Djie Tjwan)
Uitterlinden A.G. (André)
van Leeuwen E.M. (Elisa)
Vermaat M. (Martijn)
Wardenaar R. (René)
Wijmenga C. (Cisca)
Willemsen G. (Gonneke)
Wolffenbuttel B. (Bruce)
Ye K. (Kai)
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 06/10/2016

Structural variation (SV) represents a major source of differences between individual human genomes and has been linked to disease phenotypes. However, the majority of studies neither provide a global view of the full spectrum of these variants nor integrate them into reference panels of genetic variation. Here, we analyse whole genome sequencing data of 769 individuals from 250 Dutch families, and provide a haplotype-resolved map of 1.9 million genome variants across 9 different variant classes, including novel forms of complex indels, and retrotransposition-mediated insertions of mobile elements and processed RNAs. A large proportion are previously underreported variants sized between 21 and 100 bp. We detect 4 megabases of novel sequence, encoding 11 new transcripts. Finally, we show 191 known, trait-associated SNPs to be in strong linkage disequilibrium with SVs and demonstrate that our panel facilitates accurate imputation of SVs in unrelated individuals

CWI's Institutional Repository