185 research outputs found
AiM: Taking Answers in Mind to Correct Chinese Cloze Tests in Educational Applications
To automatically correct handwritten assignments, the traditional approach is
to use an OCR model to recognize characters and compare them to the answers.
However, OCR models are easily confused when recognizing handwritten Chinese
characters, and the textual information of the answers is missing during model
inference, even though teachers always have these answers in mind when
reviewing and correcting assignments. In this paper, we focus on Chinese cloze
test correction and propose a multimodal approach (named AiM), in which the
encoded representations of the answers interact with the visual information of
students' handwriting. Instead of predicting a single 'right' or 'wrong'
verdict, we perform sequence labeling on the answer text to infer, in a
fine-grained way, which answer characters differ from the handwritten content.
We take samples from OCR datasets as positive samples for this task and
develop a negative-sample augmentation method to scale up the training data.
Experimental results show that AiM outperforms OCR-based methods by a large
margin, and extensive studies demonstrate the effectiveness of our multimodal
approach. Comment: Accepted to COLING 202
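The fine-grained labeling idea above can be illustrated with a toy character-level comparison. This is only an illustrative sketch, not the AiM model itself: in AiM the handwriting is an image and the comparison is learned by a multimodal network, whereas here the "recognized" characters are given as a plain string.

```python
# Toy illustration of fine-grained answer labeling: compare each answer
# character with a (hypothetically recognized) handwritten character and
# emit a per-character right/wrong label instead of one overall verdict.
# NOT the AiM model, which consumes handwriting images directly.

def label_answer(answer: str, written: str) -> list:
    # Pad the shorter handwritten string so every answer position is labeled.
    pairs = zip(answer, written.ljust(len(answer), "\0"))
    return ["right" if a == w else "wrong" for a, w in pairs]

print(label_answer("北京", "北享"))  # ['right', 'wrong']
```

The point of the per-character labels is exactly what the abstract describes: the teacher (or model) learns *which* character is wrong, not merely that the whole answer is.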
Towards Real-World Writing Assistance: A Chinese Character Checking Benchmark with Faked and Misspelled Characters
Writing assistance is an application closely tied to everyday life and a
fundamental Natural Language Processing (NLP) research field. Its aim is to
improve the correctness and quality of input texts, with character checking
being crucial for detecting and correcting wrong characters. In the real
world, where handwriting accounts for the vast majority of writing, the
characters humans get wrong include faked characters (i.e., untrue characters
created through writing errors) and misspelled characters (i.e., true
characters used incorrectly due to spelling errors). However, existing
datasets and related studies focus only on misspelled characters, mainly
caused by phonological or visual confusion, thereby ignoring faked characters,
which are more common and more difficult. To break through this dilemma, we
present Visual-C, a human-annotated Visual Chinese Character Checking dataset
with faked and misspelled Chinese characters. To the best of our knowledge,
Visual-C is the first real-world visual dataset, and the largest
human-crafted one, for the Chinese character checking scenario. We also
propose and evaluate novel baseline methods on Visual-C. Extensive empirical
results and analyses show that Visual-C is high-quality yet challenging. The
Visual-C dataset and the baseline methods will be made publicly available to
facilitate further research in the community. Comment: Work in progress
A Semi-Analytical Model for the Formation and Evolution of Radio Relics in Galaxy Clusters
Radio relics are Mpc-sized synchrotron sources located in the peripheral
regions of galaxy clusters. Models based on the diffuse shock acceleration
(DSA) scenario have been widely accepted to explain the formation of radio
relics. However, a critical challenge to these models is that most observed
shocks seem too weak to generate detectable emission, unless fossil electrons,
a population of mildly energetic electrons that have been accelerated
previously, are included in the models. To address this issue, we present a new
semi-analytical model to describe the formation and evolution of radio relics
by incorporating fossil relativistic electrons into DSA theory, which is
constrained by a sample of 14 observed relics, and employ the Press-Schechter
formalism to simulate the relics in a sky field at 50, 158, and 1400 MHz. The
results show that fossil electrons contribute significantly to the radio
emission: at 158 MHz they can generate radiation four orders of magnitude
brighter than that produced by thermal electrons alone, and the power
distribution of our simulated radio relic catalog can reconcile the observed
relation. We predict the fraction of clusters that would host relics at
158 MHz, which is consistent with the result given by the LoTSS DR2. We also
find that radio relics are expected to cause severe foreground contamination
in future EoR experiments, similar to that of radio halos. Finally, the
possibility of AGNs providing the seed fossil relativistic electrons is
evaluated by calculating the number of radio-loud AGNs that a shock is
expected to encounter during its propagation. Comment: 15 pages, 20 figures.
Accepted for publication in MNRAS. Comments welcome
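The abstract's central tension, that most observed shocks are too weak to light up without fossil electrons, follows from the standard test-particle DSA relations. The sketch below uses the textbook formulas (not the paper's full semi-analytical model): a shock of sonic Mach number M produces an electron energy spectrum N(E) ∝ E^(−p) with p = 2(M² + 1)/(M² − 1), hence a synchrotron injection spectral index α = (p − 1)/2.

```python
import math

# Standard test-particle DSA relations (textbook formulas, not the paper's
# semi-analytical model): a shock of sonic Mach number M accelerates
# electrons into a power law N(E) ~ E^-p with
#     p = 2 (M^2 + 1) / (M^2 - 1),
# giving a synchrotron injection spectral index alpha = (p - 1) / 2.

def dsa_injection_index(mach: float) -> float:
    if mach <= 1.0:
        raise ValueError("no shock (and no DSA) for M <= 1")
    p = 2.0 * (mach**2 + 1.0) / (mach**2 - 1.0)
    return (p - 1.0) / 2.0

# Weak merger shocks (M ~ 2-3, typical of observed relics) give steep,
# faint spectra; only in the strong-shock limit does alpha approach 0.5.
for mach in (2.0, 3.0, 10.0):
    print(f"M = {mach:4.1f}  ->  alpha = {dsa_injection_index(mach):.2f}")
```

The steepness of α at M ≈ 2–3 is why the emission from freshly accelerated thermal electrons falls short, motivating the fossil-electron population the paper incorporates.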
Genetic prediction of the causal relationship between schizophrenia and tumors: a Mendelian randomized study
Background: Patients with schizophrenia are at a higher risk of developing cancer. However, the causal relationship between schizophrenia and different tumor types remains unclear.
Methods: Using a two-sample, bidirectional Mendelian randomization approach, we analyzed publicly available genome-wide association study (GWAS) summary data to investigate the causal relationship between schizophrenia and the risk of different tumors. These tumors included lung adenocarcinoma, lung squamous cell carcinoma, small-cell lung cancer, gastric cancer, alcohol-related hepatocellular cancer, and tumors involving the lungs, breast, thyroid gland, pancreas, prostate, ovaries and cervix, endometrium, colon and colorectum, and bladder. We used the inverse variance weighted (IVW) method to estimate the causal effect of schizophrenia on each tumor risk, and conducted sensitivity tests to evaluate the robustness of the causal estimates.
Results: After adjusting for heterogeneity, evidence of a causal relationship between schizophrenia and lung cancer risk was observed (odds ratio [OR] = 1.001; 95% confidence interval [CI], 1.000–1.001; P = 0.0155). In the sensitivity analysis, the causal effect of schizophrenia on the risk of lung cancer was consistent in both direction and magnitude. However, no evidence of causality or reverse causality between schizophrenia and the other tumors was found.
Conclusion: This study elucidated a causal relationship between the genetic predictors of schizophrenia and the risk of lung cancer, thereby providing a basis for the prevention, pathogenesis, and treatment of schizophrenia in patients with lung cancer.
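The IVW method named in the abstract combines per-variant Wald ratios, weighted by their inverse variance. Below is a minimal fixed-effect IVW sketch with hypothetical summary statistics, not the study's pipeline or data: for each genetic instrument, `beta_x` is its effect on the exposure (schizophrenia liability) and `beta_y`/`se_y` its effect and standard error on the outcome (tumor risk).

```python
import math

# Minimal fixed-effect inverse variance weighted (IVW) estimator, the
# standard textbook form (not the authors' exact analysis pipeline).

def ivw_estimate(beta_x, beta_y, se_y):
    w = [bx * bx / (s * s) for bx, s in zip(beta_x, se_y)]   # IVW weights
    ratios = [by / bx for bx, by in zip(beta_x, beta_y)]     # Wald ratios
    beta = sum(wi * r for wi, r in zip(w, ratios)) / sum(w)  # pooled effect
    se = math.sqrt(1.0 / sum(w))                             # pooled std. error
    return beta, se

# Hypothetical summary statistics for three instruments:
beta, se = ivw_estimate([0.10, 0.08, 0.12],
                        [0.011, 0.007, 0.013],
                        [0.004, 0.005, 0.004])
print(f"IVW log-odds = {beta:.4f} (SE {se:.4f}), OR = {math.exp(beta):.3f}")
```

Exponentiating the pooled log-odds gives the odds ratio reported in such studies; sensitivity analyses (e.g., MR-Egger, weighted median) then probe whether pleiotropy distorts this estimate.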
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
Large language models (LLMs) have been shown to be able to perform new tasks
based on a few demonstrations or natural language instructions. While these
capabilities have led to widespread adoption, most LLMs are developed by
resource-rich organizations and are frequently kept from the public. As a step
towards democratizing this powerful technology, we present BLOOM, a
176B-parameter open-access language model designed and built thanks to a
collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer
language model that was trained on the ROOTS corpus, a dataset comprising
hundreds of sources in 46 natural and 13 programming languages (59 in total).
We find that BLOOM achieves competitive performance on a wide variety of
benchmarks, with stronger results after undergoing multitask prompted
finetuning. To facilitate future research and applications using LLMs, we
publicly release our models and code under the Responsible AI License.
Time-dependent water permeation behavior of concrete under constant hydraulic pressure
In the present work, a concrete permeability testing setup was designed to study the behavior of hydraulic concrete subjected to constant hydraulic pressure. The results show that when concrete is subjected to a high enough constant hydraulic pressure, water permeates it, and after the permeation rate reaches its maximum, the permeability coefficient gradually decreases towards a stable value. A time-dependent model of the permeability coefficient of concrete subjected to hydraulic pressure is proposed, and the decrease of the permeability coefficient with permeation time is shown to conform well to the negative-exponential decrease model.
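A negative-exponential decrease of the permeability coefficient toward a stable value, as the abstract describes, can be written as k(t) = k∞ + (k₀ − k∞)·e^(−t/τ). The sketch below illustrates that functional form only; the parameter values are hypothetical, not the paper's fitted results.

```python
import math

# Sketch of a negative-exponential decrease of the permeability
# coefficient, the model form named in the abstract. Parameter values
# are hypothetical placeholders, not the paper's fitted values.
#   k(t) = k_inf + (k_0 - k_inf) * exp(-t / tau)
# k_0:   coefficient at the maximum permeation rate (m/s)
# k_inf: stable long-time value (m/s); tau: decay time (hours)

def permeability(t_hours, k0=5e-10, k_inf=1e-10, tau=24.0):
    return k_inf + (k0 - k_inf) * math.exp(-t_hours / tau)

for t in (0, 24, 96, 240):
    print(f"t = {t:3d} h  ->  k = {permeability(t):.2e} m/s")
```

The two limits capture the observed behavior: k(0) = k₀ at the permeation-rate maximum, and k(t) → k∞ as t grows, matching the stable value the experiments approach.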
Memories of the Gold Foreign Exchange Market Based on a Moving V-Statistic and Wavelet-Based Multiresolution Analysis
Memory in finance is the foundation of well-established forecasting models, and recent research in financial theory shows that stochastic memory models depend on the time window considered. To accurately identify the multivariate long-memory model in the financial market, this paper proposes the concept of a moving V-statistic, based on a modified R/S method, to determine whether a time series exhibits long-range dependence, and then applies wavelet-based multiresolution analysis to study the multifractality of the financial time series and determine the initial data windows. Finally, we check the moving V-statistic estimation against the wavelet analysis under the same conditions, using the volatilities of gold foreign exchange rates to evaluate the moving V-statistic. According to the results, the memory-testing method established in this paper can effectively identify breakpoints in the memories. Furthermore, this method can provide support for forecasting returns in the financial market.
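In classical R/S analysis, the V-statistic for a window of length n is V_n = (R/S)_n / √n, and a moving version evaluates it over a sliding window. The sketch below illustrates that statistic on synthetic data; it is not the paper's modified R/S method or its wavelet-based window selection.

```python
import math
import random

# Sketch of a moving V-statistic from classical R/S analysis:
#   V_n = (R/S)_n / sqrt(n)
# A roughly flat V across windows suggests short memory; systematic
# drift suggests long-range dependence. Illustration only, not the
# paper's modified R/S method or wavelet-based window selection.

def rescaled_range(x):
    n = len(x)
    mean = sum(x) / n
    dev, r_min, r_max = 0.0, 0.0, 0.0
    for xi in x:                      # cumulative mean-adjusted deviations
        dev += xi - mean
        r_min, r_max = min(r_min, dev), max(r_max, dev)
    s = math.sqrt(sum((xi - mean) ** 2 for xi in x) / n)  # std. deviation
    return (r_max - r_min) / s        # range of deviations, rescaled by s

def moving_v(series, window):
    return [rescaled_range(series[i:i + window]) / math.sqrt(window)
            for i in range(len(series) - window + 1)]

random.seed(0)
returns = [random.gauss(0.0, 1.0) for _ in range(500)]  # synthetic volatility proxy
v = moving_v(returns, window=100)
print(f"{len(v)} windows, mean V = {sum(v) / len(v):.3f}")
```

On a real volatility series, a sudden level shift in the moving V-statistic is exactly the kind of memory breakpoint the paper's method is designed to detect.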