63 research outputs found

Towards Few-shot Entity Recognition in Document Images: A Graph Neural Network Approach Robust to Image Manipulation

Authors: Krishnan Prashant; Shang Jingbo; Wang Yangkun; Wang Zilong
Publication date: 24/05/2023

Recent advances in incorporating layout information, typically bounding box coordinates, into pre-trained language models have achieved significant performance in entity recognition from document images. Coordinates can easily model the absolute position of each token, but they can be sensitive to manipulations of document images (e.g., shifting, rotation or scaling), especially when the training data is limited in few-shot settings. In this paper, we propose to further introduce the topological adjacency relationship among the tokens, emphasizing their relative position information. Specifically, we consider the tokens in the documents as nodes and formulate the edges based on topological heuristics from the k-nearest bounding boxes. Such adjacency graphs are invariant to affine transformations including shifting, rotation and scaling. We incorporate these graphs into the pre-trained language model by adding graph neural network layers on top of the language model embeddings, leading to a novel model, LAGER. Extensive experiments on two benchmark datasets show that LAGER significantly outperforms strong baselines under different few-shot settings and also demonstrates better robustness to manipulations.
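
The adjacency idea in this abstract can be illustrated with a minimal sketch. It is not the LAGER implementation: it assumes each token is reduced to the center of its bounding box and that neighbours are chosen by plain Euclidean distance, whereas the abstract only says the edges come from "topological heuristics". The names `knn_edges` and `similarity_transform` and the sample coordinates are hypothetical. The sketch checks that the k-nearest-neighbour edge set is unchanged after the centers are rotated, uniformly scaled and shifted.

```python
import numpy as np


def knn_edges(centers, k):
    """Return directed edges (i, j) where j is one of the k centers nearest to i.

    Assumption: Euclidean distance between box centers stands in for the
    paper's (unspecified) topological heuristics.
    """
    centers = np.asarray(centers, dtype=float)
    diff = centers[:, None, :] - centers[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a token is never its own neighbour
    nearest = np.argsort(dist, axis=1)[:, :k]
    return {(i, int(j)) for i in range(len(centers)) for j in nearest[i]}


def similarity_transform(centers, angle, scale, shift):
    """Rotate by `angle` (radians), scale uniformly, then translate."""
    c, s = np.cos(angle), np.sin(angle)
    rotation = np.array([[c, -s], [s, c]])
    points = np.asarray(centers, dtype=float)
    return scale * points @ rotation.T + np.asarray(shift, dtype=float)


# Hypothetical token box centers on a page.
centers = [(0.0, 0.0), (1.0, 0.1), (2.5, 0.0), (0.2, 1.3), (3.0, 2.0)]
edges = knn_edges(centers, k=2)

# Manipulate the "image": rotate, scale and shift every center.
moved = similarity_transform(centers, angle=0.3, scale=1.7, shift=(40.0, -12.0))
edges_after = knn_edges(moved, k=2)

print(edges == edges_after)
```

The absolute coordinates change completely under the transformation, but the relative ordering of pairwise distances does not, which is why the edge set survives.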

arXiv.org e-Print Archive

ToxicChat: Unveiling Hidden Challenges of Toxicity Detection in Real-World User-AI Conversation

Authors: Guo Yuxin; Lin Zi; Shang Jingbo; Tong Yongqi; Wang Yangkun; Wang Yujia; Wang Zihan
Publication date: 26/10/2023

Despite the remarkable advances that large language models have achieved in chatbots, maintaining a non-toxic user-AI interactive environment has become increasingly critical. However, previous efforts in toxicity detection have been mostly based on benchmarks derived from social media content, leaving the unique challenges inherent to real-world user-AI interactions insufficiently explored. In this work, we introduce ToxicChat, a novel benchmark based on real user queries from an open-source chatbot. This benchmark contains rich, nuanced phenomena that can be tricky for current toxicity detection models to identify, revealing a significant domain difference compared to social media content. Our systematic evaluation of models trained on existing toxicity datasets has shown their shortcomings when applied to this unique domain of ToxicChat. Our work illuminates the potentially overlooked challenges of toxicity detection in real-world user-AI conversations. In the future, ToxicChat can be a valuable resource to drive further advancements toward building a safe and healthy environment for user-AI interactions.

arXiv.org e-Print Archive

Predicting miRNA-disease associations based on multi-view information fusion

Authors: Cao Yangkun; Fu Yuan; Sheng Nan; Wang Yan; Xie Xuping; Zhang Shuangquan
Publication date: 27/09/2022

MicroRNAs (miRNAs) play an important role in various biological processes, and their abnormal expression can lead to the occurrence of diseases. Exploring the potential relationships between miRNAs and diseases can contribute to the diagnosis and treatment of complex diseases. The growing number of databases storing miRNA and disease information provides opportunities to develop computational methods for discovering unobserved disease-related miRNAs, but there are still challenges in how to effectively learn and fuse information from multi-source data. In this study, we propose a multi-view information fusion based method for miRNA-disease association (MDA) prediction, named MVIFMDA. Firstly, multiple heterogeneous networks are constructed by combining the known MDAs and different similarities of miRNAs and diseases based on multi-source information. Secondly, the topology features of miRNAs and diseases are obtained by applying a graph convolutional network to each heterogeneous network view. Moreover, we design an attention strategy at the topology representation level to adaptively fuse representations that carry different structural information. Meanwhile, we learn the attribute representations of miRNAs and diseases from their similarity attribute views with convolutional neural networks. Finally, the complicated associations between miRNAs and diseases are reconstructed by applying a bilinear decoder to the combined features, which combine topology and attribute representations. Experimental results on the public dataset demonstrate that our proposed model consistently outperforms baseline methods. The case studies further show the ability of the MVIFMDA model to infer underlying associations between miRNAs and diseases.

Aberystwyth Research Portal

PubMed Central

Identify the radiotherapy-induced abnormal changes in the patients with nasopharyngeal carcinoma

Authors: Gang Yin; Huiling Zhang; Xiuxin Wang; Yangkun Luo; Yi Zhao; Yin Tian; Zixuan Fan
Publication venue: 'International Association of Physical Chemists (IAPC)'
Publication date: 01/01/2017

Radiotherapy (RT) is the standard treatment for nasopharyngeal carcinoma (NPC), and it often causes unavoidable brain injury in the course of treatment. The majority of patients show no abnormal signal or density change on conventional magnetic resonance imaging (MRI) and computed tomography (CT) examination during long-term follow-up after radiation therapy. However, by the time changes are visible on CT and conventional MR imaging, the damage is often already severe and lacks effective treatments, seriously influencing the prognosis of patients. Therefore, the present study aimed to investigate the abnormal changes in NPC patients after RT. We used a machine learning framework consisting of two parts, feature extraction and classification, to automatically detect the brain injury. Our results showed that the method could effectively identify the abnormal regions induced by radiotherapy. The highest classification accuracy in the abnormal brain regions was 82.5 %. The parahippocampal gyrus was the region with the highest accuracy, which suggests that the parahippocampal gyrus could be the most sensitive to radiotherapy and involved in the pathogenesis of radiotherapy-induced brain injury in NPC patients.

Crossref

Directory of Open Access Journals

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

Refined Edge Usage of Graph Neural Networks for Edge Prediction

Authors: Gan Quan; Jin Jiarui; Song Xiang; Wang Yangkun; Wipf David; Yu Yong; Zhang Weinan; Zhang Zheng
Publication date: 25/12/2022

Graph Neural Networks (GNNs), originally proposed for node classification, have also motivated many recent works on edge prediction (a.k.a. link prediction). However, existing methods lack a careful design that accounts for two frequently overlooked distinctions between the two tasks: (i) edges only constitute the topology in the node classification task but can be used as both the topology and the supervision (i.e., labels) in the edge prediction task; (ii) node classification makes a prediction for each individual node, while edge prediction is determined by each pair of nodes. To this end, we propose a novel edge prediction paradigm named Edge-aware Message PassIng neuRal nEtworks (EMPIRE). Concretely, we first introduce an edge splitting technique to specify the use of each edge, where each edge is solely used as either the topology or the supervision (named a topology edge or a supervision edge). We then develop a new message passing mechanism that generates the messages to source nodes (through topology edges) while being aware of target nodes (through supervision edges). In order to emphasize the differences between pairs connected by supervision edges and unconnected pairs, we further weight the messages to highlight the relative ones that can reflect the differences. In addition, we design a novel negative node-pair sampling trick that efficiently samples 'hard' negative instances in the supervision instances and can significantly improve the performance. Experimental results verify that the proposed method can significantly outperform existing state-of-the-art models on the edge prediction task on multiple homogeneous and heterogeneous graph datasets.
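
The edge splitting step described in this abstract can be sketched as follows. This is only an illustration, not the EMPIRE implementation. The abstract does not say how edges are assigned to each role, so the uniform random assignment, the `supervision_ratio` parameter and the function name `split_edges` are all assumptions. The one property taken from the abstract is that every edge ends up in exactly one of the two sets.

```python
import random


def split_edges(edges, supervision_ratio=0.5, seed=0):
    """Partition edges into disjoint topology and supervision sets.

    Assumption: edges are assigned uniformly at random; the abstract only
    requires that each edge plays exactly one of the two roles.
    """
    rng = random.Random(seed)
    shuffled = list(edges)
    rng.shuffle(shuffled)
    n_supervision = int(len(shuffled) * supervision_ratio)
    supervision = shuffled[:n_supervision]  # used as labels for the loss
    topology = shuffled[n_supervision:]     # used for message passing only
    return topology, supervision


# Hypothetical toy graph given as (source, target) pairs.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 4), (3, 4)]
topology_edges, supervision_edges = split_edges(edges, supervision_ratio=0.5)

print(len(topology_edges), len(supervision_edges))
```

Keeping the two sets disjoint means no edge is both passed through the network as structure and scored as a training label.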

arXiv.org e-Print Archive

New highly-anisotropic Rh-based Heusler compound for magnetic recording

Authors: Agrestini Stefano; Borrmann Horst; Coey J. M. D.; Fecher Gerhard H.; Felser Claudia; Fu Chenguang; He Yangkun; Herrero-Martin Javier; Hu Zhiwei; Jha Ajay; Kroder Johannes; Manna Kaustuv; Pan Yu; Parkin S. S. P.; Schaefer Rudolf; Schnelle Walter; Skourski Yurii; Stamenov Plamen; Tjeng Liu Hao; Valvidares Manuel; Wang Xiao
Publication venue: 'Wiley'
Publication date: 01/01/2020

The development of high-density magnetic recording media is limited by the superparamagnetism in very small ferromagnetic crystals. Hard magnetic materials with strong perpendicular anisotropy offer stability and high recording density. To overcome the difficulty of writing media with a large coercivity, heat assisted magnetic recording (HAMR) has been developed, rapidly heating the media to the Curie temperature Tc before writing, followed by rapid cooling. Requirements are a suitable Tc, coupled with anisotropic thermal conductivity and hard magnetic properties. Here we introduce Rh2CoSb as a new hard magnet with potential for thin film magnetic recording. A magnetocrystalline anisotropy of 3.6 MJ m⁻³ is combined with a saturation magnetization of μ0Ms = 0.52 T at 2 K (2.2 MJ m⁻³ and 0.44 T at room temperature). The magnetic hardness parameter of 3.7 at room temperature is the highest observed for any rare-earth-free hard magnet. The anisotropy is related to an unquenched orbital moment of 0.42 μB on Co, which is hybridized with neighbouring Rh atoms with a large spin-orbit interaction. Moreover, the pronounced temperature dependence of the anisotropy that follows from its Tc of 450 K, together with a high thermal conductivity of 20 W m⁻¹ K⁻¹, makes Rh2CoSb a candidate for development for heat assisted writing with a recording density in excess of 10 Tb/in².

arXiv.org e-Print Archive

Repositorium für Naturwissenschaften und Technik

MPG.PuRe

MEI Encoding of the Earliest Notation in Staffless Neumes (original title: MEI Kodierung der frühesten Notation in linienlosen Neumen)

The Optical Neume Recognition Project (ONRP) aims to digitally encode musical notation signs dating from around the year 1000, an ambitious undertaking that led the project members to evaluate a wide range of methodological approaches. The optical music recognition software is intended to capture staffless notation from one of the oldest surviving sources containing notation signs, the Hartker Antiphonary from the Benedictine Abbey of St. Gallen (Switzerland), which is preserved today in two volumes in the Stiftsbibliothek in St. Gallen. Because the notation is handwritten and staffless, this Gregorian chant presents researchers with many challenges. The work comprises more than 300 different neume signs and their notation, which are to be captured and described using the Music Encoding Initiative (MEI). The article describes the process of adapting MEI so that it can be applied to the notation of neumes without staff lines. It describes the properties of neume notation to make clear where the challenges of this work lie, as well as how the classifier, a kind of digital neume dictionary, works.

Kölner UniversitätsPublikationsServer

FigShare

Genomic insights into divergence and dual domestication of cultivated allotetraploid cottons

Authors: AE Percival; AH Paterson; AL Price; AR Walker; Baoliang Zhou; Bin Han; Bingliang Liu; BL Browning; C Trapnell; Caiping Cai; Chunxiao Liu; CL Brubaker; D Falush; D Reich; Dan Xiang; David D. Fang; DH Huson; EL Lubbers; F Li; G Coppens d’Eeckenbrugge; G Wang; GA Churchill; GA Esbroeck Van; Gaofu Mei; H Nonogaki; H Ren; H Yin; Hao Gong; Hong Chen; Huaitong Wu; J Felsenstein; J Gou; J Schmutz; J Schwendiman; J Zhao; JB Hutchinson; JC Barrett; JF Doebley; JF Wendel; Jiedan Chen; JM Chia; JM Lacape; JT Page; K Ye; L Du; L Zhu; Lei Fang; Lijing Chang; LK Mchale; M Wang; M Yoo; MA Ohto; MB Hufford; Mengqiao Pan; N Li; OL May; OT Westengen; P Tyagi; Qiong Wang; Qun Wan; RC McGarry; RG Percy; RJ Kohel; S Myles; Sen Wang; SG Stephens; Shuqi Chen; T Zhang; Tao Huang; Tianzhen Zhang; V Lionetti; Wangzhen Guo; WL Applequist; X Huang; X Liu; Xiaoya Chen; Xiefei Zhu; Xinghe Li; Xiongming Du; Xuehui Huang; Y Cheng; Y Hu; Y Li; Y Qin; Y Xiao; Yan Hu; Yangkun Wang; Z. Jeffrey Chen
Publication venue: 'Springer Science and Business Media LLC'

Crossref