301 research outputs found
Bio-JOIE: Joint Representation Learning of Biological Knowledge Bases
The wide spread of the coronavirus has led to a worldwide pandemic with a high
mortality rate. Currently, the knowledge accumulated from different studies
about this virus is very limited. Leveraging a wide range of biological
knowledge, such as gene ontology and protein-protein interaction (PPI) networks
from other closely related species, presents a vital approach to inferring the
molecular impact of a new species. In this paper, we propose the transferred
multi-relational embedding model Bio-JOIE to capture the knowledge of gene
ontology and PPI networks, which demonstrates superb capability in modeling the
SARS-CoV-2-human protein interactions. Bio-JOIE jointly trains two model
components. The knowledge model encodes the relational facts from the protein
and GO domains into separate embedding spaces, using a hierarchy-aware
encoding technique for the GO terms. On top of that, the transfer
model learns a non-linear transformation to transfer the knowledge of PPIs and
gene ontology annotations across their embedding spaces. By leveraging only
structured knowledge, Bio-JOIE significantly outperforms existing
state-of-the-art methods in PPI type prediction on multiple species.
Furthermore, we demonstrate the potential of leveraging the learned
representations for clustering proteins with enzymatic function into enzyme
commission families. Finally, we show that Bio-JOIE can accurately identify
PPIs between the SARS-CoV-2 proteins and human proteins, providing valuable
insights for advancing research on this new disease.
Comment: ACM BCB 2020, Best Student Paper
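The two components described in the Bio-JOIE abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a translation-style (TransE-like) score for the knowledge model and a single tanh layer for the non-linear transfer, neither of which the abstract specifies. The function names `triple_score` and `transfer` and all vectors are hypothetical.

```python
import math

def triple_score(head, relation, tail):
    """Plausibility of a (head, relation, tail) fact inside ONE embedding space.

    Assumption: a translation-style score, the negative Euclidean distance
    between head + relation and tail. Higher means more plausible.
    """
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))

def transfer(protein_vec, weight, bias):
    """Map a protein embedding into the GO embedding space.

    Assumption: one non-linear layer, tanh(W p + b). `weight` has one row per
    GO-space dimension, so the two spaces may differ in dimensionality.
    """
    return [
        math.tanh(sum(w * p for w, p in zip(row, protein_vec)) + b)
        for row, b in zip(weight, bias)
    ]

# Toy usage: a 2-d protein space mapped into a 3-d GO space.
protein = [0.5, -0.5]
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [0.0, 0.0, 0.0]
projected = transfer(protein, W, b)
```

In joint training, a loss on `triple_score` within each space would be combined with a loss that pulls `transfer(protein)` toward the embeddings of the GO terms annotating that protein. The exact objective is left open here.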
Automatic information search for countering covid-19 misinformation through semantic similarity
Master's Thesis in Bioinformatics and Computational Biology
Information quality in social media is an increasingly important issue, and the misinformation problem has become even more critical in the current COVID-19 pandemic, leaving people exposed
to false and potentially harmful claims and rumours. Civil society organizations, such as the
World Health Organization, have issued a global call for action to promote access to health
information and mitigate harm from health misinformation. Consequently, this project aims to
counter the spread of the COVID-19 infodemic and its potential health hazards.
In this work, we give an overall view of models and methods that have been employed in the
NLP field from its foundations to the latest state-of-the-art approaches. Focusing on deep learning methods, we propose applying multilingual Transformer models based on siamese networks,
also called bi-encoders, combined with ensemble and PCA dimensionality reduction techniques.
The goal is to counter COVID-19 misinformation by analyzing the semantic similarity between
a claim and tweets from a collection gathered from official fact-checkers verified by the International Fact-Checking Network of the Poynter Institute.
It is a fact that the number of Internet users increases every year and that the language spoken
determines access to information online. For this reason, we devote special effort to the application of multilingual models to tackle misinformation across the globe. Regarding semantic
similarity, we first evaluate these multilingual ensemble models and improve the result on the
STS-Benchmark compared to monolingual and single models. Secondly, we enhance the interpretability of the models' performance through the SentEval toolkit. Lastly, we compare these
models' performance against biomedical models in TREC-COVID task round 1 using the BM25
Okapi ranking method as the baseline. Moreover, we are interested in understanding the ins
and outs of misinformation. For that purpose, we extend interpretability using machine learning
and deep learning approaches for sentiment analysis and topic modelling. Finally, we developed
a dashboard to ease visualization of the results.
In our view, the results obtained in this project constitute an excellent initial step toward
incorporating multilingualism and will assist researchers and people in countering COVID-19
misinformation.
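The claim-to-tweet matching step described in this abstract can be sketched as follows. With a bi-encoder, the claim and each tweet are embedded independently and then compared by a vector similarity. Cosine similarity is assumed here as the comparison function, and the embeddings are hand-written toy vectors standing in for encoder output. The names `cosine` and `rank_by_similarity` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank_by_similarity(claim_vec, tweet_vecs):
    """Return (tweet_id, similarity) pairs, most similar to the claim first."""
    scored = [(tweet_id, cosine(claim_vec, vec)) for tweet_id, vec in tweet_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy embeddings; in practice these would come from a multilingual sentence encoder.
claim = [1.0, 0.0]
tweets = {"a": [0.0, 1.0], "b": [2.0, 0.0], "c": [1.0, 1.0]}
ranking = rank_by_similarity(claim, tweets)
```

Because each text is encoded once and compared afterwards, the fact-checked tweet embeddings can be precomputed and reused for every new claim.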
Learning List-Level Domain-Invariant Representations for Ranking
Domain adaptation aims to transfer the knowledge learned on (data-rich)
source domains to (low-resource) target domains, and a popular method is
invariant representation learning, which matches and aligns the data
distributions in the feature space. Although this method is studied extensively
and applied to classification and regression problems, its adoption in ranking
problems is sporadic, and the few existing implementations lack theoretical
justifications. This paper revisits invariant representation learning for
ranking. Upon reviewing prior work, we found that it implements what we call
item-level alignment, which aligns the distributions of the items being ranked
from all lists in aggregate but ignores their list structure. However, the list
structure should be leveraged, because it is intrinsic to ranking problems
where the data and the metrics are defined and computed on lists, not the items
by themselves. To close this discrepancy, we propose list-level alignment --
learning domain-invariant representations at the higher level of lists. The
benefits are twofold: it leads to the first domain adaptation generalization
bound for ranking, in turn providing theoretical support for the proposed
method, and it achieves better empirical transfer performance for unsupervised
domain adaptation on ranking tasks, including passage reranking.
Comment: NeurIPS 2023. Comparison to v1: revised presentation and proof of
Corollary 4.
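The difference between item-level and list-level alignment can be seen in a toy example. This sketch is not the paper's method. It assumes each list is represented by concatenating its item features in order, and it measures the domain gap as the squared distance between mean representations. Both choices, and all function names, are illustrative assumptions.

```python
def mean_vector(rows):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]

def mean_discrepancy(sample_a, sample_b):
    """Squared Euclidean distance between the mean vectors of two samples."""
    mean_a, mean_b = mean_vector(sample_a), mean_vector(sample_b)
    return sum((x - y) ** 2 for x, y in zip(mean_a, mean_b))

def item_level(lists):
    """Pool every item from every list, discarding which list it came from."""
    return [item for lst in lists for item in lst]

def list_level(lists):
    """Represent each list as one vector: its item features concatenated in order."""
    return [[x for item in lst for x in item] for lst in lists]

# Two lists per domain, two items per list, one feature per item.
source = [[[0.0], [1.0]], [[0.0], [1.0]]]
target = [[[1.0], [0.0]], [[1.0], [0.0]]]

item_gap = mean_discrepancy(item_level(source), item_level(target))
list_gap = mean_discrepancy(list_level(source), list_level(target))
```

Here the pooled items are identically distributed in both domains, so `item_gap` is zero, while `list_gap` is positive because the lists themselves differ. A method that aligns only at the item level would see nothing left to align in this case.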
A Multilingual Simplified Language News Corpus
Simplified language news articles are being offered by specialized web portals in several countries. The thousands of articles that have been published over the years are a valuable resource for natural language processing, especially for efforts towards automatic text simplification. In this paper, we present SNIML, a large multilingual corpus of news in simplified language. The corpus contains 13k simplified news articles written in one of six languages: Finnish, French, Italian, Swedish, English, and German. All articles are shared under open licenses that permit academic use. The level of text simplification varies depending on the news portal. We believe that even though SNIML is not a parallel corpus, it can be useful as a complement to the more homogeneous but often smaller corpora of news in the simplified variety of one language that are currently in use.
Applications of Artificial Intelligence in Battling Against Covid-19: A Literature Review
© 2020 Elsevier Ltd. All rights reserved.
Colloquially known as coronavirus, the Severe Acute Respiratory Syndrome CoronaVirus 2 (SARS-CoV-2), which causes CoronaVirus Disease 2019 (COVID-19), has become a matter of grave concern for every country around the world. The rapid growth of the pandemic has wreaked havoc and prompted the need for immediate reactions to curb the effects. To manage the problems, many researchers in a variety of areas of science have started studying the issue. Artificial Intelligence is among the areas of science that have found great application in tackling the problem in many aspects. Here, we give an overview of the applications of AI in a variety of fields including diagnosis of the disease via different types of tests and symptoms, monitoring patients, identifying the severity of a patient's condition, processing COVID-19-related imaging tests, epidemiology, pharmaceutical studies, etc. The aim of this paper is to perform a comprehensive survey of the applications of AI in battling the difficulties the outbreak has caused. Thus we aim to cover every way in which AI approaches have been employed and all the research up to the writing of this paper. We try to organize the works in a way that makes the overall picture comprehensible. Such a picture, although full of details, is very helpful in understanding where AI sits in the current pandemonium. We also conclude the paper with ideas on how the problems can be tackled in a better way and provide some suggestions for future work.
Peer reviewed
Machine-assisted translation by Human-in-the-loop Crowdsourcing for Bambara
Language is more than a tool for conveying information; it is used in all aspects of our lives. Yet only a small number of the 7,000 languages worldwide are highly resourced by human language technologies (HLT). Although African languages number over 2,000, only a few of them are highly resourced, that is, have a considerable amount of parallel digital data.
This thesis describes the work carried out to create a Bambara-French MT system, including data discovery, data preparation, model hyper-parameter tuning, the development of a crowdsourcing platform for humans in the loop, vocabulary sizing, and segmentation. We present a novel approach to machine translation (MT) for under-resourced languages by improving the quality of the model using a paradigm called "humans in the loop." We achieved a BLEU (bilingual evaluation understudy) score of 17.5. The results confirm that MT for Bambara, despite our small data set, is viable. This work has the potential to contribute to the reduction of language barriers between the people of Sub-Saharan Africa and the rest of the world.
The Trials and Triumphs of the Indian Schoolteachers During the Covid-19 Pandemic
The Covid-19 pandemic had profound consequences for the traditional system of schooling. It underwent a complete transformation because of forced isolation, social distancing and the closure of all physical schools. The ecology that used to support the school, namely the government, the parents and society, underwent a tumultuous time, each facing its own set of challenges. Government policies had to be geared to meet the requirements of the pandemic and kept changing frequently; parents faced the anxiety of economic crisis, pay cuts and poor cash flow; society saw deaths in unprecedented numbers. When the ecosystem that supported the school was jarred, it had a ripple effect on the education system as well. This essay focuses on the impact the pandemic had on the psychological and financial well-being of teachers, and on how they coped with the situation and showed resilience in meeting the challenges of the crisis. It is analysed in three stages (initial, middle and end) of a year of online education as experienced by schoolteachers and principals. It also highlights the lessons learned and the way forward, and offers some useful suggestions for policy makers and school management.
Keywords: Covid-19, challenges, psychological and financial impact, teacher resilience, school principals
DOI: 10.7176/JEP/12-15-04
Publication date: May 31st 202