Search CORE

343 research outputs found

Choosing an NLP library for analyzing software documentation: a systematic literature review and a series of experiments

Author: Al Omran F.
Treude C.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2017

To uncover interesting and actionable information from natural language documents authored by software developers, many researchers rely on "out-of-the-box" NLP libraries. However, software artifacts written in natural language are different from other textual documents due to the technical language used. In this paper, we first analyze the state of the art through a systematic literature review in which we find that only a small minority of papers justify their choice of an NLP library. We then report on a series of experiments in which we applied four state-of-the-art NLP libraries to publicly available software artifacts from three different sources. Our results show low agreement between different libraries (only between 60% and 71% of tokens were assigned the same part-of-speech tag by all four libraries) as well as differences in accuracy depending on source: for example, spaCy achieved the best accuracy on Stack Overflow data with nearly 90% of tokens tagged correctly, while it was clearly outperformed by Google's SyntaxNet when parsing GitHub ReadMe files. Our work implies that researchers should make an informed decision about the particular NLP library they choose and that customizations to libraries might be necessary to achieve good results when analyzing software artifacts written in natural language.
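
The agreement figure quoted in the abstract (the share of tokens given the same part-of-speech tag by every library) can be illustrated with a minimal sketch. It assumes the token sequences have already been aligned across libraries, which is a simplification; the function name and the example tags are hypothetical and not taken from the paper.

```python
from typing import Sequence


def full_agreement_rate(taggings: Sequence[Sequence[str]]) -> float:
    """Fraction of token positions where every tagger assigned the same tag.

    Assumes each inner sequence holds the tags one tagger produced for the
    same, already aligned, list of tokens.
    """
    if not taggings:
        raise ValueError("need at least one tagging")
    length = len(taggings[0])
    if any(len(tags) != length for tags in taggings):
        raise ValueError("taggings must cover the same tokens")
    if length == 0:
        return 0.0
    # A position counts as agreed only if the set of tags has one element.
    agreed = sum(1 for tags in zip(*taggings) if len(set(tags)) == 1)
    return agreed / length


# Hypothetical tags from four taggers for the same five tokens.
example = [
    ["VB", "DT", "NN", "IN", "NN"],
    ["VB", "DT", "NN", "IN", "NN"],
    ["NN", "DT", "NN", "IN", "NN"],
    ["VB", "DT", "NN", "IN", "NNP"],
]
rate = full_agreement_rate(example)
```

In this made-up example the taggers disagree on the first and last tokens, so three of five positions count as full agreement.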

Crossref

Adelaide Research & Scholarship

Synthesis of some nucleosides derivatives from L- rhamnose with expected biological activity

Author: Amira Atef Ghoneim
CH Fenlon
E Osz
F Al-Omran
GS Besra
J Fernandez-Bolanos
JC Estevez
JR Wheatley
L Somsák
M Daffe
M Matsumoto
M McNeil
MD Smith
O Lockhoff
P Chemla
R Davis
RW Myers
VS Palekar
ZE Kandeel
Publication venue: BioMed Central
Publication date: 01/01/2011

Practical procedures for the production of variously blocked compounds from L-rhamnose have been developed. These compounds are highly useful as indirect β-L-rhamnosyl donors. This approach represents a new method for the synthesis of aromatic nucleoside analogues and the synthesis of (3S,4S,5S,6R)-3,4,5-triacetoxy-2-methyl-7,9-diaza-1-oxa-spiro[4,5]decane-10-one-8-thione (7).

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Assessment of heavy metal pollution in the Great Al-Mussaib irrigation channel

Author: Al Khaddar RM
Al-Jumeily D
Al-Saati N
Al-Saati Z
Aljefery M
Hashim K
Kot P
Omran II
Ruddock F
Shaw A
Publication venue: Balaban Publishers – Desalination Publications

The Great Al-Mussaib Channel (GMC), in Babylon province, Iraq, was selected as a case study to measure the concentrations of nine heavy metals (Pb, Ni, Zn, Fe, Cd, Cr, Cu, Mn and Co) in both the water and the sediments of the channel. The GMC is used as a raw water source for two cities, so any heavy metal pollution could cause significant health problems for their populations, which underlines the importance of the study. The results showed that the concentrations of the studied heavy metals in the water of the GMC were below pollution levels and followed the order: Pb < Ni < Cu < Cr < Mn < Zn < Fe. The concentrations of Co and Cd were below the detection limits. In addition, the analyses of the sediment samples showed, according to the values of the Pollution Load Index (PLI) and the Geo-accumulation Index (Igeo), that the concentrations of the studied metals were below pollution levels (except for a few cases) and followed the order: Cd < Co < Cu < Pb < Ni < Cr < Zn < Mn < Fe.
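
The two sediment indices named in the abstract can be sketched as follows, using their commonly cited definitions: Igeo = log2(C / (1.5 × B)) for a single metal, and PLI as the geometric mean of the contamination factors C / B. The abstract does not state which background values or formula variants the authors used, so the function names and the numbers below are illustrative assumptions only.

```python
import math


def geoaccumulation_index(concentration: float, background: float) -> float:
    """Igeo for one metal; the factor 1.5 allows for natural background variation."""
    return math.log2(concentration / (1.5 * background))


def pollution_load_index(concentrations, backgrounds) -> float:
    """PLI: geometric mean of the contamination factors C / B over all metals."""
    factors = [c / b for c, b in zip(concentrations, backgrounds)]
    if not factors:
        raise ValueError("need at least one metal")
    return math.prod(factors) ** (1 / len(factors))


# Hypothetical values in mg/kg, chosen only to make the arithmetic easy to follow.
igeo_example = geoaccumulation_index(30.0, 10.0)           # log2(30 / 15)
pli_example = pollution_load_index([20.0, 5.0], [10.0, 10.0])  # sqrt(2.0 * 0.5)
```

Under the usual reading of these indices, an Igeo at or below 0 and a PLI below 1 are taken to indicate unpolluted sediment.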

LJMU Research Online (Liverpool John Moores University)

ENVIRONMENTAL ASSESSMENT OF AL-HILLAH RIVER POLLUTION AT BABIL GOVERNORATE (IRAQ)

Author: Bader N. Hussain
Ban Al-Hasani
Bashar F. Maaroof
Fouad F. Al-Qaim
Iacopo Carnacina
Jasim Mohammed Salman
Makki H. Omran
Mawada Abdellatif
Muhammad R. Jawad
Wiam A. Hussein
Publication venue: 'National Library of Serbia'
Publication date: 01/04/2023

In this study, the environmental characteristics of Al-Hillah River were studied using geoinformatics applications, which are among the geospatial techniques (GST). Applying this methodology, a geographic information system was developed and supplied with laboratory data for 16 physical and chemical parameters for 2021. These data were linked to their spatial locations using radar imagery of the Digital Elevation Model (Shuttle Radar Topography Mission) and a Landsat ETM+7 satellite image. The results indicated that Al-Hillah River was affected by the liquid discharges of factories, cities, and farms spread along its sides, especially in the cities of Sadat Al-Hindiya, Al-Hillah, and Al-Hashimiyah. Seasonal changes in the climate affected some characteristics, including water temperature, pH, turbidity, total dissolved solids, and total hardness. The study showed that the concentration of sulfate (SO4) has risen above the permissible limits for the waters of Iraqi rivers. Hardness and alkalinity values were relatively high but within the permissible limits. The study also showed that most of the environmental parameters measured in the laboratory were within the permissible limits for Iraqi water, except for sulfates. The justification for conducting this study is to help government agencies and decision-makers adopt a correct vision for development projects that serve Babil Governorate. Also, it is the first time that the environmental characteristics of Al-Hillah River have been studied using geoinformatics applications.

Directory of Open Access Journals

Linear, Deterministic, and Order-Invariant Initialization Methods for the K-Means Clustering Algorithm

Over the past five decades, k-means has become the clustering algorithm of choice in many application domains primarily due to its simplicity, time/space efficiency, and invariance to the ordering of the data points. Unfortunately, the algorithm's sensitivity to the initial selection of the cluster centers remains its most serious drawback. Numerous initialization methods have been proposed to address this drawback. Many of these methods, however, have time complexity superlinear in the number of data points, which makes them impractical for large data sets. On the other hand, linear methods are often random and/or sensitive to the ordering of the data points. These methods are generally unreliable in that the quality of their results is unpredictable. Therefore, it is common practice to perform multiple runs of such methods and take the output of the run that produces the best results. Such a practice, however, greatly increases the computational requirements of the otherwise highly efficient k-means algorithm. In this chapter, we investigate the empirical performance of six linear, deterministic (non-random), and order-invariant k-means initialization methods on a large and diverse collection of data sets from the UCI Machine Learning Repository. The results demonstrate that two relatively unknown hierarchical initialization methods due to Su and Dy outperform the remaining four methods with respect to two objective effectiveness criteria. In addition, a recent method due to Erisoglu et al. performs surprisingly poorly. Comment: 21 pages, 2 figures, 5 tables, Partitional Clustering Algorithms (Springer, 2014). arXiv admin note: substantial text overlap with arXiv:1304.7465, arXiv:1209.196
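
To make "deterministic and order-invariant initialization" concrete, here is a minimal sketch of a variance-based divisive scheme: repeatedly take the cluster with the largest sum of squared errors and split it at its mean along the coordinate with the largest variance. It is offered only as an illustration of the general idea; it is not claimed to reproduce any of the six methods evaluated in the chapter, it is not optimized for running time, and the function names are hypothetical.

```python
from statistics import fmean


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return [fmean(column) for column in zip(*points)]


def sse(points):
    """Sum of squared distances from each point to the cluster centroid."""
    center = centroid(points)
    return sum(sum((x - c) ** 2 for x, c in zip(p, center)) for p in points)


def variance_split_init(points, k):
    """Return k initial centers without using any random numbers."""
    clusters = [list(points)]
    while len(clusters) < k:
        # Choose the cluster that currently has the largest SSE.
        worst = max(range(len(clusters)), key=lambda i: sse(clusters[i]))
        cluster = clusters.pop(worst)
        center = centroid(cluster)
        # Choose the coordinate with the largest spread around the mean.
        dim = max(
            range(len(center)),
            key=lambda j: sum((p[j] - center[j]) ** 2 for p in cluster),
        )
        left = [p for p in cluster if p[dim] <= center[dim]]
        right = [p for p in cluster if p[dim] > center[dim]]
        if not left or not right:
            raise ValueError("cannot split: too few distinct points for k")
        clusters.extend([left, right])
    # Sorting makes the returned list independent of the split order.
    return sorted(centroid(c) for c in clusters)


# Small made-up data set with three well separated groups.
data = [(0, 0), (0, 1), (10, 0), (10, 1), (20, 0), (20, 1)]
centers = variance_split_init(data, 3)
centers_reversed = variance_split_init(list(reversed(data)), 3)
```

Because the splits depend only on means and variances of the point set, feeding the same points in a different order yields the same centers in this example, which is the property the abstract contrasts with random or order-sensitive methods.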

arXiv.org e-Print Archive

Crossref

The prevalence of adaptive immunity to COVID-19 and reinfection after recovery - a comprehensive systematic review and meta-analysis.

Author: Al-Marwani Talal A
Al-Shebly Rafal
Chivese Tawanda
Cyprian Farhan
Doi Suhail A R
Emara Mohamed M
Furuya-Kanamori Luis
Habibullah Mohammad
Haider Mohammad Z
Hindy George
Hourani Rizeq F
Islam Nazmul
Matizanadzo Joshua T
Musa Omran A H
Nawaz Ahmed D
Shalaby Rana
Publication venue: 'Informa UK Limited'
Publication date: 31/01/2022

This study aims to estimate the prevalence and longevity of detectable SARS-CoV-2 antibodies and T and B memory cells after recovery. In addition, the prevalence of COVID-19 reinfection and the preventive efficacy of previous infection with SARS-CoV-2 were investigated. A synthesis of existing research was conducted. The Cochrane Library, the China Academic Journals Full Text Database, PubMed, Scopus, and preprint servers were searched for studies conducted between 1 January 2020 and 1 April 2021. Included studies were assessed for methodological quality, and pooled estimates of relevant outcomes were obtained in a meta-analysis using a bias-adjusted synthesis method. Proportions were synthesized with the Freeman-Tukey double arcsine transformation and binary outcomes using the odds ratio (OR). Heterogeneity was assessed using the I² and Cochran's Q statistics, and publication bias was assessed using Doi plots. Fifty-four studies from 18 countries, with around 12,000,000 individuals, followed up to 8 months after recovery, were included. At 6-8 months after recovery, the prevalence of SARS-CoV-2 specific immunological memory remained high: IgG 90.4% (95% CI 72.2-99.9, I² = 89.0%), CD4+ 91.7% (95% CI 78.2-97.1), and memory B cells 80.6% (95% CI 65.0-90.2); the pooled prevalence of reinfection was 0.2% (95% CI 0.0-0.7, I² = 98.8). Individuals previously infected with SARS-CoV-2 had an 81% reduction in odds of a reinfection (OR 0.19, 95% CI 0.1-0.3, I² = 90.5%). Around 90% of recovered individuals had evidence of immunological memory to SARS-CoV-2 at 6-8 months after recovery and had a low risk of reinfection.
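
Two pieces of arithmetic in the abstract can be sketched briefly: the Freeman-Tukey double arcsine transformation of a proportion, and the step from an odds ratio of 0.19 to an "81% reduction in odds". The transformation is written in one common form (without the factor of one half that some texts include), which is an assumption about the variant used; the function names are hypothetical, and the pooling and bias-adjustment steps of the meta-analysis are not shown.

```python
import math


def freeman_tukey(events: int, total: int):
    """Double arcsine transform of events/total and its approximate variance."""
    t = math.asin(math.sqrt(events / (total + 1))) + math.asin(
        math.sqrt((events + 1) / (total + 1))
    )
    variance = 1 / (total + 0.5)
    return t, variance


def odds_reduction_percent(odds_ratio: float) -> float:
    """Percentage reduction in odds implied by an odds ratio below 1."""
    return (1 - odds_ratio) * 100


# The odds ratio reported in the abstract.
reduction = odds_reduction_percent(0.19)

# Hypothetical study with 5 events among 10 participants.
transformed, variance = freeman_tukey(5, 10)
```

The transform stabilizes the variance of proportions close to 0 or 1, which matters for outcomes such as a reinfection prevalence near 0.2%.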

Qatar University Institutional Repository

LSHTM Research Online

PubMed Central

Diagnostic Accuracy of the Leishmania OligoC-TesT and NASBA-Oligochromatography for Diagnosis of Leishmaniasis in Sudan

Author: A Hailu
AE Harith
Ahmed Almustafa Al-Basheer
Ahmed Haleem
Ahmed Mohamedain Eltom
Alfarazdeg A. Saad
Awad Hamad
CM Mugasa
EE Zijlstra
F Chappuis
Gerard J. Schoone
H Veeken
Henk D. Schallig
HW Murray
JR Landis
K Mullis
K Ritmeijer
L De Almeida Silva
M Boelaert
M Gari-Toussaint
Mustafa I. Elbashir
NR Bhattarai
Nuha G. Ahmed
Omran F. Osman
Osman S. Osman
Paul Andrew Bates
Philippe Büscher
R Boom
R ter Horst
S Deborggraeve
Sayda El-Safi
Stijn Deborggraeve
Thierry Laurent
WF Van der Meide
Publication venue: Public Library of Science
Publication date: 01/01/2010

The leishmaniases are a group of vector-borne diseases caused by protozoan parasites of the genus Leishmania. The parasites are transmitted by phlebotomine sand flies and can cause, depending on the infecting species, three clinical manifestations of leishmaniasis: visceral leishmaniasis (VL), post kala-azar dermal leishmaniasis (PKDL) and cutaneous leishmaniasis (CL) including the mucocutaneous form. VL, PKDL as well as CL are endemic in several parts of Sudan, and VL especially represents a major health problem in this country. Molecular tests such as the polymerase chain reaction (PCR) or nucleic acid sequence based assay (NASBA) are powerful techniques for accurate detection of the parasite in clinical specimens, but broad use is hampered by their complexity and lack of standardisation. Recently, the Leishmania OligoC-TesT and NASBA-Oligochromatography were developed as simplified and standardised PCR and NASBA formats. In this study, both tests were phase II evaluated for diagnosis of VL, PKDL and CL in Sudan

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Tropmed Central Antwerp

Expanding the genetic heterogeneity of intellectual disability

Author: Abdulwahab FM
Abouelhoda M
Al Kindy A
Al Murshedi F
Al Musafri F
Al Tala S
Al Tassan N
Alasmari A
Aldhalaan H
Alfadhel M
Alhashem A
Alhashmi N
Ali R
Alkuraya FS
Alkuraya H
Alsahli S
Alsaman A
Alshaer A
AlZahrani F
Anazi S
Arold ST
Asi YT
Banu S
Ben-Omran T
Bupp C
El-Hattab AW
Faqeih E
Hashem M
Houlden H
Ibrahim N
Kurdi W
Lashley T
Maddirevula S
Monies D
Patel N
Rumayyan A
Saleh MM
Salih MA
Salpietro V
Shamseldin HE
Suleiman J
Sultan T
Tabarki B
Publication date: 01/11/2017

Intellectual disability (ID) is a common morbid condition with a wide range of etiologies. The list of monogenic forms of ID has increased rapidly in recent years thanks to the implementation of genomic sequencing techniques. In this study, we describe the phenotypic and genetic findings of 68 families (105 patients) all with novel ID-related variants. In addition to established ID genes, including ones for which we describe unusual mutational mechanism, some of these variants represent the first confirmatory disease-gene links following previous reports (TRAK1, GTF3C3, SPTBN4 and NKX6-2), some of which were based on single families. Furthermore, we describe novel variants in 14 genes that we propose as novel candidates (ANKHD1, ASTN2, ATP13A1, FMO4, MADD, MFSD11, NCKAP1, NFASC, PCDHGA10, PPP1R21, SLC12A2, SLK, STK32C and ZFAT). We highlight MADD and PCDHGA10 as particularly compelling candidates in which we identified biallelic likely deleterious variants in two independent ID families each. We also highlight NCKAP1 as another compelling candidate in a large family with autosomal dominant mild intellectual disability that fully segregates with a heterozygous truncating variant. The candidacy of NCKAP1 is further supported by its biological function, and our demonstration of relevant expression in human brain. Our study expands the locus and allelic heterogeneity of ID and demonstrates the power of positional mapping to reveal unusual mutational mechanisms

UCL Discovery

Characterizing the morbid genome of ciliopathies

Author: A Poretti
AM Fry
AM Waters
Amal Al Hashem
Anas M. Alazami
Anason Halees
Andre Megarbane
Basudha Basu
Brahim Tabarki
C Bergmann
C Knopp
C Wright
CJ Westlake
Clare V. Logan
Colin A. Johnson
DA Parfitt
David A. Parry
Dorota Monies
DU Mick
Eissa Faqeih
F Tissir
Firdous M. Abdulwahab
Fowzan S. Alkuraya
FS Alkuraya
G Wheway
Hadeel Alsharif
Heba Morsy
Hisham Alkuraya
K Szymanska
Karsten Boldt
Katarzyna Szymanska
KB Schou
KL Coene
L Abu-Safieh
L Baala
L Sang
M Toriyama
MA Parisi
Maha Alnemer
Mais Hashem
Marius Ueffing
MG Saudi
MH Farkas
Mohamed Abouelhoda
Mohamed El-Kalioby
Mohammed A. Aldahmesh
Mohammed Al-Owain
Mohammed Zain Seidahmed
MS Zaki
Muneera Al-Husain
Mustafa A. Salih
MV Nachury
Nada Al Tassan
Nada Derar
Neama Meriki
Niema Ibrahim
Nisha Patel
Nour Ewida
P Beales
R Rao Damerla
R Roepman
R Shaheen
Ranad Shaheen
Rawda Sonbul
S Ramsbottom
Saad AlShahwan
Saeed Al Tala
SM Ware
SP Choksi
T Ben‐Omran
Tariq Faquih
TJ Dam van
Y Muto
ZA Abdelhamed
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Background: Ciliopathies are clinically diverse disorders of the primary cilium. Remarkable progress has been made in understanding the molecular basis of these genetically heterogeneous conditions; however, our knowledge of their morbid genome, pleiotropy, and variable expressivity remains incomplete. Results: We applied genomic approaches on a large patient cohort of 371 affected individuals from 265 families, with phenotypes that span the entire ciliopathy spectrum. Likely causal mutations in previously described ciliopathy genes were identified in 85% (225/265) of the families, adding 32 novel alleles. Consistent with a fully penetrant model for these genes, we found no significant difference in their “mutation load” beyond the causal variants between our ciliopathy cohort and a control non-ciliopathy cohort. Genomic analysis of our cohort further identified mutations in a novel morbid gene TXNDC15, encoding a thiol isomerase, based on independent loss of function mutations in individuals with a consistent ciliopathy phenotype (Meckel-Gruber syndrome) and a functional effect of its deficiency on ciliary signaling. Our study also highlighted seven novel candidate genes (TRAPPC3, EXOC3L2, FAM98C, C17orf61, LRRCC1, NEK4, and CELSR2), some of which have established links to ciliogenesis. Finally, we show that the morbid genome of ciliopathies encompasses many founder mutations, the combined carrier frequency of which accounts for a high disease burden in the study population. Conclusions: Our study increases our understanding of the morbid genome of ciliopathies. We also provide the strongest evidence, to date, in support of the classical Mendelian inheritance of Bardet-Biedl syndrome and other ciliopathies.

Crossref

Springer - Publisher Connector

PubMed Central

White Rose Research Online

Three new cases of late-onset cblC defect and review of the literature illustrating when to consider inborn errors of metabolism beyond infancy

Author: ACH Tsai
AH Bouts
AL Boxer
B Debreceni
B Fowler
Brian Fowler
C Nogueira
C Thauvin-Robinet
C Zhang
CF Morel
CM Taylor
D Martinelli
Daniela Karall
DS Rosenblatt
E Cornec-Le Gall
E Richard
E Roze
F Sedel
F Wang
HJ Lin
Ilse Kern
JD Weisfeld-Adams
JLK Van Hove
JM Powers
JP Lerner-Ellis
Karine Hadaya
Klaus Seppi
M Kilic
M Kömhoff
Martina Huemer
Matthias R Baumgartner
N Iqbal
OA Bodamer
P Augoustides-Savvopoulou
PH Backe
R Gold
R Surtees
Ronny Beer
S Fischer
S Kölker
S Shinnar
Sabine Scholl-Bürgi
SI Goodman
SM Brunelli
TI Ben-Omran
V Guigonis
W Herrmann
X Wang
Publication venue: 'Springer Science and Business Media LLC'

Crossref