110 research outputs found

    AraNLP: A Java-based library for the processing of Arabic text

    We present a free, Java-based library named "AraNLP" that covers various Arabic text preprocessing tools. Although a good number of tools for processing Arabic text already exist, integration and compatibility problems continually occur. AraNLP is an attempt to gather most of the vital Arabic text preprocessing tools into one library that can be accessed easily by integrating or accurately adapting existing tools and by developing new ones when required. The library includes a sentence detector, tokenizer, light stemmer, root stemmer, part-of-speech tagger (POS-tagger), word segmenter, normalizer, and a punctuation and diacritic remover.

    A semi-supervised learning approach to Arabic named entity recognition

    We present ASemiNER, a semi-supervised algorithm for identifying Named Entities (NEs) in Arabic text. ASemiNER does not require annotated training data or gazetteers. It can also be easily adapted to handle more than the three standard NE types (Person, Location, and Organisation). To our knowledge, our algorithm is the first study that intensively investigates the semi-supervised pattern-based learning approach to Arabic Named Entity Recognition (NER). We describe ASemiNER and compare its performance with different supervised systems. We evaluate this algorithm by way of experiments to extract the three standard named-entity types. Ultimately, our algorithm outperforms simple supervised systems and also performs well when evaluated on extracting three new, specialised types of NEs (Politicians, Sportspersons, and Artists).

    Automatic Creation of Arabic Named Entity Annotated Corpus Using Wikipedia

    In this paper we propose a new methodology to exploit Wikipedia features and structure to automatically develop an Arabic NE annotated corpus. Each Wikipedia link is transformed into an NE type of the target article in order to produce the NE annotation. Other Wikipedia features - namely redirects, anchor texts, and inter-language links - are used to tag additional NEs, which appear without links in Wikipedia texts. Furthermore, we have developed a filtering algorithm to eliminate ambiguity when tagging candidate NEs. Herein we also introduce a mechanism based on the high coverage of Wikipedia in order to address two challenges particular to tagging NEs in Arabic text: rich morphology and the absence of capitalisation. The corpus created with our new method (WDC) has been used to train an NE tagger which has been tested on different domains. Judging by the results, an NE tagger trained on WDC can compete with those trained on manually annotated corpora.

    Hafnia and alumina on sulphur passivated germanium

    In this work hafnia (HfO2) and alumina (Al2O3) films were deposited on germanium, using either water or oxygen plasma as the oxidant, by atomic layer deposition at 250 °C with and without sulphur passivation of the substrate. X-ray photoelectron spectroscopy was carried out to investigate the interface between both HfO2 and Al2O3 films and germanium. The results show that for hafnia and alumina deposited with water on pre-sulphur treated germanium there is negligible GeOx formation when compared to films grown using oxygen plasma. The results support the case for sulphur passivation of the interface.

    Band alignments at Ga2O3 heterojunction interfaces with Si and Ge

    Amorphous Ga2O3 thin films were deposited on p-type (111) and (100) surfaces of silicon and (100) germanium by atomic layer deposition (ALD). X-ray photoelectron spectroscopy (XPS) was used to investigate the band alignments at the interfaces using the Kraut method. The valence band offsets were determined to be 3.49 ± 0.08 eV and 3.47 ± 0.08 eV with Si(111) and Si(100) respectively, and 3.51 ± 0.08 eV with Ge(100). Inverse photoemission spectroscopy (IPES) was used to investigate the conduction band of a thick Ga2O3 film and the band gap of the film was determined to be 4.63 ± 0.14 eV. The conduction band offsets were found to be 0.03 eV and 0.05 eV with Si(111) and Si(100) respectively, and 0.45 eV with Ge(100). The results indicate that the heterojunctions of Ga2O3 with Si(100), Si(111) and Ge(100) are all type I heterojunctions.
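
The reported conduction band offsets appear to follow from the usual relation for a type I (straddling) alignment: the oxide band gap minus the valence band offset minus the semiconductor band gap. The sketch below checks that arithmetic against the abstract's numbers. The bulk band gaps for Si (1.11 eV) and Ge (0.67 eV) are assumed textbook values, not figures from the abstract, and the function names are hypothetical.

```python
# Assumed room-temperature bulk band gaps in eV (NOT given in the abstract).
SEMICONDUCTOR_GAP_EV = {"Si": 1.11, "Ge": 0.67}

# Band gap of the thick Ga2O3 film, as reported in the abstract (eV).
GA2O3_GAP_EV = 4.63

# Valence band offsets reported in the abstract (eV).
VALENCE_BAND_OFFSET_EV = {
    "Si(111)": 3.49,
    "Si(100)": 3.47,
    "Ge(100)": 3.51,
}


def conduction_band_offset(oxide_gap, vbo, semiconductor_gap):
    """CBO = Eg(oxide) - VBO - Eg(semiconductor) for a straddling alignment."""
    return oxide_gap - vbo - semiconductor_gap


def offsets():
    """Return the conduction band offset per surface, rounded to 0.01 eV."""
    result = {}
    for surface, vbo in VALENCE_BAND_OFFSET_EV.items():
        element = surface.split("(")[0]  # "Si(111)" -> "Si"
        cbo = conduction_band_offset(GA2O3_GAP_EV, vbo, SEMICONDUCTOR_GAP_EV[element])
        result[surface] = round(cbo, 2)
    return result


if __name__ == "__main__":
    for surface, cbo in offsets().items():
        print(f"{surface}: CBO = {cbo:.2f} eV")
```

Under these assumed band gaps, both offsets are positive for every surface, which is what a type I alignment requires.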

    ViT-DeiT: An Ensemble Model for Breast Cancer Histopathological Images Classification

    Breast cancer is the most common cancer in the world and the second most common type of cancer that causes death in women. The timely and accurate diagnosis of breast cancer using histopathological images is crucial for patient care and treatment. Pathologists can make more accurate diagnoses with the help of a novel approach based on image processing. This approach is an ensemble model of two types of pre-trained vision transformer models, namely, Vision Transformer and Data-Efficient Image Transformer. The proposed ensemble model classifies breast cancer histopathology images into eight classes, four of which are categorized as benign, whereas the others are categorized as malignant. A public dataset was used to evaluate the proposed model. The experimental results showed 98.17% accuracy, 98.18% precision, 98.08% recall, and a 98.12% F1 score. Comment: 7 pages, 10 figures, 7 tables.
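
For readers unfamiliar with the four reported metrics, the following is a minimal sketch of how accuracy, precision, recall, and F1 are commonly computed for a multi-class classifier from a confusion matrix. The abstract does not say which averaging scheme the authors used; macro averaging (an unweighted mean over classes) is an assumption here, and the function name and the two-class toy matrix in the usage lines are hypothetical.

```python
def macro_metrics(confusion):
    """Compute accuracy and macro-averaged precision, recall, and F1.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    return {
        "accuracy": correct / total if total else 0.0,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    # Hypothetical two-class example; an eight-class problem uses an 8x8 matrix.
    toy = [[8, 2],
           [1, 9]]
    for name, value in macro_metrics(toy).items():
        print(f"{name}: {value:.4f}")
```

With macro averaging, the F1 score is the mean of the per-class F1 values, so it need not equal the harmonic mean of the averaged precision and recall.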

    Ge interface engineering using ultra-thin La2O3 and Y2O3 films: A study into the effect of deposition temperature

    A study into the optimal deposition temperature for ultra-thin La2O3/Ge and Y2O3/Ge gate stacks has been conducted in this paper with the aim to tailor the interfacial layer for effective passivation of the Ge interface. A detailed comparison between the two lanthanide oxides (La2O3 and Y2O3) in terms of band line-up, interfacial features, and reactivity to Ge using medium energy ion scattering, vacuum ultra-violet variable angle spectroscopic ellipsometry (VUV-VASE), X-ray photoelectron spectroscopy, and X-ray diffraction is shown. La2O3 has been found to be more reactive to Ge than Y2O3, forming LaGeOx and a Ge sub-oxide at the interface for all deposition temperatures studied, in the range from 44 °C to 400 °C. In contrast, Y2O3/Ge deposited at 400 °C allows for an ultra-thin GeO2 layer at the interface, which can be eliminated during annealing at temperatures higher than 525 °C, leaving a pristine YGeOx/Ge interface. The Y2O3/Ge gate stack deposited at lower temperature shows a sub-band gap absorption feature fitted to an Urbach tail of energy 1.1 eV. The latter correlates to a sub-stoichiometric germanium oxide layer at the interface. The optical band gap for the Y2O3/Ge stacks has been estimated to be 5.7 ± 0.1 eV from Tauc-Lorentz modelling of VUV-VASE experimental data. For the optimal deposition temperature (400 °C), the Y2O3/Ge stack exhibits a higher conduction band offset (>2.3 eV) than the La2O3/Ge (∼2 eV), has a larger band gap (by about 0.3 eV), a germanium sub-oxide free interface, and leakage current (∼10⁻⁷ A/cm² at 1 V) five orders of magnitude lower than the respective La2O3/Ge stack. Our study strongly points to the superiority of the Y2O3/Ge system for germanium interface engineering to achieve high performance Ge Complementary Metal Oxide Semiconductor technology.

    Alcohol Consumption Impairs the Ependymal Cilia Motility in the Brain Ventricles

    Ependymal cilia protrude into the central canal of the brain ventricles and spinal cord to circulate the cerebral spinal fluid (CSF). Ependymal cilia dysfunction can hinder the movement of CSF, leading to an abnormal accumulation of CSF within the brain known as hydrocephalus. Although the etiology of hydrocephalus has been studied before, the effects of ethanol ingestion on ependymal cilia function have not been investigated in vivo. Here, we report three distinct types of ependymal cilia, type-I, type-II and type-III, classified based upon their beating frequency, their beating angle, and their distinct localization within the mouse brain-lateral ventricle. Our studies show for the first time that oral gavage of ethanol decreased the beating frequency of all three types of ependymal cilia in both the third and the lateral rat brain ventricles in vivo. Furthermore, we show for the first time that hydin, a hydrocephalus-inducing gene product whose mutation impairs ciliary motility, and polycystin-2, whose ablation is associated with hydrocephalus, are colocalized to the ependymal cilia. Thus, our studies reinforce the presence of three types of ependymal cilia in the brain ventricles and demonstrate the involvement of ethanol as a risk factor for the impairment of ependymal cilia motility in the brain.

    Separation of Isomeric Forms of Urolithin Glucuronides Using Supercritical Fluid Chromatography

    Urolithins are gut microbiota metabolites produced in humans after consuming foods containing ellagitannins and ellagic acid. Three urolithin metabotypes have been reported for different individuals depending on the final urolithins produced. After absorption, they are conjugated with glucuronic acid (phase II metabolism), and these are the main circulating metabolites in plasma and reach different tissues. Different regioisomeric isomers of urolithin glucuronides have been described. Still, their identification and quantification in humans have not been properly reported due to resolution limitations in their analysis by reversed-phase high-performance liquid chromatography. In the present study, we report a novel method for separating these isomers using supercritical fluid chromatography. With this method, urolithin A 3- and 8-glucuronide, isourolithin A 3- and 9-glucuronide, and urolithin B 3-glucuronide (8-hydroxy urolithin 3-glucuronide; 3-hydroxy urolithin 8-glucuronide; 3-hydroxyurolithin 9-glucuronide; 9-hydroxyurolithin 3-glucuronide; and urolithin 3-glucuronide) were separated in less than 15 min. The proposed method was applied to successfully analyze these metabolites in urine samples from different volunteers belonging to different metabotypes. Taif University (TURSP- HC2021/3

    Developing Electron Microscopy Tools for Profiling Plasma Lipoproteins Using Methyl Cellulose Embedment, Machine Learning and Immunodetection of Apolipoprotein B and Apolipoprotein(a)

    Plasma lipoproteins are important carriers of cholesterol and have been linked strongly to cardiovascular disease (CVD). Our study aimed to achieve fine-grained measurements of lipoprotein subpopulations such as low-density lipoprotein (LDL), lipoprotein(a) (Lp(a)), or remnant lipoproteins (RLP) using electron microscopy combined with machine learning tools from microliter samples of human plasma. In the reported method, lipoproteins were absorbed onto electron microscopy (EM) support films from diluted plasma and embedded in thin films of methyl cellulose (MC) containing mixed metal stains, providing intense edge contrast. The results show that lipoproteins have a continuous frequency distribution of sizes, extending from LDL (> 15 nm) to intermediate density lipoprotein (IDL) and very low-density lipoproteins (VLDL). Furthermore, mixed metal staining produces striking “positive” contrast of specific antibodies attached to lipoproteins, providing quantitative data on apolipoprotein(a)-positive Lp(a) or apolipoprotein B (ApoB)-positive particles. To enable automatic particle characterization, we also demonstrated efficient segmentation of lipoprotein particles using deep learning software characterized by a Mask Region-based Convolutional Neural Network (R-CNN) architecture with transfer learning. In future, EM and machine learning could be combined with microarray deposition and automated imaging for higher throughput quantitation of lipoproteins associated with CVD risk.