45 research outputs found

    Finding Similar Documents Using Different Clustering Techniques

    Text clustering is an important application of data mining. It is concerned with grouping similar text documents together. In this paper, several models are built to cluster capstone project documents using three clustering techniques: k-means, k-means fast, and k-medoids. Our dataset is obtained from the library of the College of Computer and Information Sciences, King Saud University, Riyadh. Three similarity measures are tested: cosine similarity, Jaccard similarity, and correlation coefficient. The quality of the obtained models is evaluated and compared. The results indicate that the best performance is achieved using k-means and k-medoids combined with cosine similarity. We observe variation in the quality of clustering based on the evaluation measure used. In addition, as the value of k increases, the quality of the resulting clusters improves. Finally, we reveal the categories of graduation projects offered in the Information Technology department for female students.
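
To make two of the similarity measures named in this abstract concrete, here is a minimal sketch of cosine and Jaccard similarity on raw word counts. The whitespace tokenizer and the function names are illustrative assumptions and are not taken from the paper, whose preprocessing and tooling are not described in this listing.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, chosen only for illustration.
    return text.lower().split()


def cosine_similarity(doc_a, doc_b):
    # Cosine of the angle between the two term-frequency vectors.
    counts_a, counts_b = Counter(tokenize(doc_a)), Counter(tokenize(doc_b))
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def jaccard_similarity(doc_a, doc_b):
    # Size of the shared vocabulary divided by the size of the combined vocabulary.
    set_a, set_b = set(tokenize(doc_a)), set(tokenize(doc_b))
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)
```

Cosine similarity weighs how often each term occurs, while Jaccard similarity only looks at which terms are present, so the two can rank the same pair of documents differently.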

    Regulatory immune cytokines in RSV infection

    Antibody production in the lungs is an essential defence mechanism against respiratory pathogens. However, little is known about the local activation of B cells in the lung. The production of BAFF and APRIL by airway epithelial cells could contribute to local recruitment, activation, class switch recombination and antibody production by B cells in the lung. In vitro, BEAS-2B cells were used to characterize BAFF and APRIL production stimulated either by RSV infection or by the addition of cytokines. RSV and IFN-β significantly induced expression of BAFF mRNA and protein but not APRIL. BAFF mRNA reached significantly high levels at 12 h and declined at 48 h after either RSV infection or IFN-β stimulation. Western blot analysis showed that membrane BAFF was expressed by resting epithelial cells. On RSV infection or IFN-β stimulation, expression of membrane BAFF increased at 12 and 24 h and disappeared at 48 h, which suggests that soluble BAFF was cleaved from the membrane and released into the culture supernatant by 48 h, where it was measured by ELISA. When BEAS-2B cells were infected with RSV after pre-incubation with anti-IFN-β, expression of BAFF was blocked, which indicates that airway epithelial cells can produce BAFF in an interferon-dependent manner. BEAS-2B cells did not express CXCL12, CXCL13, CCL19 or CCL21, which indicates that sources other than the airway epithelium may express these chemokines during RSV infection. A murine model of RSV infection was used to examine expression of BAFF, APRIL and the chemokines CXCL12, CXCL13, CCL19 and CCL21. Cytokine mRNA and RSV N gene expression were measured by TaqMan PCR, and protein levels by ELISA, in lung tissue from control mice at day 0 and from mice challenged with RSV (A2 strain) or control UV-treated RSV at days 1, 2, 4, 7, 8, 10, 14 and 21 after infection. RSV N RNA was detected at significantly higher levels on days 1, 2 and 4 after RSV infection compared with the UV-RSV control. BAFF mRNA expression was increased significantly after RSV infection on days 1, 7 and 8 in comparison to the UV-RSV control at the same time points. Equally, BAFF protein was elevated significantly after RSV infection on days 1, 2, 4, 7 and 8 in comparison to the UV-RSV control at the same time points. CXCL13 mRNA expression was increased significantly after RSV infection on days 1 and 7 in comparison to the UV-RSV control at the same time points. Moreover, CXCL13 protein was increased significantly after RSV infection on days 1, 2 and 7 in comparison to the UV-RSV control at the same time points. CXCL12, CCL19 and CCL21 mRNA and protein levels were not increased significantly after RSV infection, which may indicate that they are not active during RSV infection. Examination of mouse lung sections showed strong positive staining of B cells (CD20) following RSV infection on days 1, 2, 4, 7 and 8, and by FACS analysis B cell numbers were increased significantly on days 6 and 8 following RSV infection relative to the UV-RSV control. RSV infection results in up-regulated BAFF and CXCL13 expression, consistent with a role for CXCL13 in recruiting B cells and for BAFF in promoting airway B cell survival or differentiation. Collectively, these results suggest that the airway epithelium could help recruit B cells and support B cell growth and development and antibody production in the lung.

    Road safety evaluation through automatic extraction of road horizontal alignments from Mobile LiDAR System and inductive reasoning based on a decision tree

    Safe roads are a necessity for any society because of the high social costs of traffic accidents. This challenge is addressed by a novel methodology that evaluates road safety from Mobile LiDAR System data, taking advantage of the road alignment because of its influence on the accident rate. Automation is obtained through an inductive reasoning process based on a decision tree that provides a potential risk assessment. To achieve this, a 3D point cloud is classified by an iterative and incremental algorithm based on 2.5D and 3D Delaunay triangulation, which applies different algorithms sequentially. Next, an automatic extraction process for road horizontal alignment parameters is developed to obtain geometric consistency indexes, based on a joint triple stability criterion. This work thus aims to provide a powerful and effective preventive and/or predictive tool for road safety inspections. The proposed methodology was implemented on three stretches of Spanish roads, each with different traffic conditions that represent the most common road types. The developed methodology was successfully validated against as-built road projects, which were considered as “ground truth.”

    Symptoms and management of cow's milk allergy: perception and evidence

    Introduction: The diagnosis and management of cow's milk allergy (CMA) is a topic of debate and controversy. Our aim was to compare the opinions of expert groups from the Middle East (n = 14) and the European Society of Paediatric Gastroenterology, Hepatology and Nutrition (ESPGHAN) (n = 13). Methods: These expert groups voted on statements that were developed by the ESPGHAN group and published in a recent position paper. The voting outcome was compared. Results: Overall, there was consensus amongst both groups of experts. Experts agreed that symptoms of crying, irritability and colic, as a single manifestation, are not suggestive of CMA. They agreed that amino-acid based formula (AAF) should be reserved for severe cases (e.g., malnutrition and anaphylaxis) and that there is insufficient evidence to recommend a step-down approach. There was no unanimous consensus on the statement that a cow's milk based extensively hydrolysed formula (eHF) should be the first choice as a diagnostic elimination diet in mild/moderate cases. Although the statements regarding the role of hydrolysed rice formula as a diagnostic and therapeutic elimination diet were accepted, 3/27 disagreed. The votes regarding soy formula highlight the differences in opinion on the role of soy protein in CMA dietary treatment. Generally, soy-based formula is seldom available in the Middle East region. All ESPGHAN experts agreed that there is insufficient evidence that the addition of probiotics, prebiotics and synbiotics increases the efficacy of elimination diets regarding CMA symptoms (despite other benefits such as a decrease in infections and antibiotic intake), whereas 3/14 of the Middle East group thought there was sufficient evidence. Discussion: Differences in voting are related to geographical, cultural and other conditions, such as cost and availability. This emphasizes the need to develop region-specific guidelines considering social and cultural conditions, and to perform further research in this area.

    Machine learning approaches in COVID-19 diagnosis, mortality, and severity risk prediction: A review

    The existence of widespread COVID-19 infections has prompted worldwide efforts to control and manage the virus, and hopefully curb it completely. One important line of research is the use of machine learning (ML) to understand and fight COVID-19. This is currently an active research field. Although there are already many surveys in the literature, there is a need to keep up with the rapidly growing number of publications on COVID-19-related applications of ML. This paper presents a review of recent reports on ML algorithms used in relation to COVID-19. We focus on the potential of ML for two main applications: diagnosis of COVID-19 and prediction of mortality risk and severity, using readily available clinical and laboratory data. Aspects related to algorithm types, training data sets, and feature selection are discussed. As we cover work published between January 2020 and January 2021, a few key points have come to light. The bulk of the machine learning algorithms used in these two applications are supervised learning algorithms. The established models are yet to be used in real-world implementations, and much of the associated research is experimental. The diagnostic and prognostic features discovered by ML models are consistent with results presented in the medical literature. A limitation of the existing applications is the use of imbalanced data sets that are prone to selection bias

    Incremental Ant-Miner Classifier for Online Big Data Analytics

    Internet of Things (IoT) environments produce large amounts of data that are challenging to analyze. The most challenging aspect is reducing the quantity of consumed resources and the time required to retrain a machine learning model as new data records arrive. Therefore, for big data analytics in IoT environments where datasets are highly dynamic and evolve over time, it is highly advised to adopt an online (also called incremental) machine learning model that can analyze incoming data instantaneously, rather than an offline (also called static) model that must be retrained on the entire dataset as new records arrive. The main contribution of this paper is to introduce the Incremental Ant-Miner (IAM), a machine learning algorithm for online prediction based on one of the most well-established machine learning algorithms, Ant-Miner. The IAM classifier tackles the challenge of reducing the time and space overheads associated with classic offline classifiers when they are used for online prediction. IAM can be exploited in managing dynamic environments to ensure timely and space-efficient prediction, achieving high accuracy, precision, recall, and F-measure scores. To show its effectiveness, the proposed IAM was run on six datasets from different domains, namely horse colic, credit cards, flags, ionosphere, and two breast cancer datasets. The performance of the proposed model was compared to ten state-of-the-art classifiers: naive Bayes, logistic regression, multilayer perceptron, support vector machine, K*, adaptive boosting (AdaBoost), bagging, Projective Adaptive Resonance Theory (PART), decision tree (C4.5), and random forest. The experimental results illustrate the superiority of IAM, as it outperformed all the benchmarks in nearly all performance measures. Additionally, when new data records arrive, IAM only needs to be rerun on the new data increment rather than on the entire dataset, which makes it more time- and resource-efficient. These results demonstrate the strong potential and efficiency of the IAM classifier for big data analytics in various areas.
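
The contrast between online (incremental) and offline (static) learning drawn in this abstract can be illustrated with a small sketch. The class below is not Ant-Miner or IAM; it is a count-based categorical naive Bayes classifier, swapped in only because its incremental update is short enough to show in full. The class name, method names, and smoothing choice are assumptions made for illustration.

```python
import math
from collections import defaultdict


class IncrementalNaiveBayes:
    """Illustrative online learner: each update touches only the new records."""

    def __init__(self):
        self.class_counts = defaultdict(int)
        self.feature_counts = defaultdict(int)   # (label, index, value) -> count
        self.feature_values = defaultdict(set)   # index -> values seen so far

    def partial_fit(self, rows, labels):
        # Update the stored counts from the new increment only;
        # earlier records are never revisited.
        for row, label in zip(rows, labels):
            self.class_counts[label] += 1
            for index, value in enumerate(row):
                self.feature_counts[(label, index, value)] += 1
                self.feature_values[index].add(value)

    def predict(self, row):
        total = sum(self.class_counts.values())
        best_label, best_log_prob = None, -math.inf
        for label, n in self.class_counts.items():
            log_prob = math.log(n / total)
            for index, value in enumerate(row):
                k = len(self.feature_values.get(index, ())) or 1
                count = self.feature_counts.get((label, index, value), 0)
                # Add-one (Laplace) smoothing, an assumption of this sketch.
                log_prob += math.log((count + 1) / (n + k))
            if log_prob > best_log_prob:
                best_label, best_log_prob = label, log_prob
        return best_label
```

Because the model state is just a set of counts, calling `partial_fit` on successive increments leaves the model in the same state as fitting once on all records combined, which is the property that removes the need to retrain on the full dataset.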

    Towards Accurate Children’s Arabic Handwriting Recognition via Deep Learning

    Automatic handwriting recognition has received considerable attention over the past three decades. Handwriting recognition systems are useful for a wide range of applications. Much research has been conducted to address the problem in Latin languages. However, less research has focused on the Arabic language, especially on recognizing children’s Arabic handwriting. This task is essential as the demand for educational applications to practice writing and spelling Arabic letters is increasing. Thus, the development of Arabic handwriting recognition systems and applications for children is important. In this paper, we propose two deep learning-based models for the recognition of children’s Arabic handwriting. The proposed models, a convolutional neural network (CNN) and a pre-trained CNN (VGG-16), were trained using Hijja, a recent dataset of Arabic children’s handwriting collected in Saudi Arabia. We also train and test our proposed models using the Arabic Handwritten Character Dataset (AHCD). We compare the performance of the proposed models with similar models from the literature. The results indicate that our proposed CNN outperforms the pre-trained CNN (VGG-16) and the other compared models from the literature. Moreover, we developed Mutqin, a prototype to help children practice Arabic handwriting. The prototype was evaluated by target users, and the results are reported.

    Early Detection of Red Palm Weevil, Rhynchophorus ferrugineus (Olivier), Infestation Using Data Mining

    In the past 30 years, the red palm weevil (RPW), Rhynchophorus ferrugineus (Olivier), a pest that is highly destructive to all types of palms, has rapidly spread worldwide. However, detecting infestation with the RPW is highly challenging because symptoms are not visible until the death of the palm tree is inevitable. In addition, the use of automated RPW identification tools to predict infestation is complicated by a lack of RPW datasets. In this study, we assessed the capability of 10 state-of-the-art data mining classification algorithms, Naive Bayes (NB), KSTAR, AdaBoost, bagging, PART, J48 decision tree, multilayer perceptron (MLP), support vector machine (SVM), random forest, and logistic regression, to use plant-size and temperature measurements collected from individual trees to predict RPW infestation in its early stages, before significant damage is caused to the tree. The performance of the classification algorithms was evaluated in terms of accuracy, precision, recall, and F-measure using a real RPW dataset. The experimental results showed that infestations with RPW can be predicted using data mining with an accuracy of up to 93%, precision above 87%, recall of 100%, and F-measure greater than 93%. Additionally, we found that temperature and circumference are the most important features for predicting RPW infestation. However, we strongly call for collecting and aggregating more RPW datasets to run more experiments to validate these results and provide more conclusive findings.
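
For readers unfamiliar with the four evaluation measures named in this abstract, the sketch below computes them from binary confusion-matrix counts using the standard definitions. The function name and the example counts in the usage note are hypothetical and are not taken from the paper's RPW dataset.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F-measure from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives for the positive class (for example, "infested").
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure (F1) is the harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }
```

With hypothetical counts such as `classification_metrics(tp=8, fp=2, fn=0, tn=10)`, recall is 1.0 because no positive case is missed, while precision is 0.8 because two of the ten positive predictions are wrong. This shows how a classifier can reach perfect recall with lower precision.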

    Role of Optimization in RNA–Protein-Binding Prediction

    RNA-binding proteins (RBPs) play an important role in regulating biological processes, such as gene regulation. Understanding their behavior, for example, their binding sites, can be helpful in understanding RBP-related diseases. Studies have focused on predicting RNA binding by means of machine learning algorithms, including deep convolutional neural network models. One of the integral parts of deep learning modeling is achieving optimal hyperparameter tuning and minimizing a loss function using optimization algorithms. In this paper, we investigate the role of optimization in the RBP classification problem using the CLIP-Seq 21 dataset. Three optimization methods are employed on the RNA–protein binding CNN prediction model, namely grid search, random search, and a Bayesian optimizer. The empirical results show an AUC of 94.42%, 93.78%, 93.23% and 92.68% on the ELAVL1C, ELAVL1B, ELAVL1A, and HNRNPC datasets, respectively, and a mean AUC of 85.30% on 24 datasets. This paper’s findings provide evidence on the role of optimizers in improving the performance of RNA–protein binding prediction.
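
Two of the tuning strategies named in this abstract, grid search and random search, can be sketched in a few lines. The objective `toy_score` is a made-up stand-in for a validation metric and the hyperparameter names `lr` and `dropout` are assumptions; the paper's CNN, search space, and Bayesian optimizer are not reproduced here.

```python
import itertools
import random


def toy_score(lr, dropout):
    # Hypothetical objective standing in for validation AUC;
    # by construction it peaks at lr=0.01, dropout=0.3.
    return 1.0 - (abs(lr - 0.01) * 10 + abs(dropout - 0.3))


def grid_search(grid):
    # Evaluate every combination of the listed candidate values.
    names = list(grid)
    best_score, best_params = None, None
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = toy_score(**params)
        if best_score is None or score > best_score:
            best_score, best_params = score, params
    return best_score, best_params


def random_search(space, n_trials, seed=0):
    # Sample each hyperparameter uniformly from its (low, high) range.
    rng = random.Random(seed)
    best_score, best_params = None, None
    for _ in range(n_trials):
        params = {name: rng.uniform(low, high) for name, (low, high) in space.items()}
        score = toy_score(**params)
        if best_score is None or score > best_score:
            best_score, best_params = score, params
    return best_score, best_params
```

Grid search costs the product of the candidate-list sizes, whereas random search fixes the budget at `n_trials` and can land on values between grid points. A Bayesian optimizer would additionally use earlier scores to choose where to sample next.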