19 research outputs found

Decision Support System Design for Informatics Student Final Projects Using C4.5 Algorithm

Author: Fatoni Hasan
Ramdhania Khairunnisa Fadhilla
Sari Rafika
Publication venue: Universitas Islam 45
Publication date: 31/03/2023

Academic consultation between students and their academic supervisors is necessary to help students carry out academic activities. Grade transcripts show that many students choose a final project/thesis specialization field that does not match their academic abilities, resulting in many inconsistencies between course grades and final project specialization fields. The purpose of this research is to reduce student subjectivity in choosing final project academic supervisors and to reduce the inconsistencies between course grades and final project specialization fields. The method used is classification data mining with the C4.5 decision tree algorithm, with courses, course grades, and specialization courses as attributes. The C4.5 algorithm transforms tabular data into a tree model and then converts the tree model into rules. The C4.5 decision tree algorithm was successfully implemented in the specialization field decision support system, with an accuracy rate of 70% on the total calculation data. The data used in this research are a sample from several senior students in the Informatics program at Ubhara-Jaya. The resulting decision support system can serve as a recommendation for the Informatics program and senior students in directing their final project research. Further research is expected to use more sample data so that the accuracy rate improves and the system can be implemented in website or mobile-based applications.
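As a rough illustration of how C4.5 chooses the attribute to split on, the sketch below computes the gain ratio (information gain divided by split information) for categorical attributes. The records, attribute names, and grade values are hypothetical and are not taken from the study's data; this is a minimal sketch of the criterion, not the authors' implementation.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def gain_ratio(rows, attribute, target):
    """Gain ratio of splitting `rows` on `attribute` with respect to `target`."""
    labels = [row[target] for row in rows]
    # Group the target labels by the value of the candidate attribute.
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attribute], []).append(row[target])
    total = len(rows)
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy([row[attribute] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy records: course grades and the chosen specialization.
students = [
    {"database_grade": "A", "networking_grade": "B", "specialization": "data"},
    {"database_grade": "A", "networking_grade": "A", "specialization": "data"},
    {"database_grade": "C", "networking_grade": "A", "specialization": "network"},
    {"database_grade": "C", "networking_grade": "B", "specialization": "network"},
]

for attr in ("database_grade", "networking_grade"):
    print(attr, gain_ratio(students, attr, "specialization"))
```

In this toy data the database grade separates the two specializations perfectly, so it would be chosen as the root split; each root-to-leaf path of the finished tree then reads as one if-then rule.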

Publikasi Jurnal Universitas Islam 45

Prediction of Metabolic Pathways Involvement in Prokaryotic UniProtKB Data by Association Rule Mining

Author: Boudellioua Imane
Hoehndorf Robert
Martin Maria
Saidi Rabie
Solovyev Victor
Publication venue: Public Library of Science (PLoS)
Publication date: 19/09/2016

The widening gap between known proteins and their functions has encouraged the development of methods to automatically infer annotations. Automatic functional annotation of proteins is expected to meet the conflicting requirements of maximizing annotation coverage while minimizing erroneous functional assignments. This trade-off poses a great challenge in designing intelligent systems for automatic protein annotation. In this work, we present a system that uses rule mining techniques to predict metabolic pathways in prokaryotes. The resulting knowledge represents predictive models that assign pathway involvement to UniProtKB entries. We evaluated the performance of our system using cross-validation. We found that it achieved very promising results in pathway identification, with an F1-measure of 0.982 and an AUC of 0.987. Our prediction models were then successfully applied to 6.2 million UniProtKB/TrEMBL reference proteome entries of prokaryotes. As a result, 663,724 entries were covered, of which 436,510 lacked any previous pathway annotations.
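To make the rule mining idea concrete, the sketch below scores one candidate rule of the form "annotations imply pathway" by its support and confidence, the two standard association rule measures. The entries and the labels `domain_X`, `taxon_Y`, and `pathway_P` are hypothetical placeholders, and the abstract does not state which measures or thresholds the authors used, so this is an assumption-laden illustration only.

```python
def rule_stats(entries, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent.

    support    = fraction of all entries with the antecedent and the consequent
    confidence = fraction of antecedent-matching entries that have the consequent
    """
    antecedent = set(antecedent)
    matching = [e for e in entries if antecedent <= e["annotations"]]
    both = [e for e in matching if consequent in e["pathways"]]
    support = len(both) / len(entries)
    confidence = len(both) / len(matching) if matching else 0.0
    return support, confidence


# Hypothetical annotated entries (placeholder labels, not real UniProtKB data).
entries = [
    {"annotations": {"domain_X", "taxon_Y"}, "pathways": {"pathway_P"}},
    {"annotations": {"domain_X", "taxon_Y"}, "pathways": {"pathway_P"}},
    {"annotations": {"domain_X"}, "pathways": set()},
    {"annotations": {"domain_X", "taxon_Y"}, "pathways": set()},
]

support, confidence = rule_stats(entries, {"domain_X", "taxon_Y"}, "pathway_P")
print(f"support={support:.2f} confidence={confidence:.2f}")
```

A rule that clears chosen support and confidence thresholds could then be applied to unannotated entries whose annotations match the antecedent.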

arXiv.org e-Print Archive

Directory of Open Access Journals

Comparative analysis of machine learning algorithms used in the diagnosis of Cervical cancers

Author: Güneş Ali
Özlen Tolga
Publication venue: Afyon Kocatepe Üniversitesi

Cervical cancer is regarded as one of the cancer types that cause death and have the highest mortality rates. It is the second most common cancer among women after breast cancer. Today, the analysis of biomedical datasets with machine learning methods has become widespread. Prediction systems play an important role in the early diagnosis of malignant diseases such as cancer. Predictions based on identified risk factors for cervical cancer can be consistent. In this study, the performance of machine learning methods used in the diagnosis of cervical cancer was compared. The 23 machine learning algorithms used in the study were tested on a dataset with 838 samples, 32 features, and 4 target variables. In the analysis, which consists of three stages (data preprocessing, feature selection, and classification), classification performance was assessed using the classification accuracy, precision, sensitivity, and F-measure metrics. The analysis determined that the RepTree algorithm was the most successful model.
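The four metrics named in the abstract can all be derived from a binary confusion matrix. The sketch below shows the standard definitions; the counts are made up for illustration and are not results from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, sensitivity (recall), and F-measure
    from true/false positive and negative counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + sensitivity
    # F-measure is the harmonic mean of precision and sensitivity.
    f_measure = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f_measure": f_measure,
    }


# Hypothetical counts for one target variable.
metrics = classification_metrics(tp=8, fp=2, fn=2, tn=88)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With imbalanced classes, as in this made-up example, accuracy alone looks high, which is one reason studies report precision, sensitivity, and F-measure alongside it.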

Afyon Kocatepe Üniversitesi Açık Erişim Sistemi

Deep Sequencing of the Vaginal Microbiota of Women with HIV

Author: A Chao
Andrew D. Fernandes
BE Sha
C Farquhar
CC Wang
CJ Krebs
D Wilkie
DN Fredricks
DR Smith
E Kretschmann
GE Noether
Gregor Reid
Gregory B. Gloor
GT Spear
H Jousimies-Somer
HA David
J Oksanen
J Ravel
Jean M. Macklaim
John Changalucha
K Baisley
L Myer
MA Antonio
R Amsel
R Tamrakar
RC Edgar
RC Martinez
RE Kass
RK Colwell
RP Nugent
Ruben Hummelen
Russell J. Dickson
S Cu-Uvin
S Kullback
S Srinivasan
SC Payne
SE Msuya
SK Lai
SL Hillier
SM Huse
Stefan Bereswill
TE Taha
TJ O'Connor
X Zhou
Publication venue: Public Library of Science
Publication date: 01/08/2010

BACKGROUND: Women living with HIV and co-infected with bacterial vaginosis (BV) are at higher risk for transmitting HIV to a partner or newborn. It is poorly understood which bacterial communities constitute BV or the normal vaginal microbiota among this population and how the microbiota associated with BV responds to antibiotic treatment. METHODS AND FINDINGS: The vaginal microbiota of 132 HIV-positive Tanzanian women, including 39 who received metronidazole treatment for BV, were profiled using Illumina to sequence the V6 region of the 16S rRNA gene. Of note, Gardnerella vaginalis and Lactobacillus iners were detected in each sample, constituting core members of the vaginal microbiota. Eight major clusters were detected with relatively uniform microbiota compositions. Two clusters dominated by L. iners or L. crispatus were strongly associated with a normal microbiota. The L. crispatus-dominated microbiota were associated with low pH, but when L. crispatus was not present, a large fraction of L. iners was required to predict a low pH. Four clusters were strongly associated with BV, and were dominated by Prevotella bivia, Lachnospiraceae, or a mixture of different species. Metronidazole treatment reduced the microbial diversity and perturbed the BV-associated microbiota, but rarely resulted in the establishment of a lactobacilli-dominated microbiota. CONCLUSIONS: Illumina-based microbial profiling enabled high-throughput analyses of microbial samples at a high phylogenetic resolution. The vaginal microbiota among women living with HIV in Sub-Saharan Africa constitutes several profiles associated with a normal microbiota or BV. Recurrence of BV frequently constitutes a different BV-associated profile than before antibiotic treatment.

Public Library of Science (PLOS)

Scholarship@Western

Crossref

Directory of Open Access Journals

PubMed Central

Towards a semi-automatic functional annotation tool based on decision-tree techniques

Author: Azé Jérôme
Bessières Philippe
Froidevaux Christine
Gentils Lucie
Gibrat Jean-François
Loux Valentin
Poupon Anne
Rouveirol Céline
Toffano-Nioche Claire
Publication venue: BioMed Central
Publication date: 01/01/2009

Springer - Publisher Connector

PubMed Central

Towards a semi-automatic functional annotation tool based on decision-tree techniques

Author: A Bairoch
A Clare
A Clare
A Gattiker
A Vinayagam
Anne Poupon
Christine Froidevaux
Claire Toffano-Nioche
Consortium TGO
Céline Rouveirol
E Levy
EM Zdobnov
H Blockeel
H Blockeel
I Moszer
I Tetko
Jean-François Gibrat
Jérôme Azé
K Bryson
K Eilbeck
Lucie Gentils
M van de Guchte
N Cristianini
O Troyanskaya
Philippe Bessières
R Quinlan
S Chaillou
S Kiritchenko
SF Altschul
Valentin Loux
W Kreitschmann
Z Barutcuoglu
Publication venue: Springer Science and Business Media LLC

Crossref

A Computational Strategy for Protein Function Assignment Which Addresses the Multidomain Problem

Author: A. J. Pérez
A. Rodríguez
Agarwal
Altschul
Altschul
Andrade
Andrade
Andrade
Apweiler
Attwood
Bachinsky
Bairoch
Bateman
Bhat
Bork
Branden
Brenner
Brenner
Burset
Corpet
des Jardins
Devos
Doolittle
Eisen
Fleischmann
Floratos
G. Thode
Gellissen
Gerlt
Gracy
Guigó
Hashimoto
Hegyi
Henikoff
Hofmann
Karp
Kretschmann
Marcotte
Murzin
Needleman
Nevill-Manning
O. Trelles
Pearson
Pearson
Pellegrini
Ponting
Rigoutsos
Rigoutsos
Rodriguez
Sander
Smith
Tamames
The Gene Ontology Consortium
The Genome International Sequencing Consortium
Thode
Thornton
Venter
Vuorio
Wilson
Yona
Publication venue: Hindawi Publishing Corporation
Publication date: 01/01/2002

A method for assigning functions to unknown sequences based on finding correlations between short signals and functional annotations in a protein database is presented. This approach is based on keyword (KW) and feature (FT) information stored in the SWISS-PROT database. The former refers to particular protein characteristics and the latter locates these characteristics at a specific sequence position. In this way, a certain keyword is only assigned to a sequence if sequence similarity is found in the position described by the FT field. Exhaustive tests performed over sequences with homologues (cluster set) and without homologues (singleton set) in the database show that assigning functions is much 'cleaner' when information about domains (FT field) is used than when only the keywords are used.
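The position-gated keyword transfer described above can be sketched as a simple interval check: a keyword is transferred only when the region of the database protein that aligns with the query covers the feature's position. The data layout, the function name `transferable_keywords`, the keyword strings, and the choice of full containment (rather than partial overlap) are all assumptions made for illustration; the abstract does not specify these details.

```python
def transferable_keywords(features, aligned_start, aligned_end):
    """Return keywords whose feature region lies entirely inside the
    aligned (similar) region of the annotated protein.

    `features` is a list of dicts with 'keyword', 'start', and 'end'
    (1-based, inclusive positions on the annotated protein).
    """
    return sorted(
        {
            f["keyword"]
            for f in features
            if aligned_start <= f["start"] and f["end"] <= aligned_end
        }
    )


# Hypothetical annotated protein with two located features.
features = [
    {"keyword": "Zinc-finger", "start": 20, "end": 45},
    {"keyword": "Transmembrane", "start": 210, "end": 230},
]

# The query aligns only with residues 10-120 of the annotated protein,
# so only the keyword located in that region is transferred.
print(transferable_keywords(features, aligned_start=10, aligned_end=120))
```

A keyword-only scheme would have transferred both keywords here, which illustrates how a multidomain protein can propagate annotations from a domain the query does not share.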

Crossref

Directory of Open Access Journals

PubMed Central

GrAPFI: predicting enzymatic function of proteins from domain similarity graphs

Author: Aridhi Sabeur
Ritchie David
Sarker Bishnu
Publication venue: Springer Science and Business Media LLC
Publication date: 29/04/2020

This work is dedicated to the memory of David W. Ritchie, who recently passed away. Background: Thanks to recent developments in genomic sequencing technologies, the number of protein sequences in public databases is growing enormously. To enrich and exploit this immensely valuable data, it is essential to annotate these sequences with functional properties such as Enzyme Commission (EC) numbers. The January 2019 release of the UniProt Knowledgebase (UniProtKB) contains around 140 million protein sequences. However, only about half a million of these (UniProtKB/Swiss-Prot) have been reviewed and functionally annotated by expert curators using data extracted from the literature and computational analyses. To reduce the gap between the annotated and unannotated protein sequences, it is essential to develop accurate automatic protein function annotation techniques. Results: In this work, we present GrAPFI (Graph-based Automatic Protein Function Inference) for automatically annotating proteins with EC number functional descriptors from a protein domain similarity graph. We validated the performance of GrAPFI using six reference proteomes in UniProtKB/Swiss-Prot, namely Human, Mouse, Rat, Yeast, E. coli and Arabidopsis thaliana. We also compared GrAPFI with existing EC prediction approaches such as ECPred, DEEPre, and SVMProt. This shows that GrAPFI achieves better accuracy and comparable or better coverage with respect to these earlier approaches. Conclusions: GrAPFI is a novel protein function annotation tool that performs automatic inference on a network of proteins that are related according to their domain composition. Our evaluation of GrAPFI shows that it gives better performance than other state-of-the-art methods. GrAPFI is available at https://gitlab.inria.fr/bsarker/bmc_grapfi.git as a standalone tool written in Python.
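The general idea of inferring a function label from a domain similarity graph can be sketched as a weighted neighbor vote. The code below is not GrAPFI's implementation: the Jaccard similarity, the `threshold` value, the function names, and the domain identifiers `D1` to `D4` are assumptions chosen for illustration, and the EC numbers serve only as example labels.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity between two sets of domain identifiers."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def infer_ec(query_domains, annotated, threshold=0.3):
    """Pick the EC label with the highest similarity-weighted vote among
    annotated proteins whose domain sets resemble the query's.

    `annotated` is a list of (domain_set, ec_label) pairs. Returns None
    when no annotated protein reaches the similarity threshold.
    """
    scores = defaultdict(float)
    for domains, ec in annotated:
        weight = jaccard(query_domains, domains)
        if weight >= threshold:
            scores[ec] += weight
    return max(scores, key=scores.get) if scores else None


# Hypothetical annotated proteins: (domain set, EC label).
annotated = [
    ({"D1", "D2"}, "1.1.1.1"),
    ({"D1", "D2", "D3"}, "1.1.1.1"),
    ({"D4"}, "2.7.11.1"),
]

print(infer_ec({"D1", "D2"}, annotated))  # neighbors sharing D1/D2 vote
print(infer_ec({"D9"}, annotated))        # no similar neighbor, so None
```

Returning None for a query with no sufficiently similar neighbor reflects the coverage versus accuracy trade-off the abstract mentions: a stricter threshold leaves more proteins unannotated but makes each transferred label better supported.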

INRIA a CCSD electronic archive server