14 research outputs found

    Drug/nondrug classification using Support Vector Machines with various feature selection strategies

    With advances in computer technology, virtual screening of small molecules has come into use in drug discovery. Since thousands of compounds are considered in the early phase of drug discovery, a fast classification method that can distinguish between active and inactive molecules can be used for screening large compound collections. In this study, we used Support Vector Machines (SVM) for this type of classification task. SVM is a powerful classification tool that is becoming increasingly popular in various machine-learning applications. The data consist of 631 compounds in the training set and 216 compounds in a separate test set. In the data pre-processing step, Pearson's correlation coefficient was used as a filter to eliminate redundant features. After application of the correlation filter, a single SVM was applied to the reduced data set. Moreover, we investigated the performance of SVM with different feature selection strategies, including SVM-Recursive Feature Elimination, Wrapper Method and Subset Selection. All feature selection methods generally performed better than a single SVM, while Subset Selection outperformed the other feature selection methods. We tested SVM as a classification tool in a real-life drug discovery problem, and our results revealed that it could be a useful method for classification tasks in the early phase of drug discovery. (C) 2014 Elsevier Ireland Ltd. All rights reserved
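
    The correlation filter mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the threshold, the descriptor names or the order in which features are considered, so the 0.9 cut-off, the toy descriptors and the greedy keep-first rule below are all assumptions made for illustration, not the authors' procedure.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant feature: treated as uncorrelated (assumption)
    return sxy / math.sqrt(sxx * syy)

def correlation_filter(columns, threshold=0.9):
    """Greedily keep a feature only if |r| with every kept feature <= threshold.

    The threshold value and the greedy order are illustrative assumptions.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical descriptor table: one list of values per feature.
descriptors = {
    "mol_weight": [1.0, 2.0, 3.0, 4.0, 5.0],
    "heavy_atoms": [2.0, 4.1, 6.0, 8.2, 10.0],  # nearly collinear with mol_weight
    "logp": [0.5, -1.0, 2.0, 0.1, -0.3],
}
kept = correlation_filter(descriptors)
print(kept)
```

    The reduced feature set would then be passed to the classifier; the SVM itself and the three feature selection strategies are outside the scope of this sketch.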

    MVN: An R Package for Assessing Multivariate Normality

    Assessing the assumption of multivariate normality is required by many parametric multivariate statistical methods, such as MANOVA, linear discriminant analysis, principal component analysis, canonical correlation, etc. It is important to assess multivariate normality in order to proceed with such statistical methods. There are many analytical methods proposed for checking multivariate normality. However, deciding which method to use is a challenging process, since each method may give different results under certain conditions. Hence, we may say that there is no best method, which is valid under any condition, for normality checking. In addition to numerical results, it is very useful to use graphical methods to decide on multivariate normality. Combining the numerical results from several methods with graphical approaches can be useful and provide more reliable decisions. Here, we present an R package, MVN, to assess multivariate normality. It contains the three most widely used multivariate normality tests, including Mardia's, Henze-Zirkler's and Royston's, and graphical approaches, including chi-square Q-Q, perspective and contour plots. It also includes two multivariate outlier detection methods, which are based on robust Mahalanobis distances. Moreover, this package offers functions to check the univariate normality of marginal distributions through both tests and plots. Furthermore, especially for non-R users, we provide a user-friendly web application of the package. This application is available at http://www.biosoft.hacettepe.edu.tr/MVN/
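
    The chi-square Q-Q plot, Mardia's kurtosis statistic and the distance-based outlier methods named above all build on squared Mahalanobis distances. The following is a minimal sketch of that quantity for bivariate data only, using the classical (non-robust) mean and the maximum-likelihood covariance. It is not the MVN package's code, and the function name and sample points are hypothetical.

```python
def squared_mahalanobis_2d(points):
    """Squared Mahalanobis distance of each 2-D point from the sample mean.

    Uses the maximum-likelihood covariance (divisor n), as in Mardia's
    statistics. Restricted to two variables so the inverse can be written out.
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # (dx, dy) S^{-1} (dx, dy)^T with S^{-1} = [[syy, -sxy], [-sxy, sxx]] / det
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances

# Hypothetical bivariate sample.
points = [(1.0, 2.0), (2.0, 1.5), (3.0, 3.5), (4.0, 3.0), (5.0, 5.5), (6.0, 4.0)]
d2 = squared_mahalanobis_2d(points)

# Mardia's multivariate kurtosis: the mean of the squared d2 values.
# Under bivariate normality its expected value is close to p(p + 2) = 8.
b2 = sum(d ** 2 for d in d2) / len(d2)
print(d2, b2)
```

    For a chi-square Q-Q plot, the sorted distances would be plotted against chi-square quantiles with 2 degrees of freedom. With the divisor-n covariance used here, the distances always sum to n times the number of variables, which gives a quick sanity check.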

    Association between Dietary Glycaemic Index and Glycaemic Load and Adiposity Indices in Polycystic Ovary Syndrome

    Objective: Obesity is a key contributor to metabolic and reproductive outcomes in polycystic ovary syndrome (PCOS). The relationship between the dietary glycemic index (GI) and load (GL) and adiposity has been debated, and studies on PCOS are scarce. We aimed to compare the dietary GI and GL and several anthropometric measurements in PCOS and control women. The association between dietary GI and GL and adiposity indices was examined in this cross-sectional study. Methods and materials: The study population consisted of 65 women previously diagnosed with PCOS and 65 healthy women. All participants underwent detailed anthropometric, dietary and physical activity evaluation and were categorized based on GI and GL tertiles. Results: When dietary GL was adjusted for age, physical activity level (PAL), and duration of diagnosis, there was a statistically significant inverse association between dietary GL and waist/hip ratio (WHR) (OR: 0.136; 95% CI: 0.021-0.874; p = 0.036) in women with PCOS. Both dietary GI (OR: 8.869; 95% CI: 1.194-65.910; p = 0.033 for tertile 2 in adjustment model) and GL (OR: 7.200; 95% CI: 1.635-31.712; p = 0.009 for tertile 3 in crude model; OR: 5.801; 95% CI: 1.242-27.096; p = 0.025 for tertile 3 in adjustment model) were positively associated with WHR in healthy subjects. Also, a positive association was observed between dietary GI and waist/height ratio (WHtR) (OR: 0.229; 95% CI: 0.063-0.826; p = 0.024 for tertile 2; OR: 0.277; 95% CI: 0.078-0.988; p = 0.048 for tertile 3) in healthy controls; however, after adjustment for age and PAL, statistical significance was lost (OR: 1.051; 95% CI: 0.152-7.261; p = 0.959 for tertile 2; OR: 1.522; 95% CI: 0.225-10.297; p = 0.667 for tertile 3). Conclusion: The results of this study are consistent with the literature showing that PCOS is associated with increased adiposity indices. There was no association between dietary GI/GL and BMI, WC, WHtR, and ABSI, but dietary GL was inversely associated with WHR in PCOS patients

    Applicability of Demirjian's four methods and Willems method for age estimation in a sample of Turkish children

    The aim of this study was to evaluate the applicability of five dental methods, namely Demirjian's original, revised, four teeth, and alternate four teeth methods and Willems method, for age estimation in a sample of Turkish children. Panoramic radiographs of 799 children (412 females, 387 males) aged between 2.20 and 15.99 years were examined by two observers. A repeated measures ANOVA was performed to compare the dental methods across gender and age groups. All five methods overestimated chronological age on average. Among these, Willems method was found to be the most accurate, with overestimations of 0.07 and 0.15 years for males and females, respectively. It was followed by Demirjian's four teeth methods, then the revised and original methods. According to the results, Willems method can be recommended for dental age estimation of Turkish children in forensic applications. (C) 2015 Elsevier Ireland Ltd. All rights reserved

    A preliminary study of dental patterns in panoramic radiography for forensic identification

    Fingerprints, DNA, and dentition are the principal markers used for forensic identification. Frequently used dental characteristics for identification include evidence of dental procedures, such as restorations, root canal therapy, crowns, and extractions. The purposes of this preliminary study were to define dental parameters in panoramic radiographs to generate dental patterns for forensic identification, to evaluate intra- and inter-observer effects on the assessment of these parameters, and to determine the optimum number of parameters to be used in dental coding for diversity studies. In total, 11 dental parameters (virgin, missing, filling, crown, defect, residual root, bridge pontic, dental implant, endodontic treatment, impacted, and dental anomaly) were defined and the details of the coding were shown. Based on the definition of the specified parameters, dental patterns were determined from 169 panoramic radiographs. Overall, intra- and inter-observer agreements were 97.48% and 94.48%, respectively. The effects of each parameter on diversity were evaluated. When 4 and 6 base parameters and all 11 parameters were used, the diversities for full dentition were 99.31%, 99.95%, and 99.95%, respectively. It was concluded that from panoramic radiographs with the 11 specified parameters, an optimum number of 6 parameters (virgin, missing, filling, crown, defect, and impacted) can be used readily and reliably to study the diversity of dental patterns for forensic identification

    geneSurv: An interactive web-based tool for survival analysis in genomics research

    Survival analysis methods are often used in cancer studies. It has been shown that combining clinical data with genomics increases the predictive performance of survival analysis methods. However, this leads to a high-dimensional data problem. Fortunately, new methods have been developed in the last decade to overcome this problem. Still, there is a strong need for an easily accessible, user-friendly and interactive tool to perform survival analysis in the presence of genomics data. We developed an open-source and freely available web-based tool for survival analysis methods that can deal with high-dimensional data. This tool includes classical methods, such as Kaplan-Meier and Cox proportional hazards regression, and advanced methods, such as penalized Cox regression and Random Survival Forests. It also offers an optimal cutoff determination method based on maximizing several test statistics. The tool has a simple and interactive interface, and it can handle high-dimensional data through feature selection and ensemble methods. To dichotomize gene expressions, geneSurv can identify optimal cutoff points. Users can upload their microarray, RNA-Seq, ChIP-Seq, proteomics, metabolomics or clinical data as an n x p data matrix, where n refers to samples and p refers to genes. This tool is available free at www.biosoft.hacettepe.edu.tr/geneSurv. All source code is available at https://github.com/selcukorkmaz/geneSurv under the GPL-3 license
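
    As a reminder of what the simplest of the classical methods listed above computes, here is a minimal sketch of the Kaplan-Meier product-limit estimator. It is not taken from the geneSurv source code; the function name and the toy follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate at each distinct event time.

    times  : follow-up time for each subject
    events : 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, survival_probability) pairs.
    """
    survival = 1.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)  # still under observation at t
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve

# Hypothetical data: six subjects, two of them censored (event = 0).
times = [2, 3, 3, 5, 8, 9]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 3))
```

    Censored subjects count towards the risk set up to their censoring time but never trigger a drop in the curve, which is the property that distinguishes this estimator from a plain proportion of survivors.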

    easyROC: An Interactive Web-tool for ROC Curve Analysis Using R Language Environment

    ROC curve analysis is a fundamental tool for evaluating the performance of a marker in a number of research areas, e.g., biomedicine, bioinformatics and engineering, and is frequently used for discriminating cases from controls. A number of analysis tools are available to guide researchers through their analysis. Some of these tools are commercial and provide basic methods for ROC curve analysis, while others, such as the R environment, offer advanced analysis techniques and a command-based user interface. The R environment includes comprehensive tools for ROC curve analysis; however, using a command-based interface might be challenging and time consuming when a quick evaluation is desired, especially for non-R users such as physicians. Hence, a quick, comprehensive, free and easy-to-use analysis tool is required. For this purpose, we developed a user-friendly web-tool based on the R language. This tool provides ROC statistics, graphical tools, optimal cutpoint calculation, comparison of several markers, and sample size estimation to support researchers in their decisions without writing R code. easyROC can be used via any device with an internet connection, independently of the operating system. The web interface of easyROC is constructed with the R package shiny. This tool is freely available through www.biosoft.hacettepe.edu.tr/easyROC
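
    Two of the quantities listed above, the area under the ROC curve and an optimal cutpoint, can be sketched in a few lines. The abstract does not say which cutpoint criteria easyROC implements; the Youden index used below is one common choice and is an assumption here, as are the function names and the marker values.

```python
def auc(cases, controls):
    """AUC as the probability that a case scores higher than a control.

    Ties count as one half (the Mann-Whitney formulation).
    """
    wins = 0.0
    for a in cases:
        for b in controls:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(cases) * len(controls))

def youden_cutpoint(cases, controls):
    """Cutpoint c maximizing sensitivity + specificity - 1.

    A subject is called positive when marker >= c (assumed direction).
    Returns (cutpoint, youden_index, sensitivity, specificity).
    """
    best = None
    for c in sorted(set(cases) | set(controls)):
        sens = sum(a >= c for a in cases) / len(cases)
        spec = sum(b < c for b in controls) / len(controls)
        j = sens + spec - 1
        if best is None or j > best[1]:
            best = (c, j, sens, spec)
    return best

# Hypothetical marker values.
cases = [3.6, 4.0, 4.4, 5.2, 6.0]
controls = [1.2, 2.0, 2.9, 3.5, 4.1]

area = auc(cases, controls)
cut, j, sens, spec = youden_cutpoint(cases, controls)
print(area, cut, j, sens, spec)
```

    The pairwise AUC loop is quadratic in the sample size, which is fine for illustration; a rank-based computation would be preferable for large data sets.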

    Factors effecting the model performance measures area under the ROC curve, net reclassification improvement and integrated discrimination improvement

    The aim of this study is to investigate the impact of correlation structure, prevalence and effect size on the risk prediction model by using the change in the area under the receiver operating characteristic curve (Delta AUC), net reclassification improvement (NRI), and integrated discrimination improvement (IDI). In the simulation study, the dataset is generated under different correlation structures, prevalences and effect sizes. We verify the simulation results with a real-data application. In conclusion, the correlation structure between the variables should be taken into account when building a multivariable model. A negative correlation structure between independent variables is more beneficial when constructing a model
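
    For readers unfamiliar with the two less common measures, the sketch below computes IDI and the category-free (continuous) NRI from the predicted risks of an old and a new model. The abstract does not state whether the categorical or the category-free NRI was used, so the category-free form is an assumption, and the outcome vector and risks are invented for illustration.

```python
def idi(y, p_old, p_new):
    """Integrated discrimination improvement.

    (mean risk gain among events) minus (mean risk gain among non-events).
    """
    ev = [i for i, v in enumerate(y) if v == 1]
    ne = [i for i, v in enumerate(y) if v == 0]
    gain_events = sum(p_new[i] - p_old[i] for i in ev) / len(ev)
    gain_nonevents = sum(p_new[i] - p_old[i] for i in ne) / len(ne)
    return gain_events - gain_nonevents

def continuous_nri(y, p_old, p_new):
    """Category-free NRI: net upward moves in events plus net downward moves in non-events."""
    ev = [i for i, v in enumerate(y) if v == 1]
    ne = [i for i, v in enumerate(y) if v == 0]
    up_ev = sum(p_new[i] > p_old[i] for i in ev)
    down_ev = sum(p_new[i] < p_old[i] for i in ev)
    up_ne = sum(p_new[i] > p_old[i] for i in ne)
    down_ne = sum(p_new[i] < p_old[i] for i in ne)
    return (up_ev - down_ev) / len(ev) + (down_ne - up_ne) / len(ne)

# Hypothetical outcomes and predicted risks from two nested models.
y = [1, 1, 1, 0, 0, 0]
p_old = [0.6, 0.5, 0.4, 0.4, 0.3, 0.5]
p_new = [0.8, 0.6, 0.3, 0.2, 0.3, 0.4]

idi_value = idi(y, p_old, p_new)
nri_value = continuous_nri(y, p_old, p_new)
print(round(idi_value, 4), round(nri_value, 4))
```

    Both measures reward a new model for raising predicted risk in subjects who experience the event and lowering it in those who do not; Delta AUC would be obtained separately as the difference between the two models' AUC values.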

    Accuracy of the use of radiographic visibility of root pulp in the mandibular third molar as a maturity marker at age thresholds of 18 and 21

    Evaluation of the radiographic visibility of root pulp in mandibular third molars has been suggested as an alternative method for estimation of legal age threshold in living individuals when the root apices are mature. Here, we assessed the accuracy of this method for age thresholds of 18 and 21 years. A sample of 463 panoramic radiographs of individuals aged between 16 and 34 years was examined. The root pulp visibility of the mandibular third molars was scored; the stages ranged from 0 to 3. A receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC) were used to select optimal cut-offs for 18- and 21-year-old thresholds. As prognostic predictors, the selected cut-offs were stages 1 and 2 for the 18- and 21-year-old thresholds of both sexes, respectively. For the 18-year-old threshold, the AUC, sensitivity and specificity were 0.829, 83.1% and 66.7% in females; and 0.930, 89.4% and 90.9% in males, respectively. For the 21-year-old threshold, the AUC, sensitivity and specificity were 0.874, 72.8% and 92.0% in females; and 0.906, 85.5% and 88.2% in males, respectively. The accuracy of the method for estimating the 18- and 21-year-old thresholds ranged from moderate to high. Therefore, the method must be used in conjunction with other age estimation methods, especially to predict whether a female has reached 18 years of age

    Morphometric study of the true S1 and S2 of the normal and dysmorphic sacralized sacra

    Background/aim: This study aimed to generate data for the S1 and S2 alar pedicle and body and the alar orientations for both dysmorphic and normal sacra