
78 research outputs found

Stability of Ranked Gene Lists in Large Microarray Analysis Studies

Author: Kokol Peter
Stiglic Gregor
Publication venue: Hindawi Publishing Corporation
Publication date: 01/01/2010

This paper presents an empirical study that aims to explain the relationship between the number of samples and the stability of different gene selection techniques for microarray datasets. Unlike similar studies, where the number of genes in a ranked gene list is variable, this study uses an alternative approach in which stability is observed at different numbers of samples used for gene selection. Three different metrics of stability, including one that is novel in bioinformatics, were used to estimate the stability of the ranked gene lists. The results demonstrate that the univariate selection methods produce significantly more stable ranked gene lists than the multivariate selection methods used in this study. More specifically, thousands of samples are needed for these multivariate selection methods to achieve the same level of stability that any given univariate selection method can achieve with only hundreds.

Crossref

Directory of Open Access Journals

PubMed Central

Digital library of University of Maribor

Gene set enrichment meta-learning analysis: next-generation sequencing versus microarrays

Author: Bajgot Mateja
Kokol Peter
Stiglic Gregor
Publication venue: BioMed Central
Publication date: 01/01/2012

Background: Reproducibility of results can have a significant impact on the acceptance of new technologies in gene expression analysis. With the recent introduction of the so-called next-generation sequencing (NGS) technology alongside established microarrays, one is able to choose between two completely different platforms for gene expression measurements. This study introduces a novel methodology for gene-ranking stability analysis that is applied to the evaluation of gene-ranking reproducibility on NGS and microarray data. Results: The same data used in the well-known MicroArray Quality Control (MAQC) study was also used in this study to compare ranked lists of genes from MAQC samples A and B, obtained from the Affymetrix HG-U133 Plus 2.0 and Roche 454 Genome Sequencer FLX platforms. An initial evaluation, in which the percentage of overlapping genes was observed, demonstrates higher reproducibility on microarray data in 10 out of 11 gene-ranking methods. A gene set enrichment analysis shows similar enrichment of top gene sets when NGS is compared with microarrays on a pathway level. Our novel approach demonstrates high accuracy of decision trees when used for knowledge extraction from multiple bootstrapped gene set enrichment analysis runs. A comparison of the two approaches in sample preparation for high-throughput sequencing shows that alternating decision trees represent the optimal knowledge representation method in comparison with classical decision trees. Conclusions: Usual reproducibility measurements are mostly based on statistical techniques that offer very limited biological insight into the studied gene expression data sets. This paper introduces meta-learning-based gene set enrichment analysis, which can be used to complement gene-ranking stability estimation techniques such as the percentage of overlapping genes or classic gene set enrichment analysis. It is useful and practical when the reproducibility of gene ranking results or of different gene selection techniques is observed. The proposed method reveals very accurate descriptive models that capture the co-enrichment of gene sets which are differently enriched in the compared data sets.
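
The percentage of overlapping genes mentioned in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `percent_overlap`, the cutoff `k`, and the gene lists are assumptions chosen purely for illustration, and the abstract does not say which cutoff or normalisation the study used.

```python
def percent_overlap(ranked_a, ranked_b, k):
    """Percentage of genes shared by the top-k entries of two ranked lists."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_a = set(ranked_a[:k])
    top_b = set(ranked_b[:k])
    # Dividing by k assumes both lists contain at least k genes.
    return 100.0 * len(top_a & top_b) / k


# Hypothetical rankings of the same genes on two platforms.
microarray_rank = ["TP53", "BRCA1", "EGFR", "MYC", "KRAS"]
ngs_rank = ["BRCA1", "TP53", "KRAS", "PTEN", "EGFR"]

print(percent_overlap(microarray_rank, ngs_rank, 3))
print(percent_overlap(microarray_rank, ngs_rank, 5))
```

A higher value at a given `k` means the two rankings agree more closely on which genes belong at the top, regardless of their exact order within the top `k`.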

Springer - Publisher Connector

PubMed Central

Digital library of University of Maribor

R you ready? Using the R programme for statistical analysis and graphics

Author: Cilar Leona
Stiglic Gregor
Watson Roger
Publication venue: 'Wiley'
Publication date: 14/10/2019

© 2019 Wiley Periodicals, Inc. For conducting research, nurses typically use commercial statistical packages. R software is a free, powerful, and flexible alternative, but is less familiar and used less frequently in nursing research. In this paper, we use data from a previous study to demonstrate a few typical steps in exploratory data analysis using R. A step-by-step description of some basic analyses in R is provided here, including examples of specific functions to read and manipulate the data, calculate scores from individual questionnaire items, and prepare a correlation plot and summary table.

Repository@Hull - Worktribe

Development and validation of the type 2 diabetes mellitus 10-year risk score prediction models from survey data

Author: Cilar Leona
Sheikh Aziz
Stiglic Gregor
Wang Fei
Publication venue: 'Elsevier BV'
Publication date: 22/04/2021

Edinburgh Research Explorer

Stability Selection using a Genetic Algorithm and Logistic Linear Regression on Healthcare Records

Author: Hrovat Goran
Stiglic Gregor
Zamuda Aleš
Zarges Christine
Publication venue
Publication date: 20/03/2017

This paper presents a Genetic Algorithm (GA) application to measuring feature importance in machine learning (ML) from a large-scale database. Too many input features may cause over-fitting; therefore, feature selection is desirable. Some ML algorithms have feature selection embedded, e.g., lasso penalized linear regression or random forests. Others do not include such functionality and are sensitive to over-fitting, e.g., unregularized linear regression. The latter algorithms require that proper features are chosen before learning. Therefore, we propose a novel stability selection (SS) approach using GA-based feature selection. The proposed SS approach iteratively applies GA on a subsample of records and features. Each GA individual represents a binary vector of selected features in the subsample. An unregularized logistic linear regression model is then trained and tested using GA-selected features through cross-validation of the subsamples. GA fitness is evaluated by area under the curve (AUC) and optimized during a GA run. AUC is assessed with an unregularized logistic regression model on multiple-subsampled healthcare records, collected under the Healthcare Cost and Utilization Project (HCUP), utilizing the National (Nationwide) Inpatient Sample (NIS) database. Reported results show that averaging feature importance from top-4 SS and the SS using GA (GASS) improves these AUC results.
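
The general idea in the abstract above (a GA over binary feature masks, repeated across subsamples, with selection frequencies aggregated) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' GASS implementation: the GA operators, all parameter values, and every function name are hypothetical, and a toy fitness function stands in for the cross-validated AUC of an unregularized logistic regression on subsampled records.

```python
import random


def run_ga(n_features, fitness, rng, pop_size=20, generations=30, mutation_rate=0.05):
    """Evolve binary feature masks and return the fittest one found."""
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]           # keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)      # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit with a small probability (mutation).
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)


def stability_selection(n_features, make_fitness, n_rounds=25, seed=0):
    """Fraction of rounds in which each feature appears in the best mask."""
    rng = random.Random(seed)
    counts = [0] * n_features
    for _ in range(n_rounds):
        fitness = make_fitness(rng)   # stands in for drawing a new subsample
        best = run_ga(n_features, fitness, rng)
        for i, bit in enumerate(best):
            counts[i] += bit
    return [c / n_rounds for c in counts]


def make_toy_fitness(rng, informative=(0, 1, 2), n_features=8):
    """Hypothetical fitness: rewards informative features, penalises the rest."""
    weights = [(1.0 if i in informative else -0.5) + rng.gauss(0, 0.3)
               for i in range(n_features)]
    return lambda mask: sum(w * b for w, b in zip(weights, mask))


frequencies = stability_selection(8, make_toy_fitness)
print([round(f, 2) for f in frequencies])
```

Features that are selected in most rounds are treated as stable; in a real setting, `make_fitness` would draw a subsample of records and features and return a function that scores a mask by cross-validated AUC.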

Crossref

Aberystwyth Research Portal

Early detection of type 2 diabetes mellitus using machine learning-based prediction models

Author: Cilar Leona
Kocbek Primoz
Kopitar Leon
Sheikh Aziz
Stiglic Gregor
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 20/07/2020

Edinburgh Research Explorer

Perceptions of caring between Slovene and Russian members of nursing teams

Author: Kasimovskaya Natalia
Pajnkihar Majda
Stiglic Gregor
Vrbnjak Dominika
Watson Roger
Publication venue: 'SAGE Publications'
Publication date: 01/07/2018

Purpose: To measure the perceptions of caring between Slovene and Russian members of nursing teams and compare the results with earlier findings in other European Union (EU) countries. Methods: A cross-sectional study that included nurses and nursing assistants in Slovenia (n = 294) and Russia (n = 531). Data were collected using the 25-item Caring Dimensions Inventory. Results: The most endorsed item for Slovene and Russian members of nursing teams was an item related to medication administration. All items that were endorsed by Russian participants were also endorsed by Slovenian participants; however, they ascribed a different level of importance to individual aspects of caring. Discussion: Compared with other EU countries, such as the UK and Spain, Slovenian and Russian members of nursing teams endorsed more technical aspects of nursing duties as caring, suggesting cultural differences and previous influences of the biomedical model on nursing education and practice.

Repository@Hull - Worktribe

Challenges associated with missing data in electronic health records: A case study of a risk prediction model for diabetes using data from Slovenian primary care

Author: Aziz Sheikh
Gregor Stiglic
Majda Pajnkihar
Nino Fijacko
Primoz Kocbek
Srinivasan K
Publication venue: 'SAGE Publications'
Publication date: 13/10/2017

Crossref

Edinburgh Research Explorer

The KIDSCREEN-27 scale: translation and validation study of the Slovenian version

Author: Barr Owen
Budler Leona Cilar
Pajnkihar Majda
Ravens-Sieberer Ulrike
Stiglic Gregor
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2022

Background: There are many methods available for measuring social support and quality of life (QoL) of adolescents; of these, the KIDSCREEN tools are the most widely used. Thus, we aimed to translate and validate the KIDSCREEN-27 scale for use among adolescents aged between 10 and 19 years in Slovenia. Methods: A cross-sectional study was conducted among 2852 adolescents in primary and secondary school from November 2019 to January 2020 in Slovenia. A 6-step method of validation was used to test the psychometric properties of the KIDSCREEN-27 scale. We checked descriptive statistics and performed a Mokken scale analysis, parametric item response theory, factor analysis, classical test theory and total (sub)scale scores. Results: All five subscales of the KIDSCREEN-27 formed a unidimensional scale with good homogeneity and reliability. The confirmatory factor analysis showed poor fit in user model versus baseline model metrics (CFI = 0.847, TLI = 0.862) and good fit in root mean square error (RMSEA = 0.072, p(χ2) < 0.001). Scale reliability was calculated using Cronbach's α (0.93), beta (0.86), G6 (0.95) and omega (0.93). Conclusions: The questionnaire showed average psychometric properties and can be used among adolescents in Slovenia to find out about their quality of life. Further research is needed to explore why fit in user model metrics is poor.
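
For readers unfamiliar with Cronbach's α, one of the reliability coefficients reported in the abstract above, the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of total scores) can be sketched in Python. This is only an illustration of the textbook definition: the function name and the tiny response matrix are assumptions, and nothing here reproduces the study's data or software.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("at least two items are required")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: three respondents, two items.
responses = [[1, 2], [2, 1], [3, 3]]
print(round(cronbach_alpha(responses), 3))
```

Values closer to 1 indicate that the items vary together, which is usually read as higher internal consistency of the scale.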

Digital library of University of Maribor

Ulster University's Research Portal