78 research outputs found

    Klynger (Clusters)

    Not all datasets have explanatory variables and outcomes. Even so, the data may contain relationships that are useful to uncover. In the 2010s, the Intervention Centre (Intervensjonssenteret) at Rikshospitalet worked on developing a computer algorithm that could automatically find tumors in a radiological image. The output of the algorithm was a two-dimensional geometric shape: the outline of a tumor. Whether the algorithm worked was determined by comparing the outline from the automatic method with outlines drawn manually by four experienced radiologists. A geometric shape is mathematics, but it is not a number, and comparing tumor outlines required a different quantitative approach than traditional statistical methods.

    Limitations of mRNA amplification from small-size cell samples

    BACKGROUND: Global mRNA amplification has become a widely used approach to obtain gene expression profiles from limited material. An important concern is the reliable reflection of the starting material in the results obtained. This is especially important with extremely low quantities of input RNA, where stochastic effects due to template dilution may be present. This aspect remains under-documented in the literature, as quantitative measures of data reliability are most often lacking. To address this issue, we examined the sensitivity levels of each transcript in three different cell sample sizes. Analysis of variance (ANOVA) was used to estimate the overall effects of reduced input RNA in our experimental design. In order to estimate the validity of decreasing sample sizes, we examined the sensitivity levels of each transcript by applying a novel model-based method, TransCount. RESULTS: From expression data, TransCount provided estimates of absolute transcript concentrations in each examined sample. The results from TransCount were used to calculate the Pearson correlation coefficient between transcript concentrations for different sample sizes. The correlations were clearly dependent on transcript copy number. A critical level was observed where stochastic fluctuations became significant. The analysis allowed us to pinpoint the gene-specific number of transcript templates that defined the limit of reliability with respect to the number of cells from that particular source. In the sample amplified from 1000 cells, transcripts expressed with at least 121 transcripts/cell were statistically reliable, and for 250 cells the limit was 1806 transcripts/cell. Above these thresholds, correlation between our data sets was at acceptable values for reliable interpretation. CONCLUSION: These results imply that the reliability of any amplification experiment must be validated empirically to justify that any gene exists in sufficient quantity in the input material. This finding has important implications for any experiment where only extremely small samples, such as single cell analyses or laser captured microdissected cells, are available.
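The correlation step mentioned in this abstract can be illustrated with a minimal sketch. It only shows how a Pearson correlation coefficient between two sets of concentration estimates is computed; the function name `pearson` and the numbers are hypothetical and are not taken from TransCount or from the study's data.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical concentration estimates for the same five transcripts,
# measured from a larger and a smaller cell sample (illustrative values only).
conc_large = [1.0, 2.0, 3.0, 4.0, 5.0]
conc_small = [1.2, 1.9, 3.3, 3.8, 5.1]
r = pearson(conc_large, conc_small)
print(f"r = {r:.3f}")
```

Computing such a coefficient separately for groups of transcripts with similar copy numbers is one way to see a copy-number dependence of the kind the abstract reports.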

    Effects of mRNA amplification on gene expression ratios in cDNA experiments estimated by analysis of variance

    BACKGROUND: A limiting factor of cDNA microarray technology is the need for a substantial amount of RNA per labeling reaction. Thus, 20–200 µg of total RNA or 0.5–2 µg of poly(A) RNA is typically required for monitoring gene expression. In addition, gene expression profiles from large, heterogeneous cell populations provide complex patterns from which biological data for the target cells may be difficult to extract. In this study, we chose to investigate a widely used mRNA amplification protocol that allows gene expression studies to be performed on samples with limited starting material. We present a quantitative study of the variation and noise present in our data set obtained from experiments with either amplified or non-amplified material. RESULTS: Using analysis of variance (ANOVA) and multiple hypothesis testing, we estimated the impact of amplification on the preservation of gene expression ratios. Both methods showed that the gene expression ratios were not completely preserved between amplified and non-amplified material. We also compared, for each gene, the expression ratios between the two cell lines for the amplified material with the expression ratios between the two cell lines for the non-amplified material. With the aid of multiple t-testing with a false discovery rate of 5%, we found that 10% of the genes investigated showed significantly different expression ratios. CONCLUSION: Although the ratios were not fully preserved, amplification may prove to be extremely useful with respect to characterizing low-expressing genes.
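The abstract does not say which procedure was used to control the false discovery rate. As an illustration only, the sketch below assumes the Benjamini-Hochberg step-up procedure, a common choice; the function name and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans, True where the hypothesis is rejected.

    Assumes the Benjamini-Hochberg step-up procedure; the source abstract
    does not state which FDR method was actually used.
    """
    m = len(p_values)
    # Indices of the p-values in ascending order
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * fdr
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected

# Hypothetical per-gene p-values from t-tests (illustrative values only)
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
significant = benjamini_hochberg(p_vals, fdr=0.05)
print(sum(significant), "of", len(p_vals), "genes flagged")
```

Note that the procedure is step-up: a p-value that misses its own threshold is still rejected if a larger p-value further down the ranking meets its threshold.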

    Functional studies on transfected cell microarray analysed by linear regression modelling

    Transfected cell microarray is a promising method for accelerating the functional exploration of the genome, giving information about protein function in the living cell. The microarrays consist of clusters of cells (spots) overexpressing or silencing a particular gene product. The phenotypic consequences of such perturbations can then be analysed using cell-based assays. The focus in the present study was to establish an experimental design and a robust analysis approach for fluorescence intensity data, and to address the use of replicates for studying regulation of gene expression with varying complexity and effect size. Our analysis pipeline includes measurement of fluorescence intensities, normalization strategies using negative control spots and internal control plasmids, and linear regression (ANOVA) modelling for estimating biological effects and calculating P-values for comparisons of interest. Our results show the potential of transfected cell microarrays in studying complex regulation of gene expression by enabling measurement of biological responses in cells with overexpression and downregulation of specific gene products, combined with the possibility of assaying the effects of external stimuli. Simulation experiments show that transfected cell microarrays can be used to reliably detect even quantitatively minor biological effects by including several technical and experimental replicates.

    GeneTools – application for functional annotation and statistical hypothesis testing

    BACKGROUND: Modern biology has shifted from "one gene" approaches to methods for genomic-scale analysis like microarray technology, which allow simultaneous measurement of thousands of genes. This has created a need for tools facilitating interpretation of biological data in "batch" mode. However, such tools often leave the investigator with large volumes of apparently unorganized information. To meet this interpretation challenge, gene-set, or cluster, testing has become a popular analytical tool. Many gene-set testing methods and software packages are now available, most of which use a variety of statistical tests to assess the genes in a set for biological information. However, the field is still evolving, and there is a great need for "integrated" solutions. RESULTS: GeneTools is a web service providing access to a database that brings together information from a broad range of resources. The annotation data are updated weekly, so that users get the most recently available data. Data submitted by the user are stored in the database, where they can easily be updated, shared between users and exported in various formats. GeneTools provides three different tools: i) NMC Annotation Tool, which offers annotations from several databases like UniGene, Entrez Gene, SwissProt and GeneOntology, in both single- and batch-search mode. ii) GO Annotator Tool, where users can add new gene ontology (GO) annotations to genes of interest. These user-defined GO annotations can be used in further analysis or exported for public distribution. iii) eGOn, a tool for visualization and statistical hypothesis testing of GO category representation. As the first GO tool, eGOn supports hypothesis testing for three different situations (master-target situation, mutually exclusive target-target situation and intersecting target-target situation). An important additional function is an evidence-code filter that allows users to select the GO annotations for the analysis. CONCLUSION: GeneTools is the first "all in one" annotation tool, providing users with a rapid extraction of highly relevant gene annotation data for e.g. thousands of genes or clones at once. It allows a user to define and archive new GO annotations and it supports hypothesis testing related to GO category representations. GeneTools is freely available through www.genetools.n
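The abstract does not state which statistical tests eGOn applies. Purely as an illustration of testing GO category representation in a master-target setting (a target gene list drawn from a larger master list), the sketch below assumes a one-sided hypergeometric test, which is a common choice for over-representation. The function name and all counts are hypothetical.

```python
from math import comb

def hypergeom_upper_tail(master_size, master_in_category, target_size, target_in_category):
    """P(X >= target_in_category) under the hypergeometric distribution.

    X is the number of genes annotated to a GO category when target_size
    genes are drawn without replacement from a master list of master_size
    genes, master_in_category of which carry the annotation.
    """
    upper = min(master_in_category, target_size)
    total = comb(master_size, target_size)
    favourable = sum(
        comb(master_in_category, i)
        * comb(master_size - master_in_category, target_size - i)
        for i in range(target_in_category, upper + 1)
    )
    return favourable / total

# Hypothetical counts: 1000 genes on the master list, 40 in the category;
# the target list has 50 genes, 8 of which are in the category.
p = hypergeom_upper_tail(1000, 40, 50, 8)
print(f"over-representation p-value: {p:.2e}")
```

A small p-value here would suggest that the category is more frequent in the target list than expected by chance. The two target-target situations named in the abstract need different tests and are not covered by this sketch.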

    Discrimination and Classification

    The aim of this report is to present methods from statistics, neural networks, nonparametric regression and pattern recognition to perform discrimination and classification. The methods are compared on theoretical and empirical grounds to highlight strengths and weaknesses. A common platform for classification is also outlined. The emphasis is on multiple (more than two) classes. Some keywords: Supervised Classification; Discriminant Analysis; Multiple Classes; Multilayer Perceptrons (MLP).
    Contents: 1 Introduction; 2 Classification; 2.1 Decision Theoretic Framework; 2.2 Allocation Principles; 2.3 Discriminant Functions; 2.4 Constructing Classifiers; 2.5 Evaluation Principles; ...
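The report itself is not available here, so the following is only a generic sketch of how discriminant functions and an allocation rule can work for more than two classes. It assumes univariate Gaussian class densities with known means, variances and priors, and allocates an observation to the class with the largest log posterior score. All names and parameter values are hypothetical and are not taken from the report.

```python
import math

def gaussian_log_discriminant(x, mean, var, prior):
    """log(prior) + log N(x | mean, var): a discriminant score for one class."""
    return (
        math.log(prior)
        - 0.5 * math.log(2 * math.pi * var)
        - (x - mean) ** 2 / (2 * var)
    )

def allocate(x, classes):
    """Allocate x to the class with the largest discriminant score.

    classes maps a class name to a (mean, variance, prior) tuple.
    """
    return max(classes, key=lambda c: gaussian_log_discriminant(x, *classes[c]))

# Three hypothetical classes with equal priors and equal variances
classes = {
    "A": (0.0, 1.0, 1 / 3),
    "B": (3.0, 1.0, 1 / 3),
    "C": (6.0, 1.0, 1 / 3),
}
print(allocate(0.2, classes), allocate(2.9, classes), allocate(7.0, classes))
```

With equal variances and equal priors, this rule reduces to picking the nearest class mean; unequal priors or variances shift the decision boundaries.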