Analyzing and Visualizing State Sequences in R with TraMineR

Authors: Alexis Gabadinho; Gilbert Ritschard; Matthias Studer; Nicolas S Müller

This article describes the many capabilities offered by the TraMineR toolbox for categorical sequence data. It focuses more specifically on the analysis and rendering of state sequences. Addressed features include the description of sets of sequences by means of transversal aggregated views, the computation of longitudinal characteristics of individual sequences and the measure of pairwise dissimilarities. Special emphasis is put on the multiple ways of visualizing sequences. The core element of the package is the state sequence object in which we store the set of sequences together with attributes such as the alphabet, state labels and the color palette. The functions can then easily retrieve this information to ensure presentation homogeneity across all printed and graphical displays. The article also demonstrates how TraMineR's outcomes give access to advanced analyses such as clustering and statistical modeling of sequence data.
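The idea of a state sequence object that carries its own alphabet, labels and palette can be illustrated with a minimal Python sketch. This is not TraMineR's API (TraMineR is an R package); the class name `StateSequences`, its fields, the toy data and the use of Hamming distance as the pairwise dissimilarity are all assumptions made for illustration only.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class StateSequences:
    """Hypothetical container: sequences plus the attributes used to display them."""
    sequences: list[tuple[str, ...]]   # one tuple of states per individual
    alphabet: tuple[str, ...]          # ordered set of admissible states
    labels: dict[str, str]             # state code -> long label
    palette: dict[str, str]            # state code -> color (assumed hex strings)

    def state_distribution(self, position: int) -> dict[str, float]:
        """Transversal view: share of each state at one time position."""
        counts = Counter(seq[position] for seq in self.sequences)
        n = len(self.sequences)
        return {state: counts[state] / n for state in self.alphabet}


def hamming(a: tuple[str, ...], b: tuple[str, ...]) -> int:
    """Simple stand-in dissimilarity: number of positions where states differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


# Toy data (invented): S = school, E = employed, U = unemployed.
seqs = StateSequences(
    sequences=[("S", "S", "E", "E"), ("S", "E", "E", "E"), ("S", "S", "S", "U")],
    alphabet=("S", "E", "U"),
    labels={"S": "School", "E": "Employed", "U": "Unemployed"},
    palette={"S": "#1b9e77", "E": "#d95f02", "U": "#7570b3"},
)

dist_t2 = seqs.state_distribution(2)                      # shares at the third position
d01 = hamming(seqs.sequences[0], seqs.sequences[1])       # differ at one position
d02 = hamming(seqs.sequences[0], seqs.sequences[2])       # differ at two positions
```

Because the alphabet, labels and palette travel with the sequences, any summary or plotting function that receives the object can present states in the same order and colors, which is the homogeneity the abstract describes.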

Research Papers in Economics

Efficacy of metabarcoding for identification of fish eggs evaluated with mock communities.

Authors: Burton Ronald S; Duke Elena M
Publication venue: eScholarship, University of California
Publication date: 01/04/2020

There is an urgent need for effective and efficient monitoring of marine fish populations. Monitoring eggs and larval fish may be more informative than traditional fish surveys, since ichthyoplankton surveys reveal the reproductive activities of fish populations, which directly impact their population trajectories. Ichthyoplankton surveys have turned to molecular methods (DNA barcoding and metabarcoding) for identification of eggs and larval fish due to the challenges of morphological identification. In this study, we examine the effectiveness of using metabarcoding methods on mock communities of known fish egg DNA. We constructed six mock communities with known ratios of species. In addition, we analyzed two samples from a large field collection of fish eggs and compared metabarcoding results with traditional DNA barcoding results. We examine the ability of our metabarcoding methods to detect species and the relative proportion of species identified in each mock community. We found that our metabarcoding methods were able to detect species at very low input proportions; however, levels of successful detection depended on the markers used in amplification, suggesting that the use of multiple markers is desirable. Variability in our quantitative results may result from amplification bias as well as interspecific variation in mitochondrial DNA copy number. Our results demonstrate that there remain significant challenges to using metabarcoding for estimating proportional species composition; however, the results provide important insights into how to interpret metabarcoding data. This study will aid in the continuing development of efficient molecular methods of biological monitoring for fisheries management.

eScholarship - University of California

ISOWN: accurate somatic mutation identification in the absence of normal tissue controls.

Authors: Bartlett John MS; Kalatskaya Irina; McPherson John D; Spears Melanie; Stein Lincoln; Trinh Quang M
Publication venue: eScholarship, University of California
Publication date: 01/06/2017

Background: A key step in cancer genome analysis is the identification of somatic mutations in the tumor. This is typically done by comparing the genome of the tumor to the reference genome sequence derived from a normal tissue taken from the same donor. However, there are a variety of common scenarios in which matched normal tissue is not available for comparison. Results: In this work, we describe an algorithm to distinguish somatic single nucleotide variants (SNVs) in next-generation sequencing data from germline polymorphisms in the absence of normal samples using a machine learning approach. Our algorithm was evaluated using a family of supervised learning classifications across six different cancer types and ~1600 samples, including cell lines, fresh frozen tissues, and formalin-fixed paraffin-embedded tissues; we tested our algorithm with both deep targeted and whole-exome sequencing data. Our algorithm correctly classified between 95 and 98% of somatic mutations, with F1-measures ranging from 75.9 to 98.6% depending on the tumor type. We have released the algorithm as a software package called ISOWN (Identification of SOmatic mutations Without matching Normal tissues). Conclusions: In this work, we describe the development, implementation, and validation of ISOWN, an accurate algorithm for predicting somatic mutations in cancer tissues in the absence of matching normal tissues. ISOWN is available as Open Source under Apache License 2.0 from https://github.com/ikalatskaya/ISOWN
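The abstract reports both a share of correctly classified somatic mutations and an F1-measure, and the two can diverge. The short Python sketch below shows how F1 is computed from a confusion matrix. The counts are invented for illustration and are not taken from the ISOWN study; reading the "correctly classified" share as recall is likewise an assumption.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) for the positive class.

    tp: somatic variants called somatic
    fp: germline variants wrongly called somatic
    fn: somatic variants wrongly called germline
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Hypothetical counts: 95 of 100 true somatic variants are recovered,
# but 50 germline variants are also mislabeled as somatic.
precision, recall, f1 = precision_recall_f1(tp=95, fp=50, fn=5)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

With these made-up numbers recall is 0.95 while F1 is roughly 0.78, because F1 also penalizes false positives. A high rate of recovered somatic mutations can therefore sit alongside a noticeably lower F1 when germline variants are miscalled.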

University of Toronto Research Repository

Directory of Open Access Journals

eScholarship - University of California

Memory, language and intellectual ability in low-functioning autism

Authors: Bigham Sally; Boucher J.; Mayes A.
Publication venue: Cambridge University Press (CUP)
Publication date: 01/01/2008

Bournemouth University Research Online

The University of Manchester - Institutional Repository

GOexpress: an R/Bioconductor package for the identification and visualisation of robust gene ontology signatures through supervised learning of gene expression data

Authors: Gordon SV; Hernández B; MacHugh DE; Magee DA; McGettigan PA; Nalpas NC; Parnell AC; Rue-Albrecht K
Publication venue: BioMed Central
Publication date: 25/02/2016

Background: Identification of gene expression profiles that differentiate experimental groups is critical for discovery and analysis of key molecular pathways and also for selection of robust diagnostic or prognostic biomarkers. While integration of differential expression statistics has been used to refine gene set enrichment analyses, such approaches are typically limited to single gene lists resulting from simple two-group comparisons or time-series analyses. In contrast, functional class scoring and machine learning approaches provide powerful alternative methods to leverage molecular measurements for pathway analyses, and to compare continuous and multi-level categorical factors. Results: We introduce GOexpress, a software package for scoring and summarising the capacity of gene ontology features to simultaneously classify samples from multiple experimental groups. GOexpress integrates normalised gene expression data (e.g., from microarray and RNA-seq experiments) and phenotypic information of individual samples with gene ontology annotations to derive a ranking of genes and gene ontology terms using a supervised learning approach. The default random forest algorithm allows interactions between all experimental factors, and competitive scoring of expressed genes to evaluate their relative importance in classifying predefined groups of samples. Conclusions: GOexpress enables rapid identification and visualisation of ontology-related gene panels that robustly classify groups of samples and supports both categorical (e.g., infection status, treatment) and continuous (e.g., time-series, drug concentrations) experimental factors. 
The use of standard Bioconductor extension packages and publicly available gene ontology annotations facilitates straightforward integration of GOexpress within existing computational biology pipelines. Department of Agriculture, Food and the Marine; European Commission - Seventh Framework Programme (FP7); Science Foundation Ireland; University College Dubli

Research Repository UCD

ZENODO

Springer - Publisher Connector

Irish Universities

PubMed Central

Spiral - Imperial College Digital Repository

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY