1,280 research outputs found

    Personalized copy number and segmental duplication maps using next-generation sequencing

    Despite their importance in gene innovation and phenotypic variation, duplicated regions have remained largely intractable owing to difficulties in accurately resolving their structure, copy number and sequence content. We present an algorithm (mrFAST) to comprehensively map next-generation sequence reads, which allows for the prediction of absolute copy-number variation of duplicated segments and genes. We examine three human genomes and experimentally validate genome-wide copy number differences. We estimate that, on average, 73–87 genes vary in copy number between any two individuals and find that these genic differences overwhelmingly correspond to segmental duplications (odds ratio = 135; P < 2.2 × 10^-16). Our method can distinguish between different copies of highly identical genes, providing a more accurate assessment of gene content and insight into functional constraint without the limitations of array-based technology.

    Categorization of compensatory motions in transradial myoelectric prosthesis users

    Background: Prosthesis users perform various compensatory motions to accommodate the loss of the hand and wrist as well as the reduced functionality of a prosthetic hand. Objectives: Investigate different compensation strategies that are performed by prosthesis users. Study Design: Comparative analysis. Methods: 20 able-bodied subjects and 4 prosthesis users performed a set of bimanual activities. Movements of the trunk and head were recorded using a motion capture system and a digital video recorder. Clinical motion angles were calculated to assess the compensatory motions made by the prosthesis users. The video recording also assisted in visually identifying the compensations. Results: Compensatory motions by the prosthesis users were evident in the tasks performed (slicing and stirring activities) as compared to the benchmark of able-bodied subjects. Compensations took the form of a measured increase in range of motion, an observed adoption of a new posture during task execution, and pre-positioning of items in the workspace prior to initiating a given task. Conclusion: Compensatory motions were performed by prosthesis users during the selected tasks. These can be categorized into three different types of compensations.

    A cross-sectional study of Mycoplasma genitalium infection and correlates in women undergoing population-based screening or clinic-based testing for Chlamydia infection in London

    To determine Mycoplasma genitalium infection and correlates among young women undergoing population-based screening or clinic-based testing for Chlamydia infection

    On the power and the systematic biases of the detection of chromosomal inversions by paired-end genome sequencing

    One of the most used techniques to study structural variation at a genome level is paired-end mapping (PEM). PEM has the advantage of being able to detect balanced events, such as inversions and translocations. However, inversions are still quite difficult to predict reliably, especially from high-throughput sequencing data. We simulated realistic PEM experiments with different combinations of read and library fragment lengths, including sequencing errors and meaningful base-qualities, to quantify and track down the origin of false positives and negatives along sequencing, mapping, and downstream analysis. We show that PEM is very appropriate to detect a wide range of inversions, even with low coverage data. However, % of inversions located between segmental duplications are expected to go undetected by the most common sequencing strategies. In general, longer DNA libraries improve the detectability of inversions far better than increments of the coverage depth or the read length. Finally, we review the performance of three algorithms to detect inversions (SVDetect, GRIAL, and VariationHunter), identify common pitfalls, and reveal important differences in their breakpoint precisions. These results stress the importance of the sequencing strategy for the detection of structural variants, especially inversions, and offer guidelines for the design of future genome sequencing projects.
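
    As a rough illustration of why paired-end mapping can flag inversions, the sketch below classifies a single mapped read pair from a forward-reverse library. It is not taken from SVDetect, GRIAL, or VariationHunter; the function name, the labels, and the insert-size bounds are hypothetical, and the span between mapping positions is used only as a crude stand-in for insert size.

```python
def classify_pair(strand1, strand2, pos1, pos2, min_insert, max_insert):
    """Classify one mapped read pair from a forward-reverse (FR) library.

    Illustrative heuristic only: real callers cluster many pairs and
    model the insert-size distribution before predicting a variant.
    """
    # Both mates on the same strand is the classic inversion signature.
    if strand1 == strand2:
        return "inversion_candidate"

    # Order the mates by mapping position.
    (left_pos, left_strand), (right_pos, _) = sorted(
        [(pos1, strand1), (pos2, strand2)]
    )

    # In an FR library the leftmost mate is expected on the '+' strand.
    if left_strand != "+":
        return "other_discordant"

    span = right_pos - left_pos
    if span < min_insert or span > max_insert:
        return "discordant_span"
    return "concordant"


if __name__ == "__main__":
    # Hypothetical pairs, with an assumed 200-500 bp library.
    print(classify_pair("+", "-", 100, 400, 200, 500))
    print(classify_pair("+", "+", 100, 400, 200, 500))
```

    A single same-strand pair is weak evidence on its own; the abstract's point about library length concerns how many such pairs span a breakpoint at all.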

    Different atmospheric moisture divergence responses to extreme and moderate El Niños

    On seasonal and inter-annual time scales, vertically integrated moisture divergence provides a useful measure of the tropical atmospheric hydrological cycle. It reflects the combined dynamical and thermodynamical effects, and is not subject to the limitations that afflict observations of evaporation minus precipitation. An empirical orthogonal function (EOF) analysis of the tropical Pacific moisture divergence fields calculated from the ERA-Interim reanalysis reveals the dominant effects of the El Niño-Southern Oscillation (ENSO) on inter-annual time scales. Two EOFs are necessary to capture the ENSO signature, and regression relationships between their Principal Components and indices of equatorial Pacific sea surface temperature (SST) demonstrate that the transition from strong La Niña through to extreme El Niño events is not a linear one. The largest deviation from linearity is for the strongest El Niños, and we interpret that this arises at least partly because the EOF analysis cannot easily separate different patterns of responses that are not orthogonal to each other. To overcome the orthogonality constraints, a self-organizing map (SOM) analysis of the same moisture divergence fields was performed. The SOM analysis captures the range of responses to ENSO, including the distinction between the moderate and strong El Niños identified by the EOF analysis. The work demonstrates the potential for the application of SOM to large scale climatic analysis, by virtue of its easier interpretation, relaxation of orthogonality constraints and its versatility for serving as an alternative classification method. Both the EOF and SOM analyses suggest a classification of “moderate” and “extreme” El Niños by their differences in the magnitudes of the hydrological cycle responses, spatial patterns and evolutionary paths. Classification from the moisture divergence point of view shows consistency with results based on other physical variables such as SST
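
    The EOF step described above can be sketched on a toy field. The example below is a minimal sketch that finds only the leading EOF by power iteration on the spatial covariance matrix; it omits the area weighting and full decomposition a real reanalysis study would use, and the function name and sample data are hypothetical.

```python
import math


def leading_eof(field, iterations=200):
    """Return (pattern, principal components, explained variance fraction).

    field: list of time steps, each a list of values at the grid points.
    """
    n_time, n_space = len(field), len(field[0])

    # Anomalies: remove the time mean at every grid point.
    means = [sum(row[j] for row in field) / n_time for j in range(n_space)]
    anomalies = [[row[j] - means[j] for j in range(n_space)] for row in field]

    # Spatial covariance matrix (grid point by grid point).
    cov = [
        [sum(a[i] * a[j] for a in anomalies) / (n_time - 1)
         for j in range(n_space)]
        for i in range(n_space)
    ]

    # Power iteration converges to the dominant eigenvector, provided the
    # starting vector is not orthogonal to it.
    vec = [float(j + 1) for j in range(n_space)]
    for _ in range(iterations):
        new = [sum(cov[i][j] * vec[j] for j in range(n_space))
               for i in range(n_space)]
        norm = math.sqrt(sum(x * x for x in new))
        if norm == 0:
            raise ValueError("field has no variance along the start vector")
        vec = [x / norm for x in new]

    # Rayleigh quotient gives the eigenvalue (variance along the EOF).
    eigenvalue = sum(
        vec[i] * sum(cov[i][j] * vec[j] for j in range(n_space))
        for i in range(n_space)
    )
    total_variance = sum(cov[i][i] for i in range(n_space))

    # Principal components: projection of each time step onto the EOF.
    pcs = [sum(a[j] * vec[j] for j in range(n_space)) for a in anomalies]
    return vec, pcs, eigenvalue / total_variance


if __name__ == "__main__":
    # Hypothetical field: 4 time steps at 3 grid points.
    toy = [[1, 2, 0], [2, 4, 0], [3, 6, 0], [4, 8, 0]]
    pattern, pcs, fraction = leading_eof(toy)
    print(pattern, pcs, fraction)
```

    Because successive EOFs are forced to be mutually orthogonal, patterns that are not orthogonal get mixed across modes, which is the limitation the abstract cites as motivation for the SOM analysis.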

    Beyond Volume: The Impact of Complex Healthcare Data on the Machine Learning Pipeline

    From medical charts to national census, healthcare has traditionally operated under a paper-based paradigm. However, the past decade has marked a long and arduous transformation bringing healthcare into the digital age. Ranging from electronic health records, to digitized imaging and laboratory reports, to public health datasets, healthcare now generates an incredible amount of digital information. Such a wealth of data presents an exciting opportunity for integrated machine learning solutions to address problems across multiple facets of healthcare practice and administration. Unfortunately, the ability to derive accurate and informative insights requires more than the ability to execute machine learning models. Rather, a deeper understanding of the data on which the models are run is imperative for their success. While a significant effort has been undertaken to develop models able to process the volume of data obtained during the analysis of millions of digitalized patient records, it is important to remember that volume represents only one aspect of the data. In fact, drawing on data from an increasingly diverse set of sources, healthcare data presents an incredibly complex set of attributes that must be accounted for throughout the machine learning pipeline. This chapter focuses on highlighting such challenges, and is broken down into three distinct components, each representing a phase of the pipeline. We begin with attributes of the data accounted for during preprocessing, then move to considerations during model building, and end with challenges to the interpretation of model output. For each component, we present a discussion around data as it relates to the healthcare domain and offer insight into the challenges each may impose on the efficiency of machine learning techniques.
    Comment: Healthcare Informatics, Machine Learning, Knowledge Discovery: 20 Pages, 1 Figure

    Systematic Inference of Copy-Number Genotypes from Personal Genome Sequencing Data Reveals Extensive Olfactory Receptor Gene Content Diversity

    Copy-number variations (CNVs) are widespread in the human genome, but comprehensive assignments of integer locus copy-numbers (i.e., copy-number genotypes) that, for example, enable discrimination of homozygous from heterozygous CNVs, have remained challenging. Here we present CopySeq, a novel computational approach with an underlying statistical framework that analyzes the depth-of-coverage of high-throughput DNA sequencing reads, and can incorporate paired-end and breakpoint junction analysis based CNV-analysis approaches, to infer locus copy-number genotypes. We benchmarked CopySeq by genotyping 500 chromosome 1 CNV regions in 150 personal genomes sequenced at low-coverage. The assessed copy-number genotypes were highly concordant with our performed qPCR experiments (Pearson correlation coefficient 0.94), and with the published results of two microarray platforms (95–99% concordance). We further demonstrated the utility of CopySeq for analyzing gene regions enriched for segmental duplications by comprehensively inferring copy-number genotypes in the CNV-enriched >800 olfactory receptor (OR) human gene and pseudogene loci. CopySeq revealed that OR loci display an extensive range of locus copy-numbers across individuals, with zero to two copies in some OR loci, and two to nine copies in others. Among genetic variants affecting OR loci we identified deleterious variants including CNVs and SNPs affecting ∼15% and ∼20% of the human OR gene repertoire, respectively, implying that genetic variants with a possible impact on smell perception are widespread. Finally, we found that for several OR loci the reference genome appears to represent a minor-frequency variant, implying a necessary revision of the OR repertoire for future functional studies. CopySeq can ascertain genomic structural variation in specific gene families as well as at a genome-wide scale, where it may enable the quantitative evaluation of CNVs in genome-wide association studies involving high-throughput sequencing.
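
    The basic idea behind depth-of-coverage genotyping can be sketched as a simple ratio: read depth at a locus, scaled by the depth of regions assumed to be diploid, approximates the copy number. This is not CopySeq's statistical framework, only a minimal heuristic; the function name, the diploid-control assumption, and the sample depths are hypothetical.

```python
from statistics import median


def estimate_copy_number(region_depth, control_depths, ploidy=2):
    """Return (raw estimate, integer genotype) for one locus.

    region_depth:   mean read depth over the locus of interest.
    control_depths: depths of regions assumed to carry `ploidy` copies.
    """
    baseline = median(control_depths)
    if baseline <= 0:
        raise ValueError("control depth must be positive")

    # Depth scales roughly linearly with the number of copies present.
    raw = ploidy * region_depth / baseline

    # Round half up to the nearest integer copy-number genotype.
    return raw, int(raw + 0.5)


if __name__ == "__main__":
    # Hypothetical depths: controls centre on 30x, the locus sits at 45x.
    controls = [28, 30, 31, 30, 29]
    print(estimate_copy_number(45, controls))
```

    At low coverage the raw ratio is noisy, which is one reason a method would model depth statistically and combine it with paired-end and breakpoint evidence rather than simply rounding.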

    The Timbre Perception Test (TPT): A new interactive musical assessment tool to measure timbre perception ability

    To date, tests that measure individual differences in the ability to perceive musical timbre are scarce in the published literature. The lack of such a tool limits research on how timbre, a primary attribute of sound, is perceived and processed among individuals. The current paper describes the development of the Timbre Perception Test (TPT), in which participants use a slider to reproduce heard auditory stimuli that vary along three important dimensions of timbre: envelope, spectral flux, and spectral centroid. With a sample of 95 participants, the TPT was calibrated and validated against measures of related abilities and examined for its reliability. The results indicate that a short version (8 minutes) of the TPT has good explanatory support from a factor analysis model, acceptable internal reliability (α = .69, ωt = .70), good test–retest reliability (r = .79) and substantial correlations with self-reported general musical sophistication (ρ = .63) and pitch discrimination (ρ = .56), as well as somewhat lower correlations with duration discrimination (ρ = .27) and musical instrument discrimination abilities (ρ = .33). Overall, the TPT represents a robust tool to measure an individual’s timbre perception ability. Furthermore, the use of sliders to perform a reproductive task has been shown to be an effective approach in threshold testing. The current version of the TPT is openly available for research purposes.
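
    The internal reliability figure α reported above is Cronbach's alpha, which can be computed from a participants-by-items score table. The sketch below uses the standard formula with sample variances; the function name and the scores are hypothetical and are not TPT data, and the ωt coefficient is not covered.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of participants' item scores.

    scores: one list per participant, each holding k item scores.
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two participants")

    # Sum of the variances of the individual items.
    item_variance = sum(variance(column) for column in zip(*scores))

    # Variance of each participant's total score.
    total_variance = variance([sum(row) for row in scores])

    return k / (k - 1) * (1 - item_variance / total_variance)


if __name__ == "__main__":
    # Hypothetical scores: 4 participants, 2 items.
    print(cronbach_alpha([[1, 2], [2, 1], [3, 3], [4, 4]]))
```

    Alpha rises as the items covary more strongly, reaching 1 when every item ranks the participants identically.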