8 research outputs found

    omniCLIP: probabilistic identification of protein-RNA interactions from CLIP-seq data

    CLIP-seq methods allow the generation of genome-wide maps of interaction sites between RNA-binding proteins and RNA. However, due to differences among CLIP-seq assays, existing computational approaches to analyze the data can only be applied to a subset of assays. Here, we present a probabilistic model called omniCLIP that can detect regulatory elements in RNAs from data of all CLIP-seq assays. omniCLIP jointly models data across replicates and can integrate background information. Therefore, omniCLIP greatly simplifies the data analysis, increases the reliability of results, and paves the way for integrative studies based on data from different assays.

    Deep learning for prediction of population health costs

    BACKGROUND: Accurate prediction of healthcare costs is important for optimally managing health costs. However, methods leveraging the medical richness of data such as health insurance claims or electronic health records are missing. METHODS: Here, we developed a deep neural network to predict future costs from health insurance claims records. We applied the deep network and a ridge regression model to a sample of 1.4 million insured individuals in Germany to predict total one-year health care costs. Both methods were compared to existing models using various performance measures and were also used to predict which patients would have a change in costs and to identify relevant codes for this prediction. RESULTS: We showed that the neural network outperformed the ridge regression as well as all other considered models for cost prediction. Further, the neural network was superior to ridge regression in predicting patients with a cost change and identified more specific codes. CONCLUSION: In summary, we showed that our deep neural network can leverage the full complexity of the patient records and outperforms standard approaches. We suggest that the better performance is due to the ability to incorporate complex interactions in the model and that the model might also be used for predicting other health phenotypes.

    Alternative splicing substantially diversifies the transcriptome during early photomorphogenesis and correlates with the energy availability in Arabidopsis

    Plants use light as a source of energy and information to detect diurnal rhythms and seasonal changes. Sensing changing light conditions is critical to adjust plant metabolism and to initiate developmental transitions. Here we analyzed transcriptome-wide alterations in gene expression and alternative splicing (AS) of etiolated seedlings undergoing photomorphogenesis upon exposure to blue, red, or white light. Our analysis revealed massive transcriptome reprogramming, as reflected by differential expression of ~20% of all genes and changes in several hundred AS events. For more than 60% of all regulated AS events, light promoted the production of a presumably protein-coding variant at the expense of an mRNA with nonsense-mediated decay-triggering features. Accordingly, AS of the putative splicing factor REDUCED RED-LIGHT RESPONSES IN CRY1CRY2 BACKGROUND 1 (RRC1), previously identified as a red light signaling component, was shifted to the functional variant under light. Downstream analyses of candidate AS events pointed to a role of photoreceptor signaling only in monochromatic but not in white light. Furthermore, we demonstrated similar AS changes upon light exposure and exogenous sugar supply, with a critical involvement of kinase signaling. We propose that AS is an integration point of signaling pathways that sense and transmit information regarding the energy availability in plants.

    Using gradient boosting with stability selection on health insurance claims data to identify disease trajectories in chronic obstructive pulmonary disease

    OBJECTIVE: We propose a data-driven method to detect temporal patterns of disease progression in high-dimensional claims data based on gradient boosting with stability selection. MATERIALS AND METHODS: We identified patients with chronic obstructive pulmonary disease in a German health insurance claims database with 6.5 million individuals and divided them into a group of patients with the highest disease severity and a group of control patients with lower severity. We then used gradient boosting with stability selection to determine variables correlating with a chronic obstructive pulmonary disease diagnosis of highest severity and subsequently modeled the temporal progression of the disease using the selected variables. RESULTS: We identified a network of 20 diagnoses (e.g., respiratory failure), medications (e.g., anticholinergic drugs), and procedures associated with a subsequent chronic obstructive pulmonary disease diagnosis of highest severity. Furthermore, the network successfully captured temporal patterns, such as disease progressions from lower to higher severity grades. DISCUSSION: The temporal trajectories identified by our data-driven approach are compatible with existing knowledge about chronic obstructive pulmonary disease, showing that the method can reliably select relevant variables in a high-dimensional context. CONCLUSION: We provide a generalizable approach for the automatic detection of disease trajectories in claims data. This could help to diagnose diseases early, identify unknown risk factors, and optimize treatment plans.

    Expanding the map of protein-RNA interaction sites via cell fusion followed by PAR-CLIP

    PAR-CLIP (photoactivatable ribonucleoside-enhanced crosslinking and immunoprecipitation) facilitates the identification and mapping of protein-RNA interactions. So far, it has been limited to select cell lines, as it requires efficient 4SU uptake. To increase transcriptome complexity and thus identify additional RNA-protein interaction sites, we fused HEK 293 T-Rex cells (HEK293-Y) that express the RNA-binding protein YBX1 with PC12 cells expressing eGFP (PC12-eGFP). The resulting hybrids enable PAR-CLIP on a neuronally expanded transcriptome (Fusion-CLIP) and serve as a proof of principle. The fusion cells express both parental marker genes, YBX1 and eGFP, and the expanded transcriptome contains human and rat transcripts. PAR-CLIP of fused cells versus the parental HEK293-Y identified 768 novel RNA targets of YBX1. We were able to trace the origin of the majority of the short PAR-CLIP reads, as they differentially mapped to the human and rat genomes. Furthermore, Fusion-CLIP expanded the CAUC RNA binding motif of YBX1 to UCUUUNNCAUC. The fusion of HEK293-Y and PC12-eGFP cells resulted in cells with a diverse genome expressing human and rat transcripts that enabled the identification of novel YBX1 substrates. The technique allows the expansion of the HEK 293 transcriptome and makes PAR-CLIP available to fusion cells of diverse origin.

    SLM2 is a novel cardiac splicing factor involved in heart failure due to dilated cardiomyopathy

    Alternative mRNA splicing is a fundamental process to increase the versatility of the genome. In humans, cardiac mRNA splicing is involved in the pathophysiology of heart failure. Mutations in the splicing factor RNA binding motif protein 20 (RBM20) cause severe forms of cardiomyopathy. To identify novel cardiomyopathy-associated splicing factors, RNA-seq and tissue-enrichment analysis were performed, which identified upregulation of Sam68-Like Mammalian Protein 2 (SLM2) in the left ventricle of dilated cardiomyopathy (DCM) patients. In the human heart, SLM2 binds to important transcripts of sarcomere constituents, such as myosin light chain 2 (MYL2), troponin I3 (TNNI3), troponin T2 (TNNT2), tropomyosin 1/2 (TPM1/2), and titin (TTN). Mechanistically, SLM2 mediates intron retention, prevents exon exclusion, and thereby mediates alternative splicing of the mRNA regions encoding the variable proline-, glutamate-, valine-, and lysine-rich (PEVK) domain and another part of the I-band region of titin. In summary, SLM2 is a novel cardiac splicing regulator with essential functions for maintaining cardiomyocyte integrity by binding and processing the mRNA of essential cardiac constituents such as titin.

    Mapping the Various Meanings of Social Innovation: Towards a Differentiated Understanding of an Emerging Concept
