194 research outputs found

    All Fingers Are Not the Same: Handling Variable-Length Sequences in a Discriminative Setting Using Conformal Multi-Instance Kernels

    Most string kernels for comparing genomic sequences are tied to the (absolute) positional information of the features in the individual sequences. This poses limitations when comparing variable-length sequences using such string kernels. For example, profiling chromatin interactions by 3C-based experiments results in variable-length genomic sequences (restriction fragments). Here, the exact position-wise occurrence of signals in sequences may not be as important as in the analysis of promoter sequences, which typically have a transcription start site as reference. Existing position-aware string kernels have been shown to be useful for the latter scenario. In this work, we propose a novel approach for sequence comparison that enables larger positional freedom than most existing approaches, can identify a possibly dispersed set of features when comparing variable-length sequences, and can handle both of the aforementioned scenarios. Our approach, CoMIK, identifies not just the features useful for classification but also their locations in the variable-length sequences, as evidenced by the results of three binary classification experiments, aided by recently introduced visualization techniques. Furthermore, we show that we are able to efficiently retrieve and interpret the weight vector for the complex setting of multiple multi-instance kernels.

    ESCAPED: Efficient Secure and Private Dot Product Framework for Kernel-based Machine Learning Algorithms with Applications in Healthcare

    To train sophisticated machine learning models, one usually needs many training samples. Especially in healthcare settings, these samples can be very expensive, meaning that one institution alone usually does not have enough. Merging privacy-sensitive data from different sources is usually restricted by data security and data protection measures. This can lead to approaches that reduce data quality by adding noise to the variables (e.g., in ϵ-differential privacy) or omitting certain values (e.g., for k-anonymity). Other measures based on cryptographic methods can lead to very time-consuming computations, which is especially problematic for larger multi-omics data. We address this problem by introducing ESCAPED, which stands for Efficient SeCure And PrivatE Dot product framework, enabling the computation of the dot product of vectors from multiple sources on a third party, which later trains kernel-based machine learning algorithms, while neither sacrificing privacy nor adding noise. We evaluated our framework on drug resistance prediction for HIV-infected people and on multi-omics dimensionality reduction and clustering problems in precision medicine. In terms of execution time, our framework significantly outperforms the best-fitting existing approaches without sacrificing the performance of the algorithm. Even though we only show the benefit for kernel-based algorithms, our framework can open up new research opportunities for further machine learning models that require the dot product of vectors from multiple sources. Comment: AAAI 2021, preprint version of the full paper with supplementary material
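    The abstract's point that a third party only needs dot products to train kernel-based models can be illustrated with a small sketch. This is not the ESCAPED protocol itself (no secure computation is shown); it only demonstrates, under the assumption of plain unencrypted toy vectors, that an RBF kernel can be built from a Gram matrix of dot products alone. The function names and the sample data are hypothetical.

```python
import math

def gram_matrix(vectors):
    """Pairwise dot products: G[i][j] = <x_i, x_j>."""
    return [[sum(a * b for a, b in zip(x, y)) for y in vectors] for x in vectors]

def rbf_from_gram(gram, gamma):
    """RBF kernel computed from dot products only, using
    ||x_i - x_j||^2 = G[i][i] + G[j][j] - 2 * G[i][j]."""
    n = len(gram)
    return [[math.exp(-gamma * (gram[i][i] + gram[j][j] - 2.0 * gram[i][j]))
             for j in range(n)] for i in range(n)]

def rbf_direct(vectors, gamma):
    """Reference RBF kernel computed from the raw vectors, for comparison."""
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))
             for y in vectors] for x in vectors]

# Toy data (hypothetical): each row could belong to a different data holder.
samples = [[1.0, 0.0, 2.0], [0.5, 1.5, 0.0], [2.0, 1.0, 1.0]]
G = gram_matrix(samples)          # all the third party would need to receive
K = rbf_from_gram(G, gamma=0.1)   # kernel matrix built without the raw vectors
```

    Comparing `K` against `rbf_direct(samples, 0.1)` shows the two agree up to floating-point error, which is why a framework that delivers only the dot products is sufficient for kernel methods such as SVMs.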

    Robust Representation Learning for Privacy-Preserving Machine Learning: A Multi-Objective Autoencoder Approach

    Several domains increasingly rely on machine learning in their applications. The resulting heavy dependence on data has led to the emergence of various laws and regulations around data ethics and privacy and to growing awareness of the need for privacy-preserving machine learning (ppML). Current ppML techniques utilize methods that are either purely based on cryptography, such as homomorphic encryption, or that introduce noise into the input, such as differential privacy. The main criticism of those techniques is that they are either too slow or trade off a model's performance for improved confidentiality. To address this performance reduction, we aim to leverage robust representation learning as a way of encoding our data while optimizing the privacy-utility trade-off. Our method centers on training autoencoders in a multi-objective manner and then concatenating the latent and learned features from the encoding part as the encoded form of our data. Such a deep learning-powered encoding can then safely be sent to a third party for intensive training and hyperparameter tuning. With our proposed framework, we can share our data and use third-party tools without being under the threat of revealing its original form. We empirically validate our results on unimodal and multimodal settings, the latter following a vertical splitting system, and show improved performance over state-of-the-art

    OpenMS - A Framework for Quantitative HPLC/MS-Based Proteomics

    In the talk, we describe the freely available software library OpenMS, which is currently under development at the Freie Universität Berlin and the Eberhard Karls Universität Tübingen. We give an overview of the goals and problems in differential proteomics with HPLC and then describe in detail the approaches for signal processing, peak detection and data reduction currently implemented in OpenMS. After this, we describe methods to identify the differential expression of peptides and propose strategies to avoid MS/MS identification of peptides of interest. We give an overview of the capabilities and design principles of OpenMS and demonstrate its ease of use. Finally, we describe projects in which OpenMS will be or has already been deployed and thereby demonstrate its versatility.

    Which non-infection related risk factors are associated with impaired proximal femur fracture healing in patients under the age of 70 years?

    BACKGROUND/PURPOSE: Impaired healing is a feared complication with devastating outcomes for each patient. Most studies focus on geriatric fracture fixation and assess well-known risk factors such as infections. However, risk factors other than infections for impaired healing of proximal femur fractures in non-geriatric adults are only marginally assessed. Therefore, this study aimed to identify non-infection related risk factors for impaired fracture healing of proximal femur fractures in non-geriatric trauma patients. METHODS: This study included non-geriatric patients (aged 69 years and younger) who were treated between 2013 and 2020 at one academic Level 1 trauma center for a proximal femur fracture (PFF). Patients were stratified according to AO/OTA classification. Delayed union was defined as failed callus formation on 3 out of 4 cortices after 3 to 6 months. Nonunion was defined as lack of callus formation after 6 months, material breakage, or requirement of revision surgery. Patient follow-up was 12 months. RESULTS: This study included 150 patients. Delayed union was observed in 32 (21.3%) patients and nonunion with subsequent revision surgery occurred in 14 (9.3%). With an increasing fracture classification (31 A1 up to 31 A3 type fractures), there was a significantly higher rate of delayed union. Additionally, open reduction and internal fixation (ORIF) (OR 6.17, 95% CI 1.54 to 24.70, p ≤ 0.01) and diabetes mellitus type II (DM) (OR 5.74, 95% CI 1.39 to 23.72, p = 0.016) were independent risk factors for delayed union. The rate of nonunion was independent of fracture morphology, patient characteristics or comorbidities. CONCLUSION: Increasing fracture complexity, ORIF and diabetes were found to be associated with delayed union of intertrochanteric femur fractures in non-geriatric patients. However, these factors were not associated with the development of nonunion.

    Trochanteric fracture pattern is associated with increased risk for nonunion independent of open or closed reduction technique

    PURPOSE: Soft tissue injury as a risk factor for nonunion following trochanteric femur fractures (TFF) is only marginally investigated. The aim of this study was to identify risk factors for impaired fracture healing in geriatric trauma patients with TFF following surgical treatment with a femoral nail. METHODS: This retrospective cohort study included geriatric trauma patients (aged > 70 years) with TFF who were treated with femoral nailing. Fractures were classified according to AO/OTA. Nonunion was defined as lack of callus formation after 6 months, material breakage, and requirement of revision surgery. Risk factors for nonunion included variables of clinical interest (injury pattern, demographics, comorbidities), as well as type of approach (open versus closed), and were assessed with uni- and multivariate regression analyses. RESULTS: This study included 225 geriatric trauma patients. Nonunion was significantly more frequent following AO/OTA 31A3 fractures (N = 10, 23.3%) compared with AO/OTA type 31A2 (N = 6, 6.9%) or AO/OTA 31A1 (N = 3, 3.2%; p < 0.001). Type 31A3 fractures had an increased risk for nonunion compared with type 31A1 (OR 10.3, 95% CI 2.2 to 48.9, p = 0.003). Open reduction was not associated with increased risk for nonunion (OR 0.9, 95% CI 0.1 to 6.1, p = 0.942), nor was the use of cerclage (OR 1.0, 95% CI 0.2 to 6.5, p = 0.995). Factors such as osteoporosis, polytrauma or diabetes were not associated with delayed union or nonunion. CONCLUSION: The fracture morphology of TFF is an independent risk factor for nonunion in geriatric patients. The reduction technique is not associated with increased risk for nonunion, despite increased soft tissue damage following open reduction.

    A genotypic method for determining HIV-2 coreceptor usage enables epidemiological studies and clinical decision support

    Background: CCR5-coreceptor antagonists can be used for treating HIV-2-infected individuals. Before initiating treatment with coreceptor antagonists, viral coreceptor usage should be determined to ensure that the virus can use only the CCR5 coreceptor (R5) and cannot evade the drug by using the CXCR4 coreceptor (X4-capable). However, until now, no online tool for the genotypic identification of HIV-2 coreceptor usage had been available. Furthermore, there is a lack of knowledge on the determinants of HIV-2 coreceptor usage. Therefore, we developed a data-driven web service for the prediction of HIV-2 coreceptor usage from the V3 loop of the HIV-2 glycoprotein and used the tool to identify novel discriminatory features of X4-capable variants. Results: Using 10 runs of tenfold cross validation, we selected a linear support vector machine (SVM) as the model for geno2pheno[coreceptor-hiv2], because it outperformed the other SVMs with an area under the ROC curve (AUC) of 0.95. We found that SVMs were highly accurate in identifying HIV-2 coreceptor usage, attaining sensitivities of 73.5% and specificities of 96% during tenfold nested cross validation. The predictive performance of SVMs was not significantly different (p = 0.37) from an existing rules-based approach. Moreover, geno2pheno[coreceptor-hiv2] achieved a predictive accuracy of 100% and outperformed the existing approach on an independent data set containing nine new isolates with corresponding phenotypic measurements of coreceptor usage. geno2pheno[coreceptor-hiv2] could not only reproduce the established markers of CXCR4 usage, but also revealed novel markers: the substitutions 27K, 15G, and 8S were significantly predictive of CXCR4 usage. Furthermore, SVMs trained on the amino-acid sequences of the V1 and V2 loops were also quite accurate in predicting coreceptor usage (AUCs of 0.84 and 0.65, respectively).
    Conclusions: In this study, we developed geno2pheno[coreceptor-hiv2], the first online tool for the prediction of HIV-2 coreceptor usage from the V3 loop. Using our method, we identified novel amino-acid markers of X4-capable variants in the V3 loop and found that HIV-2 coreceptor usage is also influenced by the V1/V2 region. The tool can aid clinicians in deciding whether coreceptor antagonists such as maraviroc are a treatment option and enables epidemiological studies investigating HIV-2 coreceptor usage. geno2pheno[coreceptor-hiv2] is freely available at http://coreceptor-hiv2.geno2pheno.org

    EuCARE-POSTCOVID Study: a multicentre cohort study on long-term post-COVID-19 manifestations

    BACKGROUND: Post-COVID-19 condition refers to persistent or new onset symptoms occurring three months after acute COVID-19, which are unrelated to alternative diagnoses. Symptoms include fatigue, breathlessness, palpitations, pain, concentration difficulties ("brain fog"), sleep disorders, and anxiety/depression. The prevalence of post-COVID-19 condition ranges widely across studies, affecting 10-20% of patients and reaching 50-60% in certain cohorts, while the associated risk factors remain poorly understood. METHODS: This multicentre cohort study, both retrospective and prospective, aims to assess the incidence and risk factors of post-COVID-19 condition in a cohort of recovered patients. Secondary objectives include evaluating the association between circulating SARS-CoV-2 variants and the risk of post-COVID-19 condition, as well as assessing long-term residual organ damage (lung, heart, central nervous system, peripheral nervous system) in relation to patient characteristics and virology (variant and viral load during the acute phase). Participants will include hospitalised and outpatient COVID-19 patients diagnosed between 01/03/2020 and 01/02/2025 from 8 participating centres. A control group will consist of hospitalised patients with respiratory infections other than COVID-19 during the same period. Patients will be followed up at the post-COVID-19 clinic of each centre at 2-3, 6-9, and 12-15 months after clinical recovery. Routine blood exams will be conducted, and patients will complete questionnaires to assess persisting symptoms, fatigue, dyspnoea, quality of life, disability, anxiety and depression, and post-traumatic stress disorders. DISCUSSION: This study aims to understand post-COVID-19 syndrome's incidence and predictors by comparing pandemic waves, utilising retrospective and prospective data. Gender association, especially the potential higher prevalence in females, will be investigated. Symptom tracking via questionnaires and scales will monitor duration and evolution. Questionnaires will also collect data on vaccination, reinfections, and new health issues. Biological samples will enable future studies on post-COVID-19 sequelae mechanisms, including inflammation, immune dysregulation, and viral reservoirs. TRIAL REGISTRATION: This study has been registered with ClinicalTrials.gov under the identifier NCT05531773

    OpenMS – An open-source software framework for mass spectrometry

    Background: Mass spectrometry is an essential analytical technique for high-throughput analysis in proteomics and metabolomics. The development of new separation techniques, precise mass analyzers and experimental protocols is a very active field of research. This leads to more complex experimental setups yielding ever increasing amounts of data. Consequently, analysis of the data is currently often the bottleneck for experimental studies. Although software tools for many data analysis tasks are available today, they are often hard to combine with each other or not flexible enough to allow for rapid prototyping of a new analysis workflow. Results: We present OpenMS, a software framework for rapid application development in mass spectrometry. OpenMS has been designed to be portable, easy-to-use and robust while offering a rich functionality ranging from basic data structures to sophisticated algorithms for data analysis. This has already been demonstrated in several studies. Conclusion: OpenMS is available under the Lesser GNU Public License (LGPL) from the project website at http://www.openms.de