198 research outputs found
All Fingers Are Not the Same: Handling Variable-Length Sequences in a Discriminative Setting Using Conformal Multi-Instance Kernels
Most string kernels for comparing genomic sequences are tied to the (absolute) positional information of the features in the individual sequences. This poses limitations when comparing variable-length sequences using such string kernels. For example, profiling chromatin interactions by 3C-based experiments results in variable-length genomic sequences (restriction fragments). Here, the exact position-wise occurrence of signals in sequences may not be as important as in the analysis of promoter sequences, which typically have a transcription start site as reference. Existing position-aware string kernels have been shown to be useful for the latter scenario.
In this work, we propose a novel approach for sequence comparison that enables larger positional freedom than most of the existing approaches, can identify a possibly dispersed set of features in comparing variable-length sequences, and can handle both the aforementioned scenarios. Our approach, CoMIK, identifies not just the features useful towards classification but also their locations in the variable-length sequences, as evidenced by the results of three binary classification experiments, aided by recently introduced visualization techniques. Furthermore, we show that we are able to efficiently retrieve and interpret the weight vector for the complex setting of multiple multi-instance kernels.
ESCAPED: Efficient Secure and Private Dot Product Framework for Kernel-based Machine Learning Algorithms with Applications in Healthcare
To train sophisticated machine learning models one usually needs many
training samples. Especially in healthcare settings these samples can be very
expensive, meaning that one institution alone usually does not have enough on
its own. Merging privacy-sensitive data from different sources is usually
restricted by data security and data protection measures. This can lead to
approaches that reduce data quality by putting noise onto the variables (e.g.,
in ε-differential privacy) or omitting certain values (e.g., for
k-anonymity). Other measures based on cryptographic methods can lead to very
time-consuming computations, which is especially problematic for larger
multi-omics data. We address this problem by introducing ESCAPED, which stands
for Efficient SeCure And PrivatE Dot product framework, enabling the
computation of the dot product of vectors from multiple sources on a
third-party, which later trains kernel-based machine learning algorithms, while
neither sacrificing privacy nor adding noise. We evaluated our framework on
drug resistance prediction for HIV-infected people and multi-omics
dimensionality reduction and clustering problems in precision medicine. In
terms of execution time, our framework significantly outperforms the
best-fitting existing approaches without sacrificing the performance of the
algorithm. Even though we only show the benefit for kernel-based algorithms,
our framework can open up new research opportunities for further machine
learning models that require the dot product of vectors from multiple sources.
Comment: AAAI 2021, preprint version of the full paper with supplementary
material
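The claim that a dot product is enough for kernel-based algorithms can be illustrated with a minimal sketch. This is not the ESCAPED protocol: the samples are pooled in plaintext here, and the function names and toy data are hypothetical. It only shows that once all pairwise dot products are known, both a linear and an RBF kernel matrix follow without further access to the raw vectors, using ||x - y||^2 = x.x + y.y - 2 x.y.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def gram_matrix(samples):
    """Matrix of all pairwise dot products (the linear kernel)."""
    return [[dot(x, y) for y in samples] for x in samples]


def rbf_from_gram(gram, gamma):
    """RBF kernel built from dot products alone.

    Uses ||x - y||^2 = x.x + y.y - 2 x.y, so the raw vectors
    are not needed once the Gram matrix is available.
    """
    n = len(gram)
    return [
        [
            math.exp(-gamma * (gram[i][i] + gram[j][j] - 2.0 * gram[i][j]))
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical toy data: two sources holding different samples (rows).
source_a = [[1.0, 0.0], [0.0, 2.0]]
source_b = [[1.0, 1.0]]

# Assumption: plaintext pooling stands in for the secure computation.
gram = gram_matrix(source_a + source_b)
rbf = rbf_from_gram(gram, gamma=0.5)
```

The cross-source entries, such as gram[0][2], are the values that a secure dot product framework would have to deliver without revealing the underlying vectors.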
Robust Representation Learning for Privacy-Preserving Machine Learning: A Multi-Objective Autoencoder Approach
Several domains increasingly rely on machine learning in their applications.
The resulting heavy dependence on data has led to the emergence of various laws
and regulations around data ethics and privacy and growing awareness of the
need for privacy-preserving machine learning (ppML). Current ppML techniques
utilize methods that are either purely based on cryptography, such as
homomorphic encryption, or that introduce noise into the input, such as
differential privacy. The main criticism of these techniques is that they are
either too slow or trade off a model's performance for
improved confidentiality. To address this performance reduction, we aim to
leverage robust representation learning as a way of encoding our data while
optimizing the privacy-utility trade-off. Our method centers on training
autoencoders in a multi-objective manner and then concatenating the latent and
learned features from the encoding part as the encoded form of our data. Such a
deep learning-powered encoding can then safely be sent to a third party for
intensive training and hyperparameter tuning. With our proposed framework, we
can share our data and use third party tools without being under the threat of
revealing its original form. We empirically validate our results on unimodal
and multimodal settings, the latter following a vertical splitting system, and
show improved performance over state-of-the-art
OpenMS - A Framework for Quantitative HPLC/MS-Based Proteomics
In the talk we describe the freely available software library OpenMS which is
currently under development at the Freie Universität Berlin and the
Eberhard Karls Universität Tübingen. We give an overview of the goals and
problems in differential proteomics with HPLC and then describe in detail the
implemented approaches for signal processing, peak detection and data
reduction currently employed in OpenMS. After this we describe methods to
identify the differential expression of peptides and propose strategies to avoid MS/MS identification of peptides of interest. We give an overview of the
capabilities and design principles of OpenMS and demonstrate its ease of use.
Finally we describe projects in which OpenMS will be or was already deployed
and thereby demonstrate its versatility
Which non-infection related risk factors are associated with impaired proximal femur fracture healing in patients under the age of 70 years?
BACKGROUND/PURPOSE
Impaired healing is a feared complication with devastating outcomes for each patient. Most studies focus on geriatric fracture fixation and assess well-known risk factors such as infections. However, risk factors other than infections for impaired healing of proximal femur fractures in non-geriatric adults are only marginally assessed. Therefore, this study aimed to identify non-infection related risk factors for impaired fracture healing of proximal femur fractures in non-geriatric trauma patients.
METHODS
This study included non-geriatric patients (aged 69 years and younger) who were treated between 2013 and 2020 at one academic Level 1 trauma center due to a proximal femur fracture (PFF). Patients were stratified according to AO/OTA classification. Delayed union was defined as failed callus formation on 3 out of 4 cortices after 3 to 6 months. Nonunion was defined as lack of callus formation after 6 months, material breakage, or requirement of revision surgery. Patient follow-up was 12 months.
RESULTS
This study included 150 patients. Delayed union was observed in 32 (21.3%) patients and nonunion with subsequent revision surgery occurred in 14 (9.3%). With an increasing fracture classification (31 A1 up to 31 A3 type fractures), there was a significantly higher rate of delayed union. Additionally, open reduction and internal fixation (ORIF) (OR 6.17, 95% CI 1.54 to 24.70, p ≤ 0.01) and diabetes mellitus type II (DM) (OR 5.74, 95% CI 1.39 to 23.72, p = 0.016) were independent risk factors for delayed union. The rate of nonunion was independent of fracture morphology, patient characteristics or comorbidities.
CONCLUSION
Increasing fracture complexity, ORIF and diabetes were found to be associated with delayed union of intertrochanteric femur fractures in non-geriatric patients. However, these factors were not associated with the development of nonunion
Trochanteric fracture pattern is associated with increased risk for nonunion independent of open or closed reduction technique
PURPOSE
Soft tissue injury as a risk factor for nonunion following trochanteric femur fractures (TFF) is only marginally investigated. The aim of this study was to identify risk factors for impaired fracture healing in geriatric trauma patients with TFF following surgical treatment with a femoral nail.
METHODS
This retrospective cohort study included geriatric trauma patients (aged > 70 years) with TFF who were treated with femoral nailing. Fractures were classified according to AO/OTA. Nonunion was defined as lack of callus-formation after 6 months, material breakage, and requirement of revision surgery. Risk factors for nonunion included variables of clinical interest (injury pattern, demographics, comorbidities), as well as type of approach (open versus closed) and were assessed with uni- and multivariate regression analyses.
RESULTS
This study included 225 geriatric trauma patients. Nonunion was significantly more frequent following AO/OTA 31A3 fractures (N = 10, 23.3%) compared with AO/OTA type 31A2 (N = 6, 6.9%) or AO/OTA 31A1 (N = 3, 3.2%, p < 0.001). Type 31A3 fractures had an increased risk for nonunion compared with type 31A1 (OR 10.3, 95% CI 2.2 to 48.9, p = 0.003). Open reduction was not associated with increased risk for nonunion (OR 0.9, 95% CI 0.1 to 6.1, p = 0.942), nor was the use of cerclage (OR 1.0, 95% CI 0.2 to 6.5, p = 0.995). Factors such as osteoporosis, polytrauma or diabetes were not associated with delayed union or nonunion.
CONCLUSION
The fracture morphology of TFF is an independent risk factor for nonunion in geriatric patients. The reduction technique is not associated with increased risk for nonunion, despite increased soft tissue damage following open reduction
A genotypic method for determining HIV-2 coreceptor usage enables epidemiological studies and clinical decision support
Background: CCR5-coreceptor antagonists can be used for treating HIV-2 infected individuals. Before initiating treatment with coreceptor antagonists, viral coreceptor usage should be determined to ensure that the virus can use only the CCR5 coreceptor (R5) and cannot evade the drug by using the CXCR4 coreceptor (X4-capable). However, until now, no online tool for the genotypic identification of HIV-2 coreceptor usage had been available. Furthermore, there is a lack of knowledge on the determinants of HIV-2 coreceptor usage. Therefore, we developed a data-driven web service for the prediction of HIV-2 coreceptor usage from the V3 loop of the HIV-2 glycoprotein and used the tool to identify novel discriminatory features of X4-capable variants. Results: Using 10 runs of tenfold cross validation, we selected a linear support vector machine (SVM) as the model for geno2pheno[coreceptor-hiv2], because it outperformed the other SVMs with an area under the ROC curve (AUC) of 0.95. We found that SVMs were highly accurate in identifying HIV-2 coreceptor usage, attaining sensitivities of 73.5% and specificities of 96% during tenfold nested cross validation. The predictive performance of SVMs was not significantly different (p value 0.37) from an existing rules-based approach. Moreover, geno2pheno[coreceptor-hiv2] achieved a predictive accuracy of 100% and outperformed the existing approach on an independent data set containing nine new isolates with corresponding phenotypic measurements of coreceptor usage. geno2pheno[coreceptor-hiv2] could not only reproduce the established markers of CXCR4-usage, but also revealed novel markers: the substitutions 27K, 15G, and 8S were significantly predictive of CXCR4 usage. Furthermore, SVMs trained on the amino-acid sequences of the V1 and V2 loops were also quite accurate in predicting coreceptor usage (AUCs of 0.84 and 0.65, respectively). 
Conclusions: In this study, we developed geno2pheno[coreceptor-hiv2], the first online tool for the prediction of HIV-2 coreceptor usage from the V3 loop. Using our method, we identified novel amino-acid markers of X4-capable variants in the V3 loop and found that HIV-2 coreceptor usage is also influenced by the V1/V2 region. The tool can aid clinicians in deciding whether coreceptor antagonists such as maraviroc are a treatment option and enables epidemiological studies investigating HIV-2 coreceptor usage. geno2pheno[coreceptor-hiv2] is freely available at http://coreceptor-hiv2.geno2pheno.org
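The sensitivity and specificity figures quoted above are the fractions of X4-capable and R5 samples, respectively, that are classified correctly. A minimal sketch of that calculation follows; the function name and the toy labels are hypothetical and are not data from the study (1 stands for X4-capable, 0 for R5).

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy labels, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
```

In a cross-validation setting these two values would be computed on each held-out fold and then averaged.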
EuCARE-POSTCOVID Study: a multicentre cohort study on long-term post-COVID-19 manifestations
BACKGROUND: Post-COVID-19 condition refers to persistent or new onset symptoms occurring three months after acute COVID-19, which are unrelated to alternative diagnoses. Symptoms include fatigue, breathlessness, palpitations, pain, concentration difficulties ("brain fog"), sleep disorders, and anxiety/depression. The prevalence of post-COVID-19 condition ranges widely across studies, affecting 10-20% of patients and reaching 50-60% in certain cohorts, while the associated risk factors remain poorly understood. METHODS: This multicentre cohort study, both retrospective and prospective, aims to assess the incidence and risk factors of post-COVID-19 condition in a cohort of recovered patients. Secondary objectives include evaluating the association between circulating SARS-CoV-2 variants and the risk of post-COVID-19 condition, as well as assessing long-term residual organ damage (lung, heart, central nervous system, peripheral nervous system) in relation to patient characteristics and virology (variant and viral load during the acute phase). Participants will include hospitalised and outpatient COVID-19 patients diagnosed between 01/03/2020 and 01/02/2025 from 8 participating centres. A control group will consist of hospitalised patients with respiratory infections other than COVID-19 during the same period. Patients will be followed up at the post-COVID-19 clinic of each centre at 2-3, 6-9, and 12-15 months after clinical recovery. Routine blood exams will be conducted, and patients will complete questionnaires to assess persisting symptoms, fatigue, dyspnoea, quality of life, disability, anxiety and depression, and post-traumatic stress disorders. DISCUSSION: This study aims to understand post-COVID-19 syndrome's incidence and predictors by comparing pandemic waves, utilising retrospective and prospective data. Gender association, especially the potential higher prevalence in females, will be investigated.
Symptom tracking via questionnaires and scales will monitor duration and evolution. Questionnaires will also collect data on vaccination, reinfections, and new health issues. Biological samples will enable future studies on post-COVID-19 sequelae mechanisms, including inflammation, immune dysregulation, and viral reservoirs. TRIAL REGISTRATION: This study has been registered with ClinicalTrials.gov under the identifier NCT05531773
OpenMS – An open-source software framework for mass spectrometry
Background: Mass spectrometry is an essential analytical technique for high-throughput analysis in proteomics and metabolomics. The development of new separation techniques, precise mass analyzers and experimental protocols is a very active field of research. This leads to more complex experimental setups yielding ever increasing amounts of data. Consequently, analysis of the data is currently often the bottleneck for experimental studies. Although software tools for many data analysis tasks are available today, they are often hard to combine with each other or not flexible enough to allow for rapid prototyping of a new analysis workflow.
Results: We present OpenMS, a software framework for rapid application development in mass spectrometry. OpenMS has been designed to be portable, easy-to-use and robust while offering a rich functionality ranging from basic data structures to sophisticated algorithms for data analysis. This has already been demonstrated in several studies.
Conclusion: OpenMS is available under the Lesser GNU Public License (LGPL) from the project website at http://www.openms.de.