3,929 research outputs found

Application of the Multi-modal Relevance Vector Machine to the Problem of Protein Secondary Structure Prediction

Author: B. Rost
C. Branden
D. Engel
J. Ward
L. Wang
M. Dayhoff
P. Aloy
P. Yoo
R. Duin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

The aim of the paper is to experimentally examine the plausibility of Relevance Vector Machines (RVM) for protein secondary structure prediction. We restrict our attention to detecting strands which represent an especially problematic element of the secondary structure. The commonly adopted local principle of secondary structure prediction is applied, which implies comparison of a sliding window in the given polypeptide chain with a number of reference amino-acid sequences cut out of the training proteins as benchmarks representing the classes of secondary structure. As distinct from the classical RVM, the novel version applied in this paper allows for selective combination of several tentative window comparison modalities. Experiments on the RS126 data set have shown its ability to essentially decrease the number of reference fragments in the resulting decision rule and to select a subset of the most appropriate comparison modalities within the given set of the tentative ones. © 2012 Springer-Verlag
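
The sliding-window comparison described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the window width, the reference fragments, and the identity-based similarity below are hypothetical placeholders for the paper's comparison modalities, and the RVM training step is omitted.

```python
def sliding_windows(sequence, width):
    """Yield (center_index, window) for every full-width window in the chain."""
    if width < 1 or width % 2 == 0:
        raise ValueError("width must be a positive odd number")
    half = width // 2
    for center in range(half, len(sequence) - half):
        yield center, sequence[center - half:center + half + 1]


def identity_similarity(a, b):
    """Toy comparison modality (an assumption): fraction of identical residues."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def window_features(sequence, references, width):
    """Map each window center to its similarities against the reference fragments."""
    return {
        center: [identity_similarity(window, ref) for ref in references]
        for center, window in sliding_windows(sequence, width)
    }


# Hypothetical chain and reference fragments, chosen only for illustration.
features = window_features("MKVLAAGIV", ["KVL", "AAG"], width=3)
```

Each residue position ends up with one similarity value per reference fragment; a sparse classifier such as an RVM would then keep only the fragments (and modalities) that carry weight in the decision rule.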

Crossref

Surrey Research Insight

The promises of large language models for protein design and modeling.

Author: Cabri Alberto
Casiraghi Elena
Gliozzo Jessica
Malchiodi Dario
Mesiti Marco
Reese Justin
Robinson Peter N
Soto-Gomez Mauricio
Valentini Giorgio
Publication venue: The Mouseion at the JAXlibrary
Publication date: 01/01/2023

The recent breakthroughs of Large Language Models (LLMs) in the context of natural language processing have opened the way to significant advances in protein research. Indeed, the relationships between human natural language and the language of proteins invite the application and adaptation of LLMs to protein modelling and design. Considering the impressive results of GPT-4 and other recently developed LLMs in processing, generating and translating human languages, we anticipate analogous results with the language of proteins. Indeed, protein language models have already been trained to accurately predict protein properties and to generate novel functionally characterized proteins, achieving state-of-the-art results. In this paper we discuss the promises and the open challenges raised by this novel and exciting research area, and we propose our perspective on how LLMs will affect protein modeling and design

The Jackson Laboratory: The Mouseion at the JAXlibrary

Exploration of machine learning approaches with genome-scale metabolic model-generated fluxes

Author: Magazzu Giuseppe
Publication date: 01/01/2023

Teeside University's Research Repository

End-to-end learning framework for circular RNA classification from other long non-coding RNAs using multi-modal deep learning.

Author: Chaabane Mohamed
Publication venue: ThinkIR: The University of Louisville's Institutional Repository
Publication date: 01/05/2018

Over the past two decades, a circular form of RNA (circular RNA) produced from splicing mechanism has become the focus of scientific studies due to its major role as a microRNA (miR) activity modulator and its association with various diseases including cancer. Therefore, the detection of circular RNAs is a vital operation for continued comprehension of their biogenesis and purpose. Prediction of circular RNA can be achieved by first distinguishing non-coding RNAs from protein coding gene transcripts, separating short and long non-coding RNAs (lncRNAs), and finally predicting circular RNAs from other lncRNAs. However, available tools to distinguish circular RNAs from other lncRNAs have only reached 80% accuracy due to the difficulty of classifying circular RNAs from other lncRNAs. Therefore, the availability of a faster, more accurate machine learning method for the identification of circular RNAs, which will take into account the specific features of circular RNA, is essential in the development of systematic annotation. Here we present an End-to-End multimodal deep learning framework, our tool, to classify circular RNA from other lncRNA. It fuses an RCM descriptor, an ACNN-BLSTM sequence descriptor, and a conservation descriptor into high level abstraction descriptors, where the shared representations across different modalities are integrated. The experiments show that our tool is not only faster compared to existing tools but also eclipses other tools by an over 12% increase in accuracy. Another interesting result found from analysis of an ACNN-BLSTM sequence descriptor is that circular RNA sequences share the characteristics of the coding sequence

University of Louisville

Imaging biomarkers extraction and classification for Prion disease

Author: dos Santos Canas Liane
Publication venue: UCL (University College London)
Publication date: 28/04/2020

Prion diseases are a group of rare neurodegenerative conditions characterised by a high rate of progression and highly heterogeneous phenotypes. Whilst the most common form of prion disease occurs sporadically (sporadic Creutzfeldt-Jakob disease, sCJD), other forms are caused by inheritance of prion protein gene mutations or exposure to prions. To date, there are no accurate imaging biomarkers that can be used to predict the future diagnosis of a subject or to quantify the progression of symptoms over time. Besides, CJD is commonly mistaken for other forms of dementia. Due to the large heterogeneity of phenotypes of prion disease and the lack of a consistent spatial pattern of disease progression, the approaches used to study other types of neurodegenerative diseases are not satisfactory to capture the progression of the human form of prion disease. Using a tailored framework, I extracted quantitative imaging biomarkers for characterisation of patients with Prion diseases. Following the extraction of patient-specific imaging biomarkers from multiple images, I implemented a Gaussian Process approach to correlate symptoms with disease types and stages. The model was used on three different tasks: diagnosis, differential diagnosis and stratification, addressing an unmet need to automatically identify patients with or at risk of developing Prion disease. The work presented in this thesis has been extensively validated in a unique Prion disease cohort, comprising both the inherited and sporadic forms of the disease. The model has been shown to be effective in the prediction of this illness. Furthermore, this approach may be used in other disorders with heterogeneous imaging features, adding value to the understanding of neurodegenerative diseases. Lastly, given the rarity of this disease, I also addressed the issue of missing data and the limitations raised by it.
Overall, this work presents progress towards the modelling of Prion diseases and identifies computational methodologies that are potentially suitable for their characterisation

UCL Discovery

Multimodal regularised linear models with flux balance analysis for mechanistic integration of omics data

Author: Angione Claudio
Magazzu Giuseppe
Zampieri Guido
Publication date: 01/01/2021

Motivation: High-throughput biological data, thanks to technological advances, have become cheaper to collect, leading to the availability of vast amounts of omic data of different types. In parallel, the in silico reconstruction and modeling of metabolic systems is now acknowledged as a key tool to complement experimental data on a large scale. The integration of this model- and data-driven information is therefore emerging as a new challenge in systems biology, with no clear guidance on how to better take advantage of the inherent multisource and multiomic nature of these data types while preserving mechanistic interpretation. Results: Here, we investigate different regularization techniques for high-dimensional data derived from the integration of gene expression profiles with metabolic flux data, extracted from strain-specific metabolic models, to improve cellular growth rate predictions. To this end, we propose ad-hoc extensions of previous regularization frameworks including group, view-specific and principal component regularization and experimentally compare them using data from 1143 Saccharomyces cerevisiae strains. We observe a divergence between methods in terms of regression accuracy and integration effectiveness based on the type of regularization employed. In multiomic regression tasks, when learning from experimental and model-generated omic data, our results demonstrate the competitiveness and ease of interpretation of multimodal regularized linear models compared to data-hungry methods based on neural networks. Availability and implementation: All data, models and code produced in this work are available on GitHub at https://github.com/Angione-Lab/HybridGroupIPFLasso_pc2Lasso. Supplementary information: Supplementary data are available at Bioinformatics online

Crossref

Teeside University's Research Repository

Archivio istituzionale della ricerca - Università di Padova

On Improving Generalization of CNN-Based Image Classification with Delineation Maps Using the CORF Push-Pull Inhibition Operator

Author: Antonisse Joey
Azzopardi George
Bennabhaktula Swaroop
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 31/10/2021

Deployed image classification pipelines are typically dependent on the images captured in real-world environments. This means that images might be affected by different sources of perturbations (e.g. sensor noise in low-light environments). The main challenge arises from the fact that image quality directly impacts the reliability and consistency of classification tasks. This challenge has, hence, attracted wide interest within the computer vision communities. We propose a transformation step that attempts to enhance the generalization ability of CNN models in the presence of unseen noise in the test set. Concretely, the delineation maps of given images are determined using the CORF push-pull inhibition operator. Such an operation transforms an input image into a space that is more robust to noise before being processed by a CNN. We evaluated our approach on the Fashion MNIST data set with an AlexNet model. It turned out that the proposed CORF-augmented pipeline achieved comparable results on noise-free images to those of a conventional AlexNet classification model without CORF delineation maps, but it consistently achieved significantly superior performance on test images perturbed with different levels of Gaussian and uniform noise
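
The evaluation setup mentioned at the end of this abstract, test images perturbed with Gaussian and uniform noise, might look roughly like the sketch below. It covers only the perturbation step, not the CORF operator or the CNN, and the noise levels, the [0, 1] pixel range, and the helper names are assumptions rather than details taken from the paper.

```python
import random


def perturb(pixels, noise, rng):
    """Add one noise sample per pixel and clip the result back into [0, 1]."""
    return [min(1.0, max(0.0, p + noise(rng))) for p in pixels]


def gaussian(sigma):
    """Zero-mean Gaussian noise source with standard deviation sigma."""
    return lambda rng: rng.gauss(0.0, sigma)


def uniform(amplitude):
    """Uniform noise source on [-amplitude, amplitude]."""
    return lambda rng: rng.uniform(-amplitude, amplitude)


rng = random.Random(0)  # fixed seed so the perturbed test set is reproducible
# Stand-in for one flattened 28x28 grayscale image (hypothetical data).
clean = [rng.random() for _ in range(28 * 28)]
noisy_sets = {
    ("gaussian", 0.1): perturb(clean, gaussian(0.1), rng),
    ("uniform", 0.1): perturb(clean, uniform(0.1), rng),
}
```

A classifier trained on clean images would then be scored on each entry of `noisy_sets`, so that accuracy can be compared across noise types and levels.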

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Developments in the tools and methodologies of synthetic biology.

Author: Freemont P
Kelwick R
MacDonald JT
Webb AJ
Publication venue: 'Frontiers Media SA'
Publication date: 26/11/2014

Synthetic biology is principally concerned with the rational design and engineering of biologically based parts, devices, or systems. However, biological systems are generally complex and unpredictable, and are therefore intrinsically difficult to engineer. In order to address these fundamental challenges, synthetic biology is aiming to unify a body of knowledge from several foundational scientific fields, within the context of a set of engineering principles. This shift in perspective is enabling synthetic biologists to address complexity, such that robust biological systems can be designed, assembled, and tested as part of a biological design cycle. The design cycle takes a forward-design approach in which a biological system is specified, modeled, analyzed, assembled, and its functionality tested. At each stage of the design cycle, an expanding repertoire of tools is being developed. In this review, we highlight several of these tools in terms of their applications and benefits to the synthetic biology community

PubMed Central

Spiral - Imperial College Digital Repository