
2,029 research outputs found

NN approach and its comparison with NN-SVM to beta-barrel prediction

Author: Grimaldi Cedric Maxime
Kazemian Hassan
White Kenneth
Yusuf Syed A.
Publication venue: 'Elsevier BV'
Publication date: 01/11/2016

This paper is concerned with applications of a dual Neural Network (NN) and Support Vector Machine (SVM) to prediction and analysis of beta barrel trans membrane proteins. The prediction and analysis of beta barrel proteins usually offer a host of challenges to the research community, because of their low presence in genomes. Current beta barrel prediction methodologies present intermittent misclassifications resulting in mismatch in the number of membrane spanning regions within amino-acid sequences. To address the problem, this research embarks upon a NN technique and its comparison with a hybrid two-level NN-SVM methodology to classify inter-class and intra-class transitions to predict the number and range of beta membrane spanning regions. The methodology utilizes a sliding-window-based feature extraction to train two different class transitions entitled symmetric and asymmetric models. In symmetric modelling, the NN and SVM frameworks train for sliding window over the same intra-class areas such as inner-to-inner, membrane (beta)-to-membrane and outer-to-outer. In contrast, the asymmetric transition trains a NN-SVM classifier for inter-class transition such as outer-to-membrane (beta) and membrane (beta)-to-inner, inner-to-membrane and membrane-to-outer. For the NN and NN-SVM to generate robust outcomes, the prediction methodologies are analysed by jack-knife tests and single protein tests. The computer simulation results demonstrate a significant impact and a superior performance of NN-SVM tests with a 5 residue overlap for signal protein over NN with and without redundant proteins for prediction of trans membrane beta barrel spanning regions.
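The symmetric/asymmetric distinction in the abstract above can be illustrated with a minimal sketch. It is not the authors' feature extraction: the label alphabet (`i` = inner, `M` = membrane beta strand, `o` = outer), the function name `sliding_windows`, the window width and the toy sequence are all assumptions made for illustration.

```python
from typing import List, Tuple


def sliding_windows(sequence: str, labels: str, width: int = 5) -> List[Tuple[str, str, str]]:
    """Return (residues, labels, kind) for every window position.

    kind is "symmetric" when all residues in the window share one topology
    label (intra-class, e.g. inner-to-inner) and "asymmetric" when the window
    crosses a class boundary (inter-class, e.g. outer-to-membrane).
    """
    if len(sequence) != len(labels):
        raise ValueError("sequence and labels must have the same length")
    windows = []
    for start in range(len(sequence) - width + 1):
        residues = sequence[start:start + width]
        window_labels = labels[start:start + width]
        kind = "symmetric" if len(set(window_labels)) == 1 else "asymmetric"
        windows.append((residues, window_labels, kind))
    return windows


# Toy example (hypothetical sequence and topology labels).
toy_sequence = "MKAVLGTWYA"
toy_labels = "iiiMMMMooo"
for residues, window_labels, kind in sliding_windows(toy_sequence, toy_labels, width=3):
    print(residues, window_labels, kind)
```

In a pipeline of the kind the abstract describes, the symmetric and asymmetric windows would presumably be routed to separate training sets; how the residues are encoded numerically is not stated in the abstract and is left out here.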

London Met Repository

Crossref

Visual and computational analysis of structure-activity relationships in high-throughput screening data

Author: Agrafiotis
Agrafiotis
Ahlberg
Ajay
Ajay
Bayada
Bemis
Bernard
Bonabeau
Brown
Calvert
Card
Chen
Chen
Cho
Christianini
Clark
Clark
Cox
Duda
Edwards
Engels
Frimurer
Gao
Garrido
Ghose
Gillet
Gillet
Hand
Hann
Haupts
Hayward
Hertzberg
Izrailev
Jiang
Jones-Hertzog
Kirew
Kobayashi
Kohonen
Ladd
Lee
Lepre
Martin
Mason
Mello
Meyer
Miller
Mitchell
Oprea
Peter Gedeck
Peter Willett
Poroikov
Rhodes
Roberts
Roberts
Ros
Rusinko
Sadowski
Sadowski
Scherf
Sheridan
Shi
Stanton
Su
Teague
Thompson
Tropsha
Tufte
Tufte
Wagener
Walters
Wang
Wedin
Xie
Xu
Zupan
Publication venue: 'Elsevier BV'
Publication date: 01/08/2001

Novel analytic methods are required to assimilate the large volumes of structural and bioassay data generated by combinatorial chemistry and high-throughput screening programmes in the pharmaceutical and agrochemical industries. This paper reviews recent work in visualisation and data mining that can be used to develop structure-activity relationships from such chemical/biological datasets

Crossref

White Rose Research Online

BERTDom: Protein Domain Boundary Prediction Using BERT

Author: Bashir Maryam
Haseeb Ahmad
Wali Aamir
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 31/08/2023

The domains of a protein provide insight into the functions that the protein can perform. Delineation of proteins using high-throughput experimental methods is a difficult and time-consuming task. Template-free and sequence-based computational methods that mainly rely on machine learning techniques can be used instead. However, computational methods suffer from low accuracy and are limited in predicting different types of multi-domain proteins. Biological language modeling and deep learning techniques can be useful in such situations. In this study, we propose BERTDom for segmenting protein sequences. BERTDom uses BERT for feature representation and stacked bi-directional long short-term memory for classification. We pre-train BERT from scratch on a corpus of protein sequences obtained from the UniProt knowledge base with reference clusters. For comparison, we also used two other deep learning architectures: LSTM and feed-forward neural networks. We also experimented with protein-to-vector (Pro2Vec) feature representation that uses word2vec to encode protein bio-words. For testing, three other benchmark datasets were used. The experimental results on the benchmark datasets show that BERTDom produces the best F-score compared to other template-based and template-free protein domain boundary prediction methods. Employing deep learning architectures can significantly improve domain boundary prediction. Furthermore, BERT, used extensively in NLP for feature representation, has shown promising results when used for encoding bio-words. The code is available at https://github.com/maryam988/BERTDom-Code
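The abstract above does not say how protein "bio-words" are formed. One common convention, assumed here purely for illustration, is to split a sequence into k-mers that a word2vec- or BERT-style model can treat as tokens. The function name `protein_to_biowords`, the default `k = 3` and the toy sequence are hypothetical and are not taken from the BERTDom code.

```python
from typing import List


def protein_to_biowords(sequence: str, k: int = 3, overlapping: bool = True) -> List[str]:
    """Split an amino-acid sequence into k-mer "bio-words".

    With overlapping=True the window advances one residue at a time;
    otherwise it advances k residues, giving disjoint words.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    sequence = sequence.strip().upper()
    step = 1 if overlapping else k
    return [sequence[i:i + k] for i in range(0, len(sequence) - k + 1, step)]


# Toy example (hypothetical sequence).
print(protein_to_biowords("MKTAYIAK"))
print(protein_to_biowords("MKTAYIAK", overlapping=False))
```

Overlapping k-mers keep one token per residue position (minus the tail), which is convenient when each position later needs a boundary/non-boundary label; disjoint k-mers give a shorter token sequence at the cost of positional resolution.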

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)

CESE-2019

Author
Publication venue: 'MDPI AG'
Publication date: 21/06/2022

This book is a collation of articles published in the Special Issue "CESE-2019: Applications of Membranes" in the journal Sustainability. It contains a wide variety of topics such as the removal of trace organic contaminants using combined direct contact membrane distillation–UV photolysis; evaluating the feasibility of forward osmosis in diluting reverse osmosis concentrate; tailoring the effects of titanium dioxide (TiO2) and polyvinyl alcohol (PVA) in the separation and antifouling performance of thin-film composite polyvinylidene fluoride (PVDF) membrane; enhancing the antibacterial properties of PVDF membrane by surface modification using TiO2 and silver nanoparticles; and reviews on membrane fouling in membrane bioreactor (MBR) systems and recent advances in the prediction of fouling in MBRs. The book is suitable for postgraduate students and researchers working in the field of membrane applications for treating aqueous solutions

Directory of Open Access Books (DOAB)

Biological investigation and predictive modelling of foaming in anaerobic digester

Author: Kanu Ifeyinwa Rita
Publication venue: Energy, Geoscience, Infrastructure and Society
Publication date: 01/05/2018

Anaerobic digestion (AD) of waste has been identified as a leading technology for greener renewable energy generation as an alternative to fossil fuel. AD reduces waste through biochemical processes, converting it to biogas, which can be used as a source of renewable energy, while the residual bio-solids can be used to enrich the soil. A problem with AD, however, is foaming and the associated biogas loss. Tackling this problem effectively requires identifying and controlling the factors that trigger and promote foaming. In this research, laboratory experiments were initially carried out to differentiate the causal factors of foaming from the exacerbating ones. The impact of the identified causal factors (organic loading rate, OLR, and volatile fatty acid, VFA) on foaming occurrence was then monitored and recorded. Further analysis of foaming and non-foaming sludge samples by metabolomics techniques confirmed that OLR and VFA are the prime causes of foaming occurrence in AD. In addition, the metagenomics analysis showed that the phyla Bacteroidetes and Proteobacteria were predominant, with higher relative abundances of 30% and 29% respectively, while the phylum Actinobacteria, representing the most prominent filamentous foam-causing bacteria such as Nocardia amarae and Microthrix parvicella, had a very low and consistent relative abundance of 0.9%, indicating that the foaming occurrence in the AD studied was not triggered by the presence of filamentous bacteria. Consequently, data-driven models to predict foam formation were developed based on experimental data with inputs (OLR and VFA in the feed) and output (foaming occurrence). The models were extensively validated and assessed based on the mean squared error (MSE), root mean squared error (RMSE), R² and mean absolute error (MAE). The Levenberg-Marquardt neural network model proved to be the best model for foaming prediction in AD, with RMSE = 5.49, MSE = 30.19 and R² = 0.9435.
The significance of this study is the development of a parsimonious and effective modelling tool that enables AD operators to proactively avert foaming occurrence, as the two model input variables (OLR and VFA) can be easily adjusted through a simple programmable logic controller.
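The four assessment metrics named in the abstract above have standard definitions, sketched below in plain Python. The function name `regression_metrics` and the toy numbers are assumptions for illustration; they are not data from the thesis.

```python
import math
from typing import Dict, Sequence


def regression_metrics(y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """Compute MSE, RMSE, MAE and the coefficient of determination (R^2)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)              # residual sum of squares
    mse = ss_res / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    # R^2 is undefined when the observed values are all identical.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": sum(abs(e) for e in errors) / n,
        "r2": r2,
    }


# Toy example (hypothetical observed and predicted values).
print(regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0]))
```

Because RMSE is the square root of MSE, the two reported values can be cross-checked: 5.49² ≈ 30.14, which agrees with the reported MSE of 30.19 up to rounding of the RMSE.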

ROS: The Research Output Service. Heriot-Watt University Edinburgh

Processing hidden Markov models using recurrent neural networks for biological applications

Author: Rallabandi Pavan Kumar
Publication venue: 'University of the Western Cape Library Service'
Publication date: 01/01/2013

Philosophiae Doctor - PhD. In this thesis, we present a novel hybrid architecture that combines two of the most popular sequence recognition models, Recurrent Neural Networks (RNNs) and Hidden Markov Models (HMMs). Though sequence recognition problems could potentially be modelled through well-trained HMMs, HMMs could not provide a reasonable solution to complicated recognition problems. In contrast, the ability of RNNs to handle complex sequence recognition problems is known to be exceptionally good. It should be noted that methods for applying HMMs to RNNs have been developed by other researchers in the past. However, to the best of our knowledge, no algorithm for processing HMMs through learning has been given. Taking advantage of the structural similarities between the architectural dynamics of RNNs and HMMs, in this work we analyze the combination of these two systems into a hybrid architecture. To this end, the main objective of this study is to improve sequence recognition/classification performance by applying a hybrid neural/symbolic approach. In particular, trained HMMs are used as the initial symbolic domain theory and directly encoded into an appropriate RNN architecture, meaning that the prior knowledge is processed through the training of RNNs. The proposed algorithm is then implemented on sample test beds and other real-time biological applications.
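The structural similarity the abstract refers to can be seen in the standard HMM forward algorithm, where a state vector is updated once per observation, much as an RNN updates its hidden state. The sketch below shows only that textbook recursion. It is not the thesis's encoding algorithm, and the function name `forward_likelihood` and the toy parameters are assumptions for illustration.

```python
from typing import List, Sequence


def forward_likelihood(
    initial: Sequence[float],
    transition: Sequence[Sequence[float]],
    emission: Sequence[Sequence[float]],
    observations: Sequence[int],
) -> float:
    """Return P(observations) under an HMM using the forward recursion.

    alpha acts like a recurrent hidden state: at each step it is multiplied
    by the transition matrix and then gated by the emission probabilities of
    the current observation.
    """
    if len(observations) == 0:
        raise ValueError("observations must be non-empty")
    n_states = len(initial)
    alpha: List[float] = [
        initial[s] * emission[s][observations[0]] for s in range(n_states)
    ]
    for obs in observations[1:]:
        alpha = [
            emission[j][obs] * sum(alpha[i] * transition[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
    return sum(alpha)


# Toy two-state, three-symbol HMM (hypothetical parameters).
initial = [0.6, 0.4]
transition = [[0.7, 0.3], [0.4, 0.6]]
emission = [[0.1, 0.4, 0.5], [0.6, 0.3, 0.1]]
print(forward_likelihood(initial, transition, emission, [0, 1, 2]))
```

The update is linear in `alpha` apart from the observation-dependent gating, which is one way to read the claim that a trained HMM can serve as initial structure for a recurrent network that is then refined by training.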

UWC Theses and Dissertations

Improved general regression network for protein domain boundary prediction

Author: A Ceroni
A Vieira
Abdur R Sikder
AK Jain
Albert Y Zomaya
AR Sikder
AR Sikder
Bing Bing Zhou
C Chothia
C Civera
CC Lee
CR Robinson
DB Wetlaufer
FMG Pearl
G Pollastri
G Pollastri
HC Van Leeuwen
HM Berman
J Chen
J Cheng
J Liu
J Sim
JCB Melo
JE Gewehr
JS Richardson
JSR Jang
M Dumontier
M Dumontier
M Suyama
MJ Lehtinen
N Nagarajan
OV Galzitskaya
P Baldi
P Bork
Paul D Yoo
RA George
RE Schapire
RL Marsden
RR Copley
RR Joshi
RS Gokhale
S Prompramote
S Veretnik
SF Altschul
TA Holland
Y Freund
Publication venue: BioMed Central
Publication date: 13/02/2008

Background: Protein domains present some of the most useful information that can be used to understand protein structure and functions. Recent research on protein domain boundary prediction has been mainly based on widely known machine learning techniques, such as Artificial Neural Networks and Support Vector Machines. In this study, we propose a new machine learning model (IGRN) that can achieve accurate and reliable classification, with significantly reduced computations. The IGRN was trained using a PSSM (Position Specific Scoring Matrix), secondary structure, solvent accessibility information and inter-domain linker index to detect possible domain boundaries for a target sequence. Results: The proposed model achieved average prediction accuracy of 67% on the Benchmark_2 dataset for domain boundary identification in multi-domain proteins and showed superior predictive performance and generalisation ability among the most widely used neural network models. With the CASP7 benchmark dataset, it also demonstrated comparable performance to existing domain boundary predictors such as DOMpro, DomPred, DomSSEA, DomCut and DomainDiscovery with 70.10% prediction accuracy. Conclusion: The performance of the proposed model compares favourably with that of other existing machine learning based methods as well as widely known domain boundary predictors on two benchmark datasets, and the model excels in the identification of domain boundaries in terms of model bias, generalisation and computational requirements. © 2008 Yoo et al; licensee BioMed Central Ltd

Crossref

Michigan Technological University

PubMed Central

Artificial intelligence methods enhance the discovery of RNA interactions

Author: Appierdo R
Ballesio F
Carrino C
Gherardini P F
Helmer Citterich M
Pepe G
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2022

Understanding how RNAs interact with proteins, RNAs, or other molecules remains a challenge of main interest in biology, given the importance of these complexes in both normal and pathological cellular processes. Since experimental datasets are starting to be available for hundreds of functional interactions between RNAs and other biomolecules, several machine learning and deep learning algorithms have been proposed for predicting RNA-RNA or RNA-protein interactions. However, most of these approaches were evaluated on a single dataset, making performance comparisons difficult. With this review, we aim to summarize recent computational methods, developed in this broad research area, highlighting feature encoding and machine learning strategies adopted. Given the magnitude of the effect that dataset size and quality have on performance, we explored the characteristics of these datasets. Additionally, we discuss multiple approaches to generate datasets of negative examples for training. Finally, we describe the best-performing methods to predict interactions between proteins and specific classes of RNA molecules, such as circular RNAs (circRNAs) and long non-coding RNAs (lncRNAs), and methods to predict RNA-RNA or RNA-RBP interactions independently of the RNA type

PubMed Central

ART