31 research outputs found

    Protein sequences classification based on weighting scheme

    We present a new technique for recognizing remote protein homologies that relies on combining probabilistic modeling and supervised learning in high-dimensional feature spaces. The main novelty of our technique is the method of constructing feature vectors using a Hidden Markov Model and the combination of this representation with a classifier capable of learning in very sparse high-dimensional spaces. Each feature vector records the sensitivity of each protein domain to a previously learned set of sub-sequences (strings). Unlike previous methods, our method takes into consideration both the conserved and the non-conserved regions. The system subsequently uses Support Vector Machine (SVM) classifiers to learn the boundaries between structural protein classes. Experiments show that this method, which we call the String Weighting Scheme-SVM (SWS-SVM) method, significantly improves on previous methods for the classification of protein domains based on remote homologies. Our method is then compared to five existing homology detection methods.

    Cloning and in silico characterization of two signal peptides from Pediococcus pentosaceus and their function for the secretion of heterologous protein in Lactococcus lactis.

    Fifty signal peptides of Pediococcus pentosaceus were characterized by in silico analysis and, based on the physicochemical analysis, two potential signal peptides (SPs), Spk1 and Spk3, were identified. The coding sequences of the SPs were amplified, fused to the gene coding for green fluorescent protein (GFP), and cloned into Lactococcus lactis pNZ8048 and pMG36e vectors, respectively. Western blot analysis indicated that GFP was secreted using both heterologous SPs. ELISA showed that the secretion efficiency of GFP using Spk1 (0.64 μg/ml) was similar to that using Usp45 (0.62 μg/ml) and Spk3 (0.58 μg/ml).

    Automatic clustering of gene ontology by genetic algorithm

    The Gene Ontology is widely used by researchers for biological data mining and information retrieval, integration of biological databases, finding genes, and incorporating its knowledge into gene clustering. However, the growth of the Gene Ontology has made it difficult to maintain and process. One way to keep it accessible is to cluster it into fragmented groups. Clustering the Gene Ontology is a difficult combinatorial problem and can be modeled as a graph partitioning problem. Additionally, deciding the number k of clusters to use is not obvious and is itself a hard algorithmic problem. Therefore, an approach for automatic clustering of the Gene Ontology is proposed that incorporates a cohesion-and-coupling metric into a hybrid algorithm consisting of a genetic algorithm and a split-and-merge algorithm. Experimental results and an example of a modularized Gene Ontology in RDF/XML format are given to illustrate the effectiveness of the algorithm.
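The abstract does not define its cohesion-and-coupling metric, so the following is only a minimal sketch of the general idea, assuming an undirected graph given as an edge list: cohesion is taken as the number of edges inside clusters and coupling as the number of edges between clusters. The function names and the score formula are hypothetical and are not taken from the paper.

```python
def cohesion_and_coupling(edges, assignment):
    """Return (intra, inter): edges inside clusters vs. edges between clusters.

    edges      -- list of (u, v) pairs of an undirected graph
    assignment -- dict mapping each node to its cluster id
    """
    intra = sum(1 for u, v in edges if assignment[u] == assignment[v])
    inter = len(edges) - intra
    return intra, inter


def partition_score(edges, assignment):
    """Hypothetical fitness: high cohesion and low coupling give a high score.

    The score lies in [-1, 1]; a search procedure (for example a genetic
    algorithm) could try to maximize it.
    """
    if not edges:
        return 0.0
    intra, inter = cohesion_and_coupling(edges, assignment)
    return (intra - inter) / len(edges)


# Toy graph: two triangles (a-b-c and d-e-f) joined by the single edge c-d.
EDGES = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
GOOD = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}  # one cluster per triangle
BAD = {"a": 0, "b": 1, "c": 0, "d": 1, "e": 0, "f": 1}   # nodes mixed across triangles
```

On this toy graph, the partition that keeps each triangle together scores higher than the mixed one, which is the behaviour a fitness function of this kind is meant to reward.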

    Development of enzymatic membrane reactor (EMR) for cyclodextrins production

    This paper investigated the fouling mechanisms in an ultrafiltration membrane during the separation of cyclodextrins (CDs) from starch and CGTase. The Resistance-In-Series Model was used to identify the responsible hydraulic resistances. The results showed that the weak adsorption fouling resistance (ra1) was the main factor contributing to the rate and extent of flux decline. Moreover, the significant organic fouling contributed by starch, CDs, CGTase and intermediate by-products, in the form of organic colloids and/or macromolecules, revealed that the fouling potential followed the order ra1 > rg > rcp > ra2. The overall results indicate that the fouling mechanisms may consist of pore mouth adsorption and subsequent narrowing of the pores, as those components (starch and CGTase) are small enough not to be excluded by steric considerations. In the later stage, unreacted starch would accumulate to form a gel/cake layer. The measured flux recovery of the enzymatic membrane reactor for CD production was about 95%.

    Determination of fouling mechanisms in enzymatic membrane reactor (EMR) for cyclodextrins production based on Resistance-In-Series Model

    This study investigated the fouling mechanisms in an ultrafiltration membrane during the separation of cyclodextrins (CDs) from starch and CGTase. The Resistance-In-Series Model was used to identify the responsible hydraulic resistances. The results showed that the weak adsorption fouling resistance (ra1) was the main factor contributing to the rate and extent of flux decline. Moreover, the significant organic fouling contributed by starch, CDs, CGTase and intermediate by-products, in the form of organic colloids and/or macromolecules, revealed that the fouling potential followed the order ra1 > rg > rcp > ra2. The overall results indicate that the fouling mechanisms may consist of pore mouth adsorption and subsequent narrowing of the pores, as those components (starch and CGTase) are small enough not to be excluded by steric considerations. In the later stage, unreacted starch would accumulate to form a gel/cake layer. The measured flux recovery of the enzymatic membrane reactor for CD production was about 95%.
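As a rough illustration of how a resistance-in-series calculation works in general, the sketch below uses the standard form J = ΔP / (μ · ΣR), where the total hydraulic resistance is the sum of the individual terms. The abstract only defines ra1 (weak adsorption); reading rg as gel layer, rcp as concentration polarization, ra2 as strong adsorption, and adding a clean-membrane term rm are assumptions here, and all numerical values are invented purely for the example.

```python
def permeate_flux(tmp_pa, viscosity_pa_s, resistances):
    """Flux J (m/s) from J = TMP / (viscosity * total resistance).

    tmp_pa         -- transmembrane pressure in Pa
    viscosity_pa_s -- permeate viscosity in Pa*s
    resistances    -- dict of hydraulic resistances in 1/m
    """
    total = sum(resistances.values())
    return tmp_pa / (viscosity_pa_s * total)


def rank_fouling(resistances, membrane_key="rm"):
    """Order the fouling terms from largest to smallest, ignoring the membrane itself."""
    fouling = {k: v for k, v in resistances.items() if k != membrane_key}
    return sorted(fouling, key=fouling.get, reverse=True)


# Hypothetical resistances (1/m), chosen only so that ra1 > rg > rcp > ra2.
EXAMPLE = {
    "rm": 2.0e12,   # clean membrane (assumed term)
    "ra1": 4.0e12,  # weak adsorption
    "rg": 2.5e12,   # gel layer (assumed meaning)
    "rcp": 1.0e12,  # concentration polarization (assumed meaning)
    "ra2": 0.5e12,  # strong adsorption (assumed meaning)
}

flux = permeate_flux(1.0e5, 1.0e-3, EXAMPLE)  # TMP of 1 bar, water-like viscosity
order = rank_fouling(EXAMPLE)
```

Ranking the individual terms in this way is what allows a dominant contribution such as ra1 to be identified as the main cause of flux decline.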

    Overexpression, purification and characterization of the Aspergillus niger endoglucanase, EglA, in Pichia pastoris

    Cellulases are industrially important hydrolytic enzymes applicable in the bioconversion of cellulosic biomass to simple sugars. In this work, an endoglucanase from Aspergillus niger ATCC 10574, EglA, was expressed in the methylotrophic yeast Pichia pastoris and the properties of the recombinant protein were characterized. The full-length cDNA of eglA was cloned into a pPICZαC expression vector and expressed extracellularly as a ~30 kDa recombinant protein in P. pastoris X-33. Pure EglA displayed optimum activity at 50°C and was stable between 30 and 55°C. The enzyme was stable in the range of pH 2.0 to 7.0, with an optimum at pH 4.0. EglA showed the highest affinity toward β-glucan, followed by carboxymethyl cellulose (CMC), with specific activities of 63.83 and 9.47 U/mg, respectively. Very low or no detectable hydrolysis of cellobiose, laminarin, filter paper and avicel was observed. Metal ions such as Mn²⁺, Co²⁺, Zn²⁺, Mg²⁺, Ba²⁺, Fe²⁺, Ca²⁺ and K⁺ significantly increased endoglucanase activity, with manganese ions causing the highest increase, to about 2.7-fold compared with the control assay, whereas Pd²⁺, Cu²⁺, SDS and EDTA inhibited EglA activity.

    Effect of Substrate and Enzyme Concentration on Cyclodextrin Production in a Hollow Fibre Membrane Reactor System

    The batch and continuous production of cyclodextrins (CDs) was assessed by employing an enzymatic membrane reactor (EMR) system. The effects of tapioca starch substrate and cyclodextrin glycosyltransferase (CGTase) concentrations on the yield of CDs were studied. Their effect on the behaviour of the ultrafiltration membrane in the EMR system (integration system) was also evaluated. The results for the batch process showed that incremental doses of CGTase caused gradual increments in CD yield; however, further addition of CGTase (above 1.0%) caused a 16% reduction in total CD production. Increasing the tapioca starch concentration raised the CD concentration (23 g/L). However, addition above 8% w/v resulted in an insignificant yield of CDs. In the integration system, a tapioca starch feeding rate higher than 4.41 g/h caused adverse effects (lower CD yield and membrane flux). In particular, at a higher tapioca starch feeding rate (5.0 g/h), the hydraulic resistance would reach as high as 1.31 × 10¹³ m⁻¹. Presumably this was due to unreacted substrates adsorbing onto the membrane surface and pores, which subsequently led to greater fouling and a severe flux decline. In addition, weak adsorption (ra1) was found to be the major fouling mechanism attributed to starch and its by-products. Therefore, hydraulic cleaning is strongly recommended as the cleaning procedure for this EMR system.

    Assignment of Protein Sequence to Functional Family Using Neural Network & Dempster-Shafer Theory

    Protein classification prediction is an important problem in molecular biology, and one that has attracted a lot of attention. This paper describes an approach to data-driven discovery of sequence motif-based models using a neural network classifier based on Dempster-Shafer theory for assigning protein sequences to functional families. A training set of sequences with unknown functional family is used to capture regularities that are sufficient to assign the sequences to their respective families. A new adaptive pattern classifier based on neural networks and the Dempster-Shafer theory of evidence, developed by Thierry Denoux (2001) [2], is presented. This method uses reference patterns as items of evidence regarding the class membership of each input pattern under consideration. This evidence is represented by basic belief assignments (BBAs) and pooled using Dempster's rule of combination. This procedure can be implemented in a multilayer neural network with a specific architecture consisting of one input layer, two hidden layers and one output layer. The weight vector, the receptive field and the class membership of each prototype are determined by minimizing the mean squared differences between the classifier outputs and target values. Index Terms: functional family, protein sequence, neural networks, Dempster-Shafer theory
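The pooling step mentioned in the abstract, Dempster's rule of combination, can be illustrated in isolation. The following is a minimal sketch of the rule itself, not of the neural network classifier described in the paper; the function name, the dictionary representation of a BBA and the family labels "kinase" and "protease" are assumptions made for the example.

```python
from itertools import product


def combine_dempster(m1, m2):
    """Combine two basic belief assignments (BBAs) with Dempster's rule.

    Each BBA is a dict mapping a frozenset of class labels (a focal set)
    to its mass; the masses of one BBA are expected to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            # Agreeing evidence supports the intersection of the two focal sets.
            combined[common] = combined.get(common, 0.0) + mass_a * mass_b
        else:
            # Disjoint focal sets contribute to the conflict mass.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    norm = 1.0 - conflict
    return {focal: mass / norm for focal, mass in combined.items()}


# Two hypothetical items of evidence over the frame {"kinase", "protease"}.
FRAME = frozenset({"kinase", "protease"})
M1 = {frozenset({"kinase"}): 0.6, FRAME: 0.4}
M2 = {frozenset({"protease"}): 0.5, FRAME: 0.5}

pooled = combine_dempster(M1, M2)
```

Mass assigned to the whole frame expresses ignorance, so a reference pattern that says little about an input leaves most of the other source's belief intact after combination.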