2,333 research outputs found

    Analysis of Genomic and Proteomic Signals Using Signal Processing and Soft Computing Techniques

    Bioinformatics is a data-rich field that provides unique opportunities to use computational techniques to understand and organize information associated with biomolecules such as DNA, RNA, and proteins. It involves in-depth study in the areas of genomics and proteomics and requires techniques from computer science, statistics and engineering to identify, model, extract features from and process data for analysis and interpretation of results in a biologically meaningful manner. Among engineering methods, signal processing techniques such as transformation, filtering and pattern analysis, and soft-computing techniques such as the multilayer perceptron (MLP) and the radial basis function neural network (RBFNN), play a vital role in resolving many challenging issues associated with genomics and proteomics. In this dissertation, an attempt has been made to investigate some challenging problems of bioinformatics by employing efficient signal processing and soft computing methods. The specific issues addressed are protein coding region identification in DNA sequences, hot spot identification in proteins, prediction of protein structural class and classification of microarray gene expression data. The dissertation presents some novel methods to measure and to extract features from genomic sequences using time-frequency analysis and machine intelligence techniques. The problems investigated and the contributions made in the thesis are presented here in a concise manner. The S-transform, a powerful time-frequency representation technique, possesses superior properties over the wavelet transform and the short-time Fourier transform, as the exponential function is fixed with respect to the time axis while the localizing scalable Gaussian window dilates and translates. The S-transform uses an analysis window whose width decreases with frequency, providing a frequency-dependent resolution. 
The invertibility of the S-transform makes it suitable for time-band filtering applications. Gene prediction and protein coding region identification have always been challenging tasks in computational biology, especially in eukaryotic genomes due to their complex structure. This issue is addressed using an S-transform based time-band filtering approach that localizes the period-3 property present in the DNA sequence, which forms the basis for the identification. Similarly, hot spot identification in proteins is a pressing issue in protein science due to its importance in binding and interaction between proteins. A novel S-transform based time-frequency filtering approach is proposed for efficient identification of the hot spots. Prediction of the structural class of a protein has been a challenging problem in bioinformatics. A novel feature representation scheme is proposed to efficiently represent the protein, thereby improving the prediction accuracy. The high dimension and low sample size of microarray data lead to the curse of dimensionality, which affects classification performance. In this dissertation, an efficient hybrid feature extraction method is proposed to overcome the dimensionality issue, and an RBFNN is introduced to efficiently classify the microarray samples.
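
The period-3 property mentioned in this abstract can be illustrated with a much simpler tool than the S-transform. The sketch below is not the dissertation's method: it assumes binary indicator sequences for the four bases and a plain sliding-window DFT evaluated at frequency 1/3. The function name `period3_power` and the default window and step sizes are illustrative choices.

```python
import cmath

def period3_power(dna, window=351, step=3):
    """Sliding-window spectral power at frequency 1/3 (the period-3 component).

    Returns a list of (window_start, power) pairs. Higher power suggests a
    stronger period-3 pattern, which is typical of protein coding regions.
    """
    profile = []
    for start in range(0, len(dna) - window + 1, step):
        segment = dna[start:start + window]
        power = 0.0
        for base in "ACGT":
            # Binary indicator sequence: 1 where `base` occurs, 0 elsewhere.
            # Only the DFT coefficient at frequency 1/3 is evaluated.
            coeff = sum(cmath.exp(-2j * cmath.pi * n / 3)
                        for n, ch in enumerate(segment) if ch == base)
            power += abs(coeff) ** 2
        profile.append((start, power / window))
    return profile
```

A sequence built from a repeated codon gives a large value in every window, while a sequence with no three-base periodicity gives a value near zero. Thresholding the profile is one simple way to flag candidate coding regions.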

    Determination of Characteristic Frequency for Identification of Hot Spots in Proteins

    Identification of hot spots, or protein-target binding sites, in proteins using the resonant recognition model requires knowledge of the characteristic frequency. For a successful protein-target interaction, both the protein and the target signals must share the same characteristic frequency. The common characteristic frequency of a functional group of proteins is determined from the consensus spectrum obtained using the DFT. In this work, an alternative approach for identifying the characteristic frequency using the power spectral density is described. The performance of the proposed method is observed to be better than that of the DFT-based approach and is illustrated using simulation examples.
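
The DFT-based baseline described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes each protein has already been converted to a numeric signal (the encoding is not shown), that shorter signals are zero-padded to a common length, and that the consensus spectrum is the pointwise product of the magnitude spectra. The function names are hypothetical.

```python
import cmath
import math

def magnitude_spectrum(signal, length):
    """DFT magnitudes for k = 1 .. length // 2 of a mean-removed, zero-padded signal."""
    mean = sum(signal) / len(signal)
    padded = [x - mean for x in signal] + [0.0] * (length - len(signal))
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / length)
                for t, x in enumerate(padded)))
        for k in range(1, length // 2 + 1)
    ]

def characteristic_frequency(signals):
    """Frequency (cycles per sample) at which the consensus spectrum peaks."""
    length = max(len(s) for s in signals)
    consensus = [1.0] * (length // 2)
    for s in signals:
        spectrum = magnitude_spectrum(s, length)
        # Pointwise product: only peaks shared by every signal survive.
        consensus = [c * m for c, m in zip(consensus, spectrum)]
    peak_k = max(range(len(consensus)), key=consensus.__getitem__) + 1
    return peak_k / length
```

Because the product suppresses any component that is absent from even one signal, the surviving peak marks the frequency common to the whole group.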

    Method and System for Identification of Metabolites Using Mass Spectra

    A method and system are provided for mass spectrometry-based identification of a specific elemental formula for an unknown compound, which includes but is not limited to a metabolite. The method includes calculating a natural abundance probability (NAP) of a given isotopologue for isotopes of non-labelling elements of an unknown compound. Molecular fragments for a subset of isotopes identified using the NAP are created and sorted into a requisite cache data structure to be subsequently searched. Peaks from raw spectrum data from mass spectrometry for an unknown compound. Sample-specific peaks of the unknown compound are separated from various spectral artifacts in ultra-high resolution Fourier transform mass spectra. A set of possible isotope-resolved molecular formulae (IMFs) is created by iteratively searching the molecular fragment caches, combining with additional isotopes, and then statistically filtering the results based on NAP and mass-to-charge (m/z) matching probabilities. An unknown compound and its corresponding elemental molecular formula (EMF) are identified from statistically significant caches of isotopologues with compatible IMFs.
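
The abstract does not define how the NAP is computed. One plausible reading, shown here purely as an assumption, is a binomial probability per element with two stable isotopes, multiplied across elements. The function names are hypothetical, and the abundance passed in (for example roughly 0.0107 for 13C) is supplied by the caller.

```python
from math import comb

def isotopologue_probability(n_atoms, n_heavy, heavy_abundance):
    """Binomial probability that exactly n_heavy of n_atoms carry the heavy isotope.

    Assumes an element with two stable isotopes and independent atoms.
    """
    light = 1.0 - heavy_abundance
    return comb(n_atoms, n_heavy) * heavy_abundance ** n_heavy * light ** (n_atoms - n_heavy)

def natural_abundance_probability(composition):
    """Product of per-element probabilities.

    `composition` is a list of (n_atoms, n_heavy, heavy_abundance) tuples,
    one per element (an illustrative data layout, not the patent's).
    """
    probability = 1.0
    for n_atoms, n_heavy, heavy_abundance in composition:
        probability *= isotopologue_probability(n_atoms, n_heavy, heavy_abundance)
    return probability
```

Elements with more than two stable isotopes would need a multinomial term instead of the binomial one used here.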

    Improving the accuracy and efficiency of docking methods

    Computational methods for predicting macromolecular complexes are useful tools for studying biological systems. They are used in areas such as drug design and for studying protein-protein interactions. While considerable progress has been made in this field over the decades, enhancing the speed and accuracy of these computational methods remains an important challenge. This work describes two different enhancements to the accuracy of ClusPro, a method for performing protein-protein docking, as well as an enhancement to the efficiency of global rigid body docking. Small-angle X-ray scattering (SAXS) is a high-throughput technique in which data are collected for molecules in solution, and the data provide information about the shape and size of molecules. ClusPro was enhanced with the ability to use SAXS data collected for protein complexes to guide docking by selecting conformations according to how well they match the experimental data, which improved docking accuracy when such data are available. Various other experimental techniques, such as NMR, FRET, or chemical cross-linking, can provide information about protein-protein interfaces, and such information can be used to generate distance-based restraints between pairs of residues across the interface. A second enhancement to ClusPro enables the use of such distance restraints to improve docking accuracy. Finally, an enhancement to the efficiency of FFT-based global docking programs was developed. This enhancement allows for the efficient search of multiple sidechain conformations, and this improved program was applied to the flexible computational solvent mapping program FTFlex. 2018-07-09T00:00:00
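
The idea of using distance restraints to select docked conformations can be sketched as a simple filter. This is not ClusPro's implementation or API: the data layout (dictionaries mapping residue numbers to coordinates), the function names, and the `min_fraction` parameter are all assumptions made for illustration.

```python
import math

def satisfied_fraction(receptor_xyz, ligand_xyz, restraints):
    """Fraction of restraints met by one docked pose.

    Each restraint is (receptor_residue, ligand_residue, max_distance);
    coordinates are looked up by residue number in the two dictionaries.
    """
    if not restraints:
        return 1.0
    satisfied = 0
    for rec_res, lig_res, max_distance in restraints:
        if math.dist(receptor_xyz[rec_res], ligand_xyz[lig_res]) <= max_distance:
            satisfied += 1
    return satisfied / len(restraints)

def filter_poses(receptor_xyz, ligand_poses, restraints, min_fraction=1.0):
    """Indices of poses that satisfy at least min_fraction of the restraints."""
    return [
        index for index, pose in enumerate(ligand_poses)
        if satisfied_fraction(receptor_xyz, pose, restraints) >= min_fraction
    ]
```

Lowering `min_fraction` tolerates some incorrect restraints, which matters when the experimental data are noisy.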

    Unraveling the Thousand Word Picture: An Introduction to Super-Resolution Data Analysis

    Super-resolution microscopy provides direct insight into fundamental biological processes occurring at length scales smaller than light’s diffraction limit. The analysis of data at such scales has brought statistical and machine learning methods into the mainstream. Here we provide a survey of data analysis methods starting from an overview of basic statistical techniques underlying the analysis of super-resolution and, more broadly, imaging data. We subsequently break down the analysis of super-resolution data into four problems: the localization problem, the counting problem, the linking problem, and what we’ve termed the interpretation problem

    Image Processing and Simulation Toolboxes of Microscopy Images of Bacterial Cells

    Recent advances in microscopy imaging technology have allowed the characterization of the dynamics of cellular processes at the single-cell and single-molecule level. Particularly in bacterial cell studies, with E. coli as a case study, these techniques have been used to detect and track internal cell structures such as the nucleoid and the cell wall, and fluorescently tagged molecular aggregates such as FtsZ proteins, Min system proteins, inclusion bodies and the different types of RNA molecules. These studies have been performed using multi-modal, multi-process, time-lapse microscopy, producing both morphological and functional images. To facilitate the finding of relationships between cellular processes, from small-scale, such as gene expression, to large-scale, such as cell division, an image processing toolbox was implemented with several automatic and/or manual features, such as cell segmentation and tracking, intra-modal and inter-modal image registration, as well as the detection, counting and characterization of several cellular components. Two segmentation algorithms for cellular components were implemented, the first based on the Gaussian distribution and the second based on thresholding and morphological structuring functions. These algorithms were used to segment nucleoids and to identify the different stages of FtsZ ring formation (together with machine learning algorithms), which made it possible to understand how temperature influences the physical properties of the nucleoid and to correlate those properties with the exclusion of protein aggregates from the center of the cell. Another study used the segmentation algorithms to study how temperature affects the formation of the FtsZ ring. The validation of the developed image processing methods and techniques has been based on benchmark databases manually produced and curated by experts. 
When dealing with thousands of cells and hundreds of images, these manually generated datasets can become the biggest cost in a research project. To shorten these studies and lower the cost of the manual labour, an image simulation toolbox was implemented to generate realistic artificial images. The proposed image simulation toolbox can generate biologically inspired objects that mimic the spatial and temporal organization of bacterial cells and their processes, such as cell growth and division, cell motility, and cell morphology (shape, size and cluster organization). The image simulation toolbox was shown to be useful in the validation of three cell tracking algorithms: Simple Nearest-Neighbour, Nearest-Neighbour with Morphology and the DBSCAN cluster identification algorithm. It was shown that Simple Nearest-Neighbour still performed with great reliability when simulating objects with small velocities, while the other algorithms performed better for higher velocities and when larger clusters were present.
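
A simple nearest-neighbour tracker of the kind named in this abstract can be sketched as a greedy frame-to-frame linker. This is an illustrative assumption about how such a tracker works, not the toolbox's code: the function name, the use of cell centroids, and the `max_distance` cut-off are all hypothetical.

```python
import math

def link_nearest_neighbour(prev_centroids, curr_centroids, max_distance):
    """Greedily link each cell in the current frame to its nearest predecessor.

    Returns a dict mapping current-frame index to previous-frame index,
    or to None when no unused predecessor lies within max_distance
    (for example a newly appeared cell).
    """
    # All candidate pairs, closest first.
    candidates = sorted(
        (math.dist(prev, curr), i, j)
        for i, prev in enumerate(prev_centroids)
        for j, curr in enumerate(curr_centroids)
    )
    links = {j: None for j in range(len(curr_centroids))}
    used_prev = set()
    for distance, i, j in candidates:
        if distance > max_distance:
            break  # every remaining pair is farther apart
        if i in used_prev or links[j] is not None:
            continue  # one of the two cells is already linked
        links[j] = i
        used_prev.add(i)
    return links
```

The sketch also suggests why this approach degrades at high velocities: once displacements between frames approach the spacing between cells, the nearest centroid is no longer reliably the same cell.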

    A robust algorithm for segmenting fluorescence images and its application to single-molecule counting

    Live-cell fluorescence microscopy produces large amounts of data with high variability in shapes at a low signal-to-noise ratio. An efficient design of image analysis pipelines requires a reliable and robust initial segmentation step that needs little parameter fine-tuning. Here, I present a segmentation algorithm called MinSeg for fluorescence image data that relies on minimal assumptions about the image, and uses statistical considerations to distinguish signal from background. More importantly, the algorithm does not make assumptions about feature size or shape, and is thus universally applicable. 
I also present a pipeline for the quantification of small complexes with single-molecule fluorescence microscopy using this segmentation algorithm as the first step of the workflow. This pipeline was used for the quantification of a small histone H3 variant protein called CENP-A. We found that the CENP-A nucleosomes are dimers

    Plasmonic Nanoplatforms for Biochemical Sensing and Medical Applications

    Plasmonics, the science of the excitation of surface plasmon polaritons (SPP) at the metal-dielectric interface under intense beam radiation, has been studied for its immense potential for developing numerous nanophotonic devices, optical circuits and lab-on-a-chip devices. The key feature that makes plasmonic structures promising is their ability to support strong resonances with different behaviors and tunable localized hotspots, excitable over a wide spectral range. Therefore, a fundamental understanding of light-matter interactions at subwavelength nanostructures, and the use of this understanding to tailor plasmonic nanostructures able to sustain high-quality tunable resonant modes, are essential for the realization of highly functional devices with a wide range of applications from sensing to switching. We investigated the excitation of various plasmonic resonance modes (i.e. Fano resonances and toroidal moments) using both optical and terahertz (THz) plasmonic metamolecules. By designing and fabricating various nanostructures, we successfully predicted, demonstrated and analyzed the excitation of plasmonic resonances, numerically and experimentally. A simple comparison between the sensitivity and lineshape quality of various optically driven resonances reveals that nonradiative toroidal moments are exotic plasmonic modes with strong sensitivity to environmental perturbations. Employing toroidal plasmonic metasurfaces, we demonstrated ultrafast plasmonic switches and highly sensitive sensors. Focusing on the biomedical applications of toroidal moments, we developed plasmonic metamaterials for fast and cost-effective infection diagnosis using the THz range of the spectrum. 
We used the exotic behavior of toroidal moments for the identification of Zika virus (ZIKV) envelope proteins as the infectious nano-agents through two protocols: 1) direct binding of targeted biomarkers to the plasmonic metasurfaces, and 2) attaching gold nanoparticles to the plasmonic metasurfaces and binding the proteins to the particles to enhance the sensitivity. This led to the development of ultrasensitive THz plasmonic metasensors for the detection of nanoscale and low-molecular-weight biomarkers at picomolar concentrations. In summary, by using high-quality and pronounced toroidal moments as sensitive resonances, we have successfully designed, fabricated and characterized novel plasmonic toroidal metamaterials for the detection of infectious biomarkers using different methods. The proposed approach allowed us to compare and analyze the binding properties, sensitivity, repeatability, and limit of detection of the metasensing device.