Genomic Signal Processing

Author: Cai Xiaodong
Dougherty Edward R
Huang Yufei
Kim Seungchan
Yamaguchi Rui
Publication venue: Bentham Science Publishers Ltd.
Publication date: 01/09/2009

PubMed Central

University of Miami: Scholarship Miami

Genomics and proteomics: a signal processor's tour

Author: Vaidyanathan P. P.
Publication date: 01/12/2004

The theory and methods of signal processing are becoming increasingly important in molecular biology. Digital filtering techniques, transform domain methods, and Markov models have played important roles in gene identification, biological sequence analysis, and alignment. This paper contains a brief review of molecular biology, followed by a review of the applications of signal processing theory. This includes the problem of gene finding using digital filtering, and the use of transform domain methods in the study of protein binding spots. The relatively new topic of noncoding genes and the associated problem of identifying ncRNA buried in DNA sequences are also described. This includes a discussion of hidden Markov models and context-free grammars. Several new directions in genomic signal processing are briefly outlined at the end.
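As an illustration of the transform-domain view of gene finding mentioned in this abstract, the following is a minimal sketch of a period-3 spectral measure. It assumes the common approach of mapping a DNA string to four binary indicator sequences and evaluating their DFT at frequency 1/3; the function name `period3_power` is hypothetical, and the sketch is not taken from the paper itself.

```python
import cmath


def period3_power(seq):
    """Sum, over the four bases, of |DFT at frequency 1/3|^2 of the
    binary indicator sequence for that base (illustrative assumption)."""
    seq = seq.upper()
    total = 0.0
    for base in "ACGT":
        acc = 0j
        for n, ch in enumerate(seq):
            if ch == base:
                # Contribution of position n to the DFT bin at frequency 1/3.
                acc += cmath.exp(-2j * cmath.pi * n / 3)
        total += abs(acc) ** 2
    return total


# A strongly 3-periodic toy sequence versus a sequence with no period-3 structure.
periodic = "ATG" * 20
flat = "A" * 60
print(period3_power(periodic) > period3_power(flat))
```

In practice such a measure would typically be computed over a sliding window, so that regions with a strong period-3 component stand out along the sequence.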

CiteSeerX

Caltech Authors

Genomic Signal Processing Techniques for Taxonomy Prediction

Author: Can Mehmet
Gursoy Osman
Publication venue: International University of Sarajevo
Publication date: 19/08/2020

To analyze complex biodiversity in microbial communities, 16S rRNA marker gene sequences are often assigned to operational taxonomic units (OTUs). The abundance of methods used to assign 16S rRNA marker gene sequences to OTUs has prompted debate over which one is better. Some researchers suggest that clustering methods should be stable, meaning that generated OTU assignments do not change as additional sequences are added to the dataset; others contend that it is more important for the methods to properly represent the distances between sequences. We add one more de novo clustering algorithm, Rolling Snowball, to the existing ones, which include single linkage, complete linkage, average linkage, abundance-based greedy clustering, distance-based greedy clustering, and Swarm, as well as the open- and closed-reference methods. We use the GreenGenes, RDP, and SILVA 16S rRNA gene databases to show the success of the method. The highest accuracy is obtained with the SILVA library.
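To make the idea of distance-based greedy clustering concrete, here is a minimal sketch under simplifying assumptions: sequences are pre-aligned and of equal length, distance is the fraction of mismatched positions, and each sequence joins the first centroid within a threshold. The names `mismatch_fraction` and `greedy_otus` are hypothetical. This is not the Rolling Snowball algorithm, which the abstract does not describe.

```python
def mismatch_fraction(a, b):
    """Fraction of positions at which two equal-length sequences differ."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def greedy_otus(seqs, threshold=0.03):
    """Assign each sequence to the first centroid within `threshold`;
    otherwise the sequence starts a new OTU and becomes its centroid."""
    centroids, otus = [], []
    for s in seqs:
        for i, c in enumerate(centroids):
            if mismatch_fraction(s, c) <= threshold:
                otus[i].append(s)
                break
        else:
            # No existing centroid is close enough: open a new OTU.
            centroids.append(s)
            otus.append([s])
    return otus


print(greedy_otus(["AAAA", "AAAT", "CCCC"], threshold=0.25))
```

Because assignment depends on input order, results from this kind of greedy scheme can change as sequences are added or reordered, which is one facet of the stability question raised in the abstract.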

Inquiry (E-Journal - Faculty of Business and Administration, International University of Sarajevo)

Evaluation of Organisms Relationship by Genomic Signal Processing

Author: Škutková Helena
Publication venue: Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií
Publication date: 01/01/2016

This dissertation deals with alternative techniques for analyzing the genetic information of organisms. The theoretical part presents two different approaches for evaluating the relationship between organisms based on the mutual similarity of the genetic information contained in their DNA sequences. The first approach is the currently standardized phylogenetic analysis of character-based records of DNA sequences. Although this approach is computationally expensive because it requires multiple sequence alignment, it allows evaluation of both global and local similarity of DNA sequences. The second approach comprises techniques that classify DNA sequences in the form of numerical vectors representing characteristic features of their genetic information. These methods, known as "alignment-free", allow fast evaluation of global similarity, but the numerical conversion removes the ability to evaluate local changes. The new method presented in this dissertation combines the advantages of both approaches. It uses only numerical representations that resemble a 1D digital signal, i.e. representations that contain a specific trend along the x-axis; the main assumption is that these genomic signals are taxonomically specific. The experimental part of the dissertation deals with the design of a set of appropriate tools for genomic signal processing that allow evaluation of the mutual similarity of taxonomically specific trends. On the basis of the mutual similarity of genomic signals, classification in the form of a dendrogram is performed, which corresponds to the phylogenetic trees used in standard phylogenetics.
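The notion of a DNA representation that behaves like a 1D signal with a trend along the x-axis can be sketched as follows. The purine/pyrimidine step mapping, the truncation to a common length, and the names `dna_walk` and `signal_distance` are illustrative assumptions, not the representation or similarity measure used in the dissertation.

```python
def dna_walk(seq):
    """Cumulative walk: +1 for purines (A, G), -1 for pyrimidines (C, T).
    Unknown symbols leave the level unchanged."""
    steps = {"A": 1, "G": 1, "C": -1, "T": -1}
    signal, level = [], 0
    for ch in seq.upper():
        level += steps.get(ch, 0)
        signal.append(level)
    return signal


def signal_distance(x, y):
    """Mean absolute difference over the common prefix of two signals."""
    n = min(len(x), len(y))
    if n == 0:
        raise ValueError("signals must be non-empty")
    return sum(abs(a - b) for a, b in zip(x[:n], y[:n])) / n


print(dna_walk("AGCT"))
print(signal_distance(dna_walk("AAAA"), dna_walk("CCCC")))
```

A matrix of such pairwise distances could then be passed to a hierarchical clustering routine to obtain a dendrogram.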

Digital library of Brno University of Technology

National Repository of Grey Literature

Topics in Genomic Signal Processing

Author: Jajamovich Guido Hugo
Publication venue: Columbia University Libraries/Information Services
Publication date: 01/01/2012

Genomic information is digital in nature and admits mathematical modeling in order to gain biological knowledge. This dissertation focuses on the development and application of detection and estimation theories for solving problems in genomics by describing biological problems in mathematical terms and proposing a solution in this domain. More specifically, a novel framework for hypothesis testing is presented, in which one must decide among multiple hypotheses, each involving unknown parameters. Within this framework, a test is developed to perform both detection and estimation jointly in an optimal sense. The proposed test is then applied to the problem of detecting and estimating periodicities in DNA sequences. Moreover, the problem of motif discovery in DNA sequences is presented, where a set of sequences is observed and one must determine which sequences contain instances (if any) of an unknown motif and estimate their positions. A statistical description of the problem is used and a sequential Monte Carlo method is applied for the inference. Finally, the phasing of haplotypes for diploid organisms is introduced, where a novel mathematical model is proposed. The haplotypes that are used to reconstruct the observed genotypes of a group of unrelated individuals are detected and the haplotype pair for each individual in the group is estimated. The model translates a biological principle, the maximum parsimony principle, to a sparseness condition.

Columbia University Academic Commons

Genomic applications of statistical signal processing

Author: Zhao Wentao
Publication date: 15/05/2009

Biological phenomena in cells can be explained in terms of the interactions among biological macromolecules, e.g., DNA, RNA and proteins. These interactions can be modeled by genetic regulatory networks (GRNs). This dissertation proposes to reverse engineer GRNs based on heterogeneous biological data sets, including time-series and time-independent gene expression data, Chromatin Immunoprecipitation (ChIP) data, gene sequences and motifs, and other possible sources of knowledge. The objective of this research is to propose novel computational methods that keep pace with fast-evolving biological databases. Signal processing techniques are exploited to develop computationally efficient, accurate and robust algorithms, which deal individually or collectively with various data sets. Methods of power spectral density estimation are discussed to identify genes participating in various biological processes. Information-theoretic methods are applied for non-parametric inference. Bayesian methods are adopted to incorporate several sources with prior knowledge. This work aims to construct an inference system that takes into account different sources of information such that the absence of some components will not interfere with the rest of the system. It has been verified that the proposed algorithms achieve better inference accuracy and higher computational efficiency than other state-of-the-art schemes, e.g. REVEAL, ARACNE, Bayesian Networks and Relevance Networks, in the presence of artificial time series and steady-state microarray measurements. The proposed algorithms are especially appealing when the sample size is small. In addition, they are able to integrate multiple heterogeneous data sources, e.g. ChIP and sequence data, so that a unified GRN can be inferred. The analysis of biological literature and in silico experiments on real data sets for fruit fly, yeast and human have corroborated part of the inferred GRN. The research has also produced a set of potential control targets for designing gene therapy strategies.

Texas A&M Repository

9th IEEE International Workshop on Genomic Signal Processing and Statistics (GENSIPS)

Author: Atwal Gurinder Singh “Mickey”
Dimitrova Nevenka
Vikalo Haris
Yoon Byung-Jun
Publication date: 10/11/2010

Cold Spring Harbor Laboratory Institutional Repository

Wavelet-Based Genomic Signal Processing for Centromere Identification and Hypothesis Generation

Author: DiFazio Stephen Paul
Jacobson Daniel
Joubert Wayne
Macaya-Sanz David
Schmutz Jeremy
Shah Manesh
Sreedasyam Avinash
Tuskan Gerald
Weighill Deborah
Publication venue: The Research Repository @ WVU
Publication date: 01/01/2019

Various ‘omics data types have been generated for Populus trichocarpa, each providing a layer of information which can be represented as a density signal across a chromosome. We make use of genome sequence data, variant data across a population, and methylation data across 10 different tissues, combined with wavelet-based signal processing, to perform a comprehensive analysis of the signature of the centromere in these different data signals, and successfully identify putative centromeric regions in P. trichocarpa from these signals. Furthermore, using SNP (single nucleotide polymorphism) correlations across a natural population of P. trichocarpa, we find evidence for the co-evolution of the centromeric histone CENH3 with the sequence of the newly identified centromeric regions, and identify a new CENH3 candidate in P. trichocarpa.
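The idea of treating a data layer as a density signal across a chromosome and analyzing it with wavelets can be sketched as follows. The fixed-window binning, the single-level Haar transform, and the names `density_signal` and `haar_step` are illustrative assumptions; the abstract does not specify which wavelet or binning the authors used.

```python
import math


def density_signal(positions, chrom_length, window):
    """Count features (e.g. variant positions, 0-based) per fixed-size window."""
    n_bins = math.ceil(chrom_length / window)
    counts = [0] * n_bins
    for p in positions:
        if 0 <= p < chrom_length:
            counts[p // window] += 1
    return counts


def haar_step(signal):
    """One level of the orthonormal Haar transform: pairwise scaled sums
    (approximation) and scaled differences (detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


counts = density_signal([0, 5, 12, 13, 39], chrom_length=40, window=10)
approx, detail = haar_step(counts)
print(counts, approx, detail)
```

Applying `haar_step` repeatedly to the approximation coefficients yields progressively coarser views of the density signal, in which broad chromosomal features become easier to see.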

eScholarship - University of California

The Research Repository @ WVU (West Virginia University)