Bioinformàtica
Research in biology today cannot be understood without computation. Owing mainly to the development of genomic technologies, biology has been transformed in a very short time from a science in which human effort was directed chiefly at obtaining a small amount of data into one that generates a huge volume of data with little or no human intervention. The effort of researchers has consequently shifted from data production to data analysis. Computational methods play an essential role in this shift: in the planning of experiments, in their execution and, above all, in the storage and analysis of their results. These methods make up a new scientific discipline called bioinformatics. In this article we review, from a historical perspective, the foundations of this discipline, which are built around the concept, understood very generically, of alignment and similarity between sequences
Bioinformática ¿una ciencia sin científicos?
The Human Genome Project has catalyzed an unprecedented presence of biological research in the media. This media impact is not gratuitous. Knowledge of the nucleotide sequence of the human genome and of the amino acid sequences of the proteins encoded in it will have, it is said, an extraordinary impact on medicine, agriculture and many industrial processes. It will consequently have economic, social and perhaps even political repercussions. In short, it will profoundly affect our lives, and it is only natural that it arouses our interest
Applications of a quantum random number generator to simulations in condensed matter physics
We study the importance of the quality of random numbers in Monte Carlo simulations of 2D Ising systems. Simulations are carried out at critical temperature to find the dynamic scaling law of the linear relaxation time. Our aim is to show that statistical correlations that appear in large Ising simulations performed with pseudorandom numbers can be corrected using a quantum random number generator (QRNG). To achieve high speeds and large systems, Ising lattices are simulated on a field programmable gate array (FPGA) with an optical QRNG
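As a rough illustration of the kind of simulation described above, the following is a minimal software sketch of a 2D Ising model relaxing at the critical temperature. It assumes single-spin-flip Metropolis dynamics, periodic boundaries and units with J = k_B = 1, none of which the abstract specifies, and it runs on a CPU with Python's pseudorandom generator rather than on an FPGA with an optical QRNG. The function names and the `rng` parameter are illustrative only.

```python
import math
import random

# Exact critical temperature of the 2D square-lattice Ising model (J = k_B = 1).
T_CRITICAL = 2.0 / math.log(1.0 + math.sqrt(2.0))


def metropolis_sweep(spins, temperature, rng):
    """Attempt size*size single-spin flips with periodic boundary conditions."""
    size = len(spins)
    for _ in range(size * size):
        i = rng.randrange(size)
        j = rng.randrange(size)
        neighbours = (
            spins[(i + 1) % size][j] + spins[(i - 1) % size][j]
            + spins[i][(j + 1) % size] + spins[i][(j - 1) % size]
        )
        # Energy change if spin (i, j) is flipped.
        delta_e = 2 * spins[i][j] * neighbours
        if delta_e <= 0 or rng.random() < math.exp(-delta_e / temperature):
            spins[i][j] = -spins[i][j]


def magnetization(spins):
    """Mean spin value of the lattice, between -1 and 1."""
    size = len(spins)
    return sum(sum(row) for row in spins) / (size * size)


def relax(size=16, sweeps=50, seed=0, rng=None):
    """Start fully ordered and record the magnetization after each sweep."""
    rng = rng if rng is not None else random.Random(seed)
    spins = [[1] * size for _ in range(size)]
    trace = [magnetization(spins)]
    for _ in range(sweeps):
        metropolis_sweep(spins, T_CRITICAL, rng)
        trace.append(magnetization(spins))
    return trace
```

The random source is passed in as an object with `randrange` and `random` methods, so a different generator (for example `random.SystemRandom()`, or a wrapper around hardware random numbers) could be swapped in to compare how the relaxation curve depends on the quality of the random numbers.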
In silico meets in vivo
A report of the 6th Georgia Tech-Oak Ridge National Lab International Conference on Bioinformatics 'In silico Biology: Gene Discovery and Systems Genomics', Atlanta, USA, 15-17 November, 2007
Multiple non-collinear TF-map alignments of promoter regions
BACKGROUND: The analysis of the promoter sequences of genes with similar expression patterns is a basic tool for annotating common regulatory elements. Multiple sequence alignments are the basis of most comparative approaches. The characterization of regulatory regions from co-expressed genes at the sequence level, however, often does not yield satisfactory results, because promoter regions of genes sharing similar expression programs frequently show no nucleotide sequence conservation. RESULTS: In a recent approach to circumvent this limitation, we proposed aligning the maps of predicted transcription factors (referred to as TF-maps) instead of the nucleotide sequences of two related promoters, taking into account the label of the corresponding factor and its position in the primary sequence. We have now extended the basic algorithm to permit multiple promoter comparisons using the progressive alignment paradigm. In addition, non-collinear conservation blocks can now be identified in the resulting alignments. We have optimized the parameters of the algorithm on a small but well-characterized collection of human-mouse-chicken-zebrafish orthologous gene promoters. CONCLUSION: Results on this dataset indicate that TF-map alignments are able to detect high-level regulatory conservation in the promoter and 3'UTR gene regions that cannot be detected by typical sequence alignments. Three particular examples are introduced here to illustrate the power of multiple TF-map alignments to characterize conserved regulatory elements in the absence of sequence similarity. We consider that this kind of approach can be extremely useful in the future for annotating potential transcription factor binding sites in sets of co-regulated genes from high-throughput expression experiments
SECISaln, a web-based tool for the creation of structure-based alignments of eukaryotic SECIS elements
Summary: Selenoproteins contain the 21st amino acid, selenocysteine, which is encoded by an in-frame UGA codon, usually read as a stop. In eukaryotes, its co-translational recoding requires the presence of an RNA stem–loop structure, the SECIS element, in the 3' untranslated region (UTR) of selenoprotein mRNAs. Despite little sequence conservation, SECIS elements share the same overall secondary structure. Until recently, the lack of a sufficiently large number of selenoprotein mRNA sequences hampered the identification of other potential sequence conservation. In this work, the web-based tool SECISaln provides for the first time an extensive structure-based sequence alignment of SECIS elements, made possible by the well-defined secondary structure of the SECIS RNA and the increased size of the eukaryotic selenoproteome. We have used SECISaln to improve our knowledge of SECIS secondary structure and to discover novel, conserved nucleotide positions, and we believe it will be a useful tool for the selenoprotein and RNA scientific communities
Mutation patterns of amino acid tandem repeats in the human proteome
BACKGROUND: Amino acid tandem repeats are found in nearly one-fifth of human proteins. Abnormal expansion of these regions is associated with several human disorders. To gain further insight into the mutational mechanisms that operate in this type of sequence, we have analyzed a large number of mutation variants derived from human expressed sequence tags (ESTs). RESULTS: We identified 137 polymorphic variants in 115 different amino acid tandem repeats. Of these, 77 contained amino acid substitutions and 60 contained gaps (expansions or contractions of the repeat unit). The analysis showed that at least about 21% of the repeats might be polymorphic in humans. We compared the mutations found in different types of amino acid repeats and in adjacent regions. Overall, repeats showed a five-fold increase in the number of gap mutations compared to adjacent regions, reflecting the action of slippage within the repetitive structures. Gap and substitution mutations were distributed very differently among amino acid repeat types. Among repeats containing gap variants we identified several disease and candidate disease genes. CONCLUSION: This is the first genome-wide report of the types of mutations occurring in the amino acid repeat component of the human proteome. We show that the mutational dynamics of different amino acid repeat types are very diverse. We provide a list of loci with highly variable repeat structures, some of which may be involved in disease
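The distinction between substitution and gap variants can be sketched in a few lines of code. The example below is a simplified illustration, not the procedure used in the study: it assumes that a repeat is an uninterrupted run of a single amino acid, and the minimum length of 5 and both function names are hypothetical choices made for the example.

```python
import re


def find_single_residue_repeats(protein, min_length=5):
    """Return (residue, start, length) for each run of one repeated amino acid."""
    # A residue followed by at least (min_length - 1) copies of itself.
    pattern = re.compile(r"([A-Z])\1{%d,}" % (min_length - 1))
    return [
        (match.group(1), match.start(), len(match.group(0)))
        for match in pattern.finditer(protein.upper())
    ]


def classify_variant(reference_repeat, variant_repeat):
    """Label a variant of a repeat region as 'identical', 'gap' or 'substitution'."""
    if reference_repeat == variant_repeat:
        return "identical"
    # A length change corresponds to an expansion or contraction of the repeat.
    if len(reference_repeat) != len(variant_repeat):
        return "gap"
    return "substitution"
```

For instance, comparing `"QQQQQQ"` with `"QQQQQQQ"` is labelled a gap (an expansion by one unit), while comparing it with `"QQQHQQ"` is labelled a substitution.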
Comparison of splice sites in mammals and chicken
We have carried out an initial analysis of the dynamics of the recent evolution of splice-site sequences on a large collection of human, rodent (mouse and rat), and chicken introns. Our results indicate that the sequences of splice sites are largely homogeneous within Tetrapoda. We have also found that orthologous splice signals between human and rodents and within rodents are more conserved than unrelated splice sites, but the additional conservation can be explained mostly by background intron conservation. In contrast, additional conservation over background is detectable in orthologous mammalian and chicken splice sites. Our results also indicate that the U2 and U12 intron classes seem to have evolved independently since the split of mammals and birds; we have not been able to find a convincing case of interconversion between these two classes in our collections of orthologous introns. Similarly, we have not found a single case of switching between AT-AC and GT-AG subtypes within U12 introns, suggesting that this event has been a rare occurrence in recent evolutionary times. Switching between GT-AG and the noncanonical GC-AG U2 subtypes, on the contrary, does not appear to be unusual; in particular, T to C mutations appear to be relatively well tolerated in GT-AG introns with very strong donor sites
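The subtype switches mentioned above refer to the dinucleotides at the two ends of an intron. The following minimal sketch labels an intron by its terminal dinucleotides and reports whether two orthologous introns differ in subtype. The function names are hypothetical, and the sketch deliberately does not attempt to assign the U2 or U12 class, which cannot be decided from the terminal dinucleotides alone (U12 introns, as noted above, occur in both AT-AC and GT-AG subtypes).

```python
# Boundary subtypes named in the abstract; anything else is reported as "other".
KNOWN_SUBTYPES = {"GT-AG", "GC-AG", "AT-AC"}


def splice_subtype(intron):
    """Label an intron by its first and last two nucleotides (RNA or DNA input)."""
    seq = intron.upper().replace("U", "T")
    if len(seq) < 4:
        raise ValueError("intron sequence is too short to have two boundary dinucleotides")
    label = seq[:2] + "-" + seq[-2:]
    return label if label in KNOWN_SUBTYPES else "other"


def subtype_switch(intron_a, intron_b):
    """Return the pair of subtypes if two orthologous introns differ, else None."""
    subtype_a = splice_subtype(intron_a)
    subtype_b = splice_subtype(intron_b)
    return None if subtype_a == subtype_b else (subtype_a, subtype_b)
```

A T to C change at the second donor position, as in `subtype_switch("GTAAGTTTCAG", "GCAAGTTTCAG")`, is reported as a switch from GT-AG to GC-AG.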
- …