52 research outputs found
Computational identification of Penaeus monodon microRNA genes and their targets
MicroRNAs (miRNAs) are a distinct class of small non-coding RNAs, ~22 nt long, found in a wide variety of organisms. They play important regulatory roles by silencing gene activities at the post-transcriptional level. In this work, we developed a computational workflow to identify conserved miRNA genes in the 10,536 unique Penaeus monodon expressed sequence tags (ESTs). After removing all simple repeats and coding regions in the ESTs, the workflow uses both the conservation of miRNA sequences and several filters obtained from pre-miRNA secondary structure properties to identify conserved miRNAs. Finally, we discovered six potential conserved miRNA genes: mir-4152, mir-466k, miR-32*, lin-4, mir-1346 and mir-4310.
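The sequence-conservation step of such a workflow can be sketched as below. This is a minimal illustration and not the authors' implementation: the function names, the mismatch threshold and the scanning strategy are assumptions, and the repeat/coding-region removal and the pre-miRNA secondary-structure filters mentioned in the abstract are not modelled.

```python
def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_conserved_hits(est: str, mature: str, max_mismatches: int = 2):
    """Scan an EST for windows resembling a known mature miRNA.

    Returns a list of (position, mismatches) for every window of `est`
    that differs from `mature` in at most `max_mismatches` positions.
    The threshold is a hypothetical parameter, not taken from the paper.
    Sequences are compared in the DNA alphabet (U is written as T).
    """
    est = est.upper().replace("U", "T")
    mature = mature.upper().replace("U", "T")
    k = len(mature)
    hits = []
    for i in range(len(est) - k + 1):
        mismatches = hamming(est[i:i + k], mature)
        if mismatches <= max_mismatches:
            hits.append((i, mismatches))
    return hits
```

In a fuller pipeline, each hit would then be extended to a candidate precursor and passed to structure-based filters before being reported.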
DNPTrapper: an assembly editing tool for finishing and analysis of complex repeat regions
BACKGROUND: Many genome projects are left unfinished due to complex, repeated regions. Finishing is the most time-consuming step in sequencing, and current finishing tools are not designed with particular attention to the repeat problem. RESULTS: We have developed DNPTrapper, a shotgun sequence finishing tool specifically designed to address the problems posed by the presence of repeated regions in the target sequence. The program detects and visualizes single base differences between nearly identical repeat copies, and offers the overview and flexibility needed to rapidly resolve complex regions within a working session. The use of a database allows large amounts of data to be stored and handled, and allows viewing of mammalian-size genomes. The program is available under an Open Source license. CONCLUSION: With DNPTrapper, it is possible to separate repeated regions that previously were considered impossible to resolve, and finishing tasks that previously took days or weeks can be resolved within hours or even minutes.
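The idea of detecting single base differences between nearly identical repeat copies can be illustrated with a small sketch. It is not DNPTrapper's algorithm: the function name, the input representation (pre-aligned reads as equal-length strings) and the `min_support` threshold are assumptions made for illustration. A column is flagged only when at least two different bases are each seen in several reads, which separates likely repeat-copy differences from isolated sequencing errors.

```python
from collections import Counter


def candidate_difference_columns(aligned_reads, min_support=2):
    """Return alignment columns where two or more bases are well supported.

    aligned_reads: list of equal-length strings over A, C, G, T and '-'.
    min_support:   hypothetical minimum number of reads that must carry
                   a base for it to count as a real variant.
    """
    if not aligned_reads:
        return []
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("all aligned reads must have the same length")

    columns = []
    for col in range(length):
        # Count only real bases; gaps and ambiguity codes are ignored.
        counts = Counter(r[col] for r in aligned_reads if r[col] in "ACGT")
        supported = [base for base, n in counts.items() if n >= min_support]
        if len(supported) >= 2:
            columns.append(col)
    return columns
```

For example, with reads `ACGTA`, `ACGTA`, `ACCTA`, `ACCTA` and `ACGTT`, column 2 is flagged (G in three reads, C in two), while the single T in the last column is treated as a probable error.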
Database of Trypanosoma cruzi repeated genes: 20 000 additional gene variants
BACKGROUND: Repeats are present in all genomes, and often have important functions. However, in large genome sequencing projects, many repetitive regions remain uncharacterized. The genome of the protozoan parasite Trypanosoma cruzi consists of more than 50% repeats. These repeats include surface molecule genes and several other gene families. In the T. cruzi genome sequencing project, it was clear that not all copies of repetitive genes were present in the assembly, due to collapse of nearly identical repeats. However, at the time of publication of the T. cruzi genome, it was not clear to what extent this had occurred. RESULTS: We have developed a pipeline to estimate the genomic repeat content, where shotgun reads are aligned to the genomic sequence and the gene copy number is estimated using the average shotgun coverage. This method was applied to the genome of T. cruzi, and copy numbers of all protein coding sequences and pseudogenes were estimated. The 22 640 results were stored in a database available online. 18% of all protein coding sequences and pseudogenes were estimated to exist in 14 or more copies in the T. cruzi CL Brener genome. The average coverage of the annotated protein coding sequences and pseudogenes indicates a total gene copy number, including allelic gene variants, of over 40 000. CONCLUSION: Our results indicate that the number of protein coding sequences and pseudogenes in the T. cruzi genome may be twice the previous estimate. We have constructed a database of the T. cruzi gene repeat data that is available as a resource to the community. The main purpose of the database is to enable biologists interested in repeated, unfinished regions to closely examine and resolve these regions themselves using all available shotgun data, instead of having to rely on annotated consensus sequences that are often erroneous and possibly misleading. Five repetitive genes were studied in more detail, in order to illustrate how the database can be used to analyze and extract information about gene repeats with different characteristics in Trypanosoma cruzi.
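The coverage-based estimate described above can be sketched in a few lines. This is an assumption-laden illustration rather than the published pipeline: the function names are hypothetical, per-base read depths are assumed to be available already, and the copy number is taken as the gene's mean depth divided by the genome-wide average depth, rounded to the nearest integer and never less than one.

```python
def mean_coverage(depths):
    """Average read depth over the positions of one annotated gene."""
    if not depths:
        raise ValueError("depths must not be empty")
    return sum(depths) / len(depths)


def estimate_copy_number(gene_coverage, genome_average_coverage):
    """Estimate how many copies of a gene collapsed onto one assembled copy.

    A gene assembled once but covered at roughly N times the genome-wide
    average depth is assumed to exist in about N copies. The floor of 1
    reflects the assumption that an annotated gene exists at least once.
    """
    if genome_average_coverage <= 0:
        raise ValueError("genome_average_coverage must be positive")
    return max(1, round(gene_coverage / genome_average_coverage))
```

For instance, a gene with per-base depths of 140, 150 and 160 in a genome sequenced to an average depth of 10 would be estimated at 15 copies.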
Recovery of acidified lakes in Finland and subsequent responses of perch and roach populations
Finnish lake and fish-status surveys indicated that 4900 small headwater lakes suffered from acidic deposition and 1600–3200 roach (Rutilus rutilus) and perch (Perca fluviatilis) populations were affected or extinct by the end of the 1980s. Since the late 1980s, successful sulphur emission reductions in Europe have induced a chemical recovery of acidified lakes. This resulted in decreases in sulphate and labile aluminium concentrations and increases in pH and alkalinity during the 1990s. The first signs of recovery in affected perch populations were observed in the early 1990s. New strong year-classes appeared and the population structure returned to normal. Little if any recovery of the affected populations of the more acid-sensitive species, roach, was recorded. This may have been due to water quality conditions that were still critical for successful reproduction of sensitive roach and/or to organic acid episodes in the 2000s, suppressing the recovery of buffering capacity.
Toll-Like Receptor 4 Promoter Polymorphisms: Common TLR4 Variants May Protect against Severe Urinary Tract Infection
DOI: 10.1371/journal.pone.0010734. PLoS ONE 5(5).
CNV-seq, a new method to detect copy number variation using high-throughput sequencing
BACKGROUND: DNA copy number variation (CNV) has been recognized as an important source of genetic variation. Array comparative genomic hybridization (aCGH) is commonly used for CNV detection, but the microarray platform has a number of inherent limitations. RESULTS: Here, we describe a method to detect copy number variation using shotgun sequencing, CNV-seq. The method is based on a robust statistical model that describes the complete analysis procedure and allows the computation of essential confidence values for detection of CNV. Our results show that the number of reads, not the length of the reads, is the key factor determining the resolution of detection. This favors the next-generation sequencing methods that rapidly produce large amounts of short reads. CONCLUSION: Simulation of various sequencing methods with coverage between 0.1× and 8× shows overall specificity between 91.7% and 99.9%, and sensitivity between 72.2% and 96.5%. We also show the results for assessment of CNV between two individual human genomes.
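The core idea of comparing read counts between two samples can be sketched as follows. This is a simplified illustration and not the CNV-seq implementation: the function names, the use of non-overlapping fixed-size windows and the normalisation by total read count are assumptions, and the statistical model and confidence values described in the abstract are omitted.

```python
import math


def window_counts(positions, window_size, genome_length):
    """Count read start positions falling into fixed-size windows."""
    n_windows = math.ceil(genome_length / window_size)
    counts = [0] * n_windows
    for pos in positions:
        if 0 <= pos < genome_length:
            counts[pos // window_size] += 1
    return counts


def log2_ratios(test_positions, ref_positions, window_size, genome_length):
    """Per-window log2 ratio of test to reference read counts.

    Counts are scaled by the total number of reads in each sample so that
    a difference in sequencing depth alone does not look like a CNV.
    Windows with no reads in either sample yield None.
    """
    test = window_counts(test_positions, window_size, genome_length)
    ref = window_counts(ref_positions, window_size, genome_length)
    total_test, total_ref = sum(test), sum(ref)
    if total_test == 0 or total_ref == 0:
        raise ValueError("both samples need at least one read in range")
    scale = total_ref / total_test
    ratios = []
    for t, r in zip(test, ref):
        if t == 0 or r == 0:
            ratios.append(None)
        else:
            ratios.append(math.log2(t / r * scale))
    return ratios
```

Because only read counts per window enter the ratio, the sketch also reflects the abstract's point that the number of reads, rather than their length, drives resolution.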
Establishing bioinformatics research in the Asia Pacific
In 1998, the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation, was set up to champion the advancement of bioinformatics in the Asia Pacific. By 2002, APBioNet was able to gain sufficient critical mass to initiate the first International Conference on Bioinformatics (InCoB), bringing together scientists working in the field of bioinformatics in the region. This year, the InCoB2006 Conference was organized as the 5th annual conference of the Asia-Pacific Bioinformatics Network, on Dec. 18–20, 2006 in New Delhi, India, following a series of successful events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand) and Busan (South Korea). This Introduction provides a brief overview of the peer-reviewed manuscripts accepted for publication in this Supplement. It offers a snapshot of the growing research excellence in bioinformatics in the region as we embark on a trajectory of establishing a solid bioinformatics research culture in the Asia Pacific that is able to contribute fully to the global bioinformatics community.
MicroTar: predicting microRNA targets from RNA duplexes
BACKGROUND: The accurate prediction of a comprehensive set of messenger RNAs (targets) regulated by animal microRNAs (miRNAs) remains an open problem. In particular, the prediction of targets that do not possess evolutionarily conserved complementarity to their miRNA regulators is not adequately addressed by current tools. RESULTS: We have developed MicroTar, an animal miRNA target prediction tool based on miRNA-target complementarity and thermodynamic data. The algorithm uses predicted free energies of unbound mRNA and putative mRNA-miRNA heterodimers, implicitly addressing the accessibility of the mRNA 3' untranslated region. MicroTar does not rely on evolutionary conservation to discern functional targets, and is able to predict both conserved and non-conserved targets. MicroTar source code and predictions are accessible at , where both serial and parallel versions of the program can be downloaded under an open-source licence. CONCLUSION: MicroTar achieves better sensitivity than previously reported predictions when tested on three distinct datasets of experimentally-verified miRNA-target interactions in C. elegans, Drosophila, and mouse