73 research outputs found

    Basement structure of the United Arab Emirates derived from an analysis of regional gravity and aeromagnetic database

    Gravity and aeromagnetic data covering the whole territory of the United Arab Emirates (UAE) have been used to evaluate both shallow and deep geological structures, in particular the depth to basement since it is not imaged by seismic data anywhere within the UAE. Thus, the aim has been to map the basement so that its structure can help to assess its control on the distribution of hydrocarbons within the UAE. Power spectrum analysis reveals gravity and magnetic signatures to have some similarities, in having two main density/susceptibility interfaces widely separated in depth such that regional-residual anomaly separation could effectively be undertaken. The upper density/susceptibility interface occurs at a depth of about 1.5 km while the deeper interface varies in depth throughout the UAE. For gravity, this deeper interface is assumed to be due to the combined effect of lateral changes in density structures within the sediments and in depth of basement while for magnetics it is assumed the sediments have negligible susceptibility and the anomalies unrelated to the volcanic/magmatic bodies result from only changes in depth to basement. The power spectrum analysis over the suspect volcanic/magmatic bodies indicates they occur at ~ 5 km depth. The finite tilt-depth and finite local wavenumber methods were used to estimate depth to source and only depths that agree to within 10% of each other were used to generate the depth to basement map. This depth to basement map, to the west of the UAE-Oman Mountains, varies in depth from 5 km to in excess of 15 km depth and is able to structurally account for the location of the shear structures, seen in the residual magnetic data, and the location of the volcanic/magmatic centres relative to a set of elongate N-S to NE-SW trending basement highs. The majority of oilfields in the UAE are located within these basement highs. 
Therefore, the hydrocarbon distribution in the UAE basin appears to be controlled by the location of the basement ridges.
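
The agreement filter described in the abstract above (keeping only depth estimates from the two methods that agree to within 10%) might be sketched as follows. This is an illustration only: the abstract does not define "agree to within 10%", so the relative difference is assumed here to be the absolute difference divided by the mean of the two estimates, and the function name and sample depths are hypothetical.

```python
def agreed_depths(tilt_depths, wavenumber_depths, tolerance=0.10):
    """Return (index, mean depth) for locations where two estimates agree.

    Assumption: agreement means |a - b| / mean(a, b) <= tolerance.
    Depths are positive numbers in km, one pair per location.
    """
    kept = []
    for i, (a, b) in enumerate(zip(tilt_depths, wavenumber_depths)):
        mean = (a + b) / 2.0
        # Skip non-positive means to avoid dividing by zero.
        if mean > 0 and abs(a - b) / mean <= tolerance:
            kept.append((i, mean))
    return kept


# Hypothetical depth estimates (km) at three locations.
result = agreed_depths([5.0, 10.0, 15.0], [5.2, 12.0, 15.5])
```

With these made-up values, the second location is dropped because its two estimates differ by roughly 18% of their mean.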

    FastBLAST: Homology Relationships for Millions of Proteins

    Background: All-versus-all BLAST, which searches for homologous pairs of sequences in a database of proteins, is used to identify potential orthologs, to find new protein families, and to provide rapid access to these homology relationships. As DNA sequencing accelerates and data sets grow, all-versus-all BLAST has become computationally demanding. Methodology/principal findings: We present FastBLAST, a heuristic replacement for all-versus-all BLAST that relies on alignments of proteins to known families, obtained from tools such as PSI-BLAST and HMMer. FastBLAST avoids most of the work of all-versus-all BLAST by taking advantage of these alignments and by clustering similar sequences. FastBLAST runs in two stages: the first stage identifies additional families and aligns them, and the second stage quickly identifies the homologs of a query sequence, based on the alignments of the families, before generating pairwise alignments. On 6.53 million proteins from the non-redundant Genbank database ("NR"), FastBLAST identifies new families 25 times faster than all-versus-all BLAST. Once the first stage is completed, FastBLAST identifies homologs for the average query in less than 5 seconds (8.6 times faster than BLAST) and gives nearly identical results. For hits above 70 bits, FastBLAST identifies 98% of the top 3,250 hits per query. Conclusions/significance: FastBLAST enables research groups that do not have supercomputers to analyze large protein sequence data sets. FastBLAST is open source software and is available at http://microbesonline.org/fastblast

    Assessing Performance of Orthology Detection Strategies Applied to Eukaryotic Genomes

    Orthology detection is critically important for accurate functional annotation, and has been widely used to facilitate studies on comparative and evolutionary genomics. Although various methods are now available, there has been no comprehensive analysis of performance, due to the lack of a genomic-scale ‘gold standard’ orthology dataset. Even in the absence of such datasets, the comparison of results from alternative methodologies contains useful information, as agreement enhances confidence and disagreement indicates possible errors. Latent Class Analysis (LCA) is a statistical technique that can exploit this information to reasonably infer sensitivities and specificities, and is applied here to evaluate the performance of various orthology detection methods on a eukaryotic dataset. Overall, we observe a trade-off between sensitivity and specificity in orthology detection, with BLAST-based methods characterized by high sensitivity, and tree-based methods by high specificity. Two algorithms exhibit the best overall balance, with both sensitivity and specificity >80%: INPARANOID identifies orthologs across two species while OrthoMCL clusters orthologs from multiple species. Among methods that permit clustering of ortholog groups spanning multiple genomes, the (automated) OrthoMCL algorithm exhibits better within-group consistency with respect to protein function and domain architecture than the (manually curated) KOG database, and the homolog clustering algorithm TribeMCL as well. By way of using LCA, we are also able to comprehensively assess similarities and statistical dependence between various strategies, and evaluate the effects of parameter settings on performance. In summary, we present a comprehensive evaluation of orthology detection on a divergent set of eukaryotic genomes, thus providing insights and guides for method selection, tuning and development for different applications. 
Many biological questions have been addressed by multiple tests yielding binary (yes/no) outcomes but no clear definition of truth, making LCA an attractive approach for computational biology.

    Comparative Genomics of Mycoplasma: Analysis of Conserved Essential Genes and Diversity of the Pan-Genome

    Mycoplasma, the smallest self-replicating organism with a minimal metabolism and little genomic redundancy, is expected to be a close approximation to the minimal set of genes needed to sustain bacterial life. This study employs comparative evolutionary analysis of twenty Mycoplasma genomes to gain an improved understanding of essential genes. By analyzing the core genome of mycoplasmas, we identified the set of conserved essential genes for mycoplasma survival. Further analysis showed that the core genome set has many characteristics in common with experimentally identified essential genes. Several key genes, which are related to DNA replication and repair and can be disrupted in transposon mutagenesis studies, may be critical for bacterial survival, especially over long periods of natural selection. Phylogenomic reconstructions based on 3,355 homologous groups allowed robust estimation of phylogenetic relatedness among mycoplasma strains. To obtain deeper insight into the relative roles of molecular evolution in pathogen adaptation to their hosts, we also analyzed the positive selection pressures on particular sites and lineages. There appears to be an approximate correlation between the divergence of species and the level of positive selection detected in corresponding lineages.

    Transcriptome of Aphanomyces euteiches: New Oomycete Putative Pathogenicity Factors and Metabolic Pathways

    Aphanomyces euteiches is an oomycete pathogen that causes seedling blight and root rot of legumes, such as alfalfa and pea. The genus Aphanomyces is phylogenetically distinct from well-studied oomycetes such as Phytophthora sp., and contains species pathogenic on plants and aquatic animals. To provide the first foray into gene diversity of A. euteiches, two cDNA libraries were constructed using mRNA extracted from mycelium grown in an artificial liquid medium or in contact with plant roots. A unigene set of 7,977 sequences was obtained from 18,864 high-quality expressed sequence tags (ESTs) and characterized for potential functions. Comparisons with oomycete proteomes revealed major differences between the gene content of A. euteiches and those of Phytophthora species, leading to the identification of biosynthetic pathways absent in Phytophthora, of new putative pathogenicity genes and of expansion of gene families encoding extracellular proteins, notably different classes of proteases. Among the genes specific to A. euteiches are members of a new family of extracellular proteins putatively involved in adhesion, containing up to four protein domains similar to fungal cellulose binding domains. Comparison of A. euteiches sequences with proteomes of fully sequenced eukaryotic pathogens, including fungi, apicomplexa and trypanosomatids, allowed the identification of A. euteiches genes with close orthologs in these microorganisms but absent in other oomycetes sequenced so far, notably transporters and non-ribosomal peptide synthetases, and suggests the presence of a defense mechanism against oxidative stress which was initially characterized in the pathogenic trypanosomatids.

    Cryptic Diversity of African Tigerfish (Genus Hydrocynus) Reveals Palaeogeographic Signatures of Linked Neogene Geotectonic Events

    The geobiotic history of landscapes can exhibit controls by tectonics over biotic evolution. This causal relationship positions ecologically specialized species as biotic indicators to decipher details of landscape evolution. Where geochronological resolution is insufficient, phylogeographic statistics that reconstruct spatio-temporal details of evolutionary histories of aquatic species, notably fishes, can reveal key events of drainage evolution. This study evaluates paleo-environmental causes of mitochondrial DNA (mtDNA) based phylogeographic records of tigerfishes, genus Hydrocynus, in order to reconstruct their evolutionary history in relation to landscape evolution across Africa. Strong geographical structuring in a cytochrome b (cyt-b) gene phylogeny confirms the established morphological diversity of Hydrocynus and reveals the existence of five previously unknown lineages, with Hydrocynus tanzaniae sister to a clade comprising three previously unknown lineages (Groups B, C and D) and H. vittatus. The dated phylogeny constrains the principal cladogenic events that have structured Hydrocynus diversity from the late Miocene to the Plio-Pleistocene (ca. 0–16 Ma). Phylogeographic tests reveal that the diversity and distribution of Hydrocynus reflect a complex history of vicariance and dispersals, whereby range expansions in particular species testify to changes to drainage basins. Principal divergence events in Hydrocynus have interfaced closely with evolving drainage systems across tropical Africa. Tigerfish evolution is attributed to dominant control by pulses of geotectonism across the African plate. 
Phylogenetic relationships and divergence estimates among the ten mtDNA lineages illustrate where and when local tectonic events modified Africa's Neogene drainage. Haplotypes shared amongst extant Hydrocynus populations across northern Africa testify to recent dispersals that were facilitated by late Neogene connections across the Nilo-Sahelian drainage. These events in tigerfish evolution concur broadly with available geological evidence and reveal prominent control by the African Rift System, evident in the formative events archived in phylogeographic records of tigerfish.

    Gene-Specific Signatures of Elevated Non-Synonymous Substitution Rates Correlate Poorly across the Plasmodium Genus

    BACKGROUND: Comparative genome analyses of parasites allow large scale investigation of selective pressures shaping their evolution. An acute limitation to such analysis of Plasmodium falciparum is that there is only very partial low-coverage genome sequence of the most closely related species, the chimpanzee parasite P. reichenowi. However, if orthologous genes have been under similar selective pressures throughout the Plasmodium genus then positive selection on the P. falciparum lineage might be predicted to some extent by analysis of other lineages. PRINCIPAL FINDINGS: Here, three independent pairs of closely related species in different sub-generic clades (P. falciparum and P. reichenowi; P. vivax and P. knowlesi; P. yoelii and P. berghei) were compared for a set of 43 candidate ligand genes considered likely to be under positive directional selection and a set of 102 control genes for which there was no selective hypothesis. The ratios of non-synonymous to synonymous substitutions (dN/dS) were significantly elevated in the candidate ligand genes compared to control genes in each of the three clades. However, the rank order correlation of dN/dS ratios for individual candidate genes was very low, less than the correlation for the control genes. SIGNIFICANCE: The inability to predict positive selection on a gene in one lineage by identifying elevated dN/dS ratios in the orthologue within another lineage needs to be noted, as it reflects that adaptive mutations are generally rare events that lead to fixation in individual lineages. Thus it is essential to complete the genome sequences of particular species of phylogenetic importance, such as P. reichenowi
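
The rank-order comparison of dN/dS ratios mentioned in the abstract above can be illustrated with a Spearman correlation. This is a minimal sketch, assuming Spearman's coefficient is the rank-order statistic meant (the abstract does not name one); the function names and the dN/dS values are hypothetical and are not data from the study.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5


# Hypothetical dN/dS ratios for the same four genes in two clades.
clade_a = [0.9, 0.4, 1.3, 0.2]
clade_b = [0.3, 0.8, 0.5, 0.6]
rho = spearman(clade_a, clade_b)
```

A coefficient near zero or below for such paired values would correspond to the situation the abstract describes, where elevated dN/dS in one lineage says little about the orthologue in another.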

    Genome Sequence of Fusobacterium nucleatum Subspecies Polymorphum - a Genetically Tractable Fusobacterium

    Fusobacterium nucleatum is a prominent member of the oral microbiota and is a common cause of human infection. F. nucleatum includes five subspecies: polymorphum, nucleatum, vincentii, fusiforme, and animalis. F. nucleatum subsp. polymorphum ATCC 10953 has been well characterized phenotypically and, in contrast to previously sequenced strains, is amenable to gene transfer. We sequenced and annotated the 2,429,698 bp genome of F. nucleatum subsp. polymorphum ATCC 10953. Plasmid pFN3 from the strain was also sequenced and analyzed. When compared to the other two available fusobacterial genomes (F. nucleatum subsp. nucleatum, and F. nucleatum subsp. vincentii) 627 open reading frames unique to F. nucleatum subsp. polymorphum ATCC 10953 were identified. A large percentage of these mapped within one of 28 regions or islands containing five or more genes. Seventeen percent of the clustered proteins that demonstrated similarity were most similar to proteins from the clostridia, with others being most similar to proteins from other gram-positive organisms such as Bacillus and Streptococcus. A ten kilobase region homologous to the Salmonella typhimurium propanediol utilization locus was identified, as was a prophage and integrated conjugal plasmid. The genome contains five composite ribozyme/transposons, similar to the CdISt IStrons described in Clostridium difficile. IStrons are not present in the other fusobacterial genomes. These findings indicate that F. nucleatum subsp. polymorphum is proficient at horizontal gene transfer and that exchange with the Firmicutes, particularly the Clostridia, is common

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency–Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research