Graph-Embedding Empowered Entity Retrieval
In this research, we improve upon the current state of the art in entity
retrieval by re-ranking the result list using graph embeddings. The paper shows
that graph embeddings are useful for entity-oriented search tasks. We
demonstrate empirically that encoding information from the knowledge graph into
(graph) embeddings yields a larger gain in entity retrieval effectiveness
than using plain word embeddings. We analyze the impact of
the accuracy of the entity linker on the overall retrieval effectiveness. Our
analysis further deploys the cluster hypothesis to explain the observed
advantages of graph embeddings over the more widely used word embeddings, for
user tasks involving ranking entities.
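The re-ranking idea described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' method: it assumes each candidate entity has an initial retrieval score, that an entity linker has already produced the entities mentioned in the query, and that the final score is a linear interpolation of the retrieval score and the mean cosine similarity to the query entities. The names `rerank`, `cosine`, and `weight` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rerank(candidates, query_entities, embeddings, weight=0.5):
    """Re-rank (entity_id, retrieval_score) pairs using embedding similarity.

    Assumption: the interpolation formula and `weight` are illustrative only;
    the abstract does not specify how the scores are combined.
    """
    # Keep only the linked query entities for which an embedding is available.
    query_vecs = [embeddings[e] for e in query_entities if e in embeddings]
    reranked = []
    for entity, score in candidates:
        vec = embeddings.get(entity)
        if vec is None or not query_vecs:
            sim = 0.0  # no embedding evidence: fall back to the retrieval score
        else:
            sim = sum(cosine(vec, q) for q in query_vecs) / len(query_vecs)
        reranked.append((entity, (1 - weight) * score + weight * sim))
    # Highest combined score first.
    return sorted(reranked, key=lambda pair: pair[1], reverse=True)
```

With `weight=0` the original ranking is preserved, so the parameter gives a simple way to check how much the embeddings change the result list.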
Adaptive HIV-1 evolutionary trajectories are constrained by protein stability
Despite the use of combination antiretroviral drugs for the treatment of HIV-1 infection, the emergence of drug resistance remains
a problem. Resistance may be conferred either by a single mutation or a concerted set of mutations. The involvement
of multiple mutations can arise due to interactions between sites in the amino acid sequence as a consequence of the need to
maintain protein structure. To better understand the nature of such epistatic interactions, we reconstructed the ancestral sequences
of HIV-1’s Pol protein, and traced the evolutionary trajectories leading to mutations associated with drug resistance.
Using contemporary and ancestral sequences we modelled the effects of mutations (i.e. amino acid replacements) on protein
structure to understand the functional effects of residue changes. Although the majority of resistance-associated sequences
tend to destabilise the protein structure, we find there is a general tendency for protein stability to decrease across HIV-1’s
evolutionary history. That a similar pattern is observed in the non-drug resistance lineages indicates that non-resistant mutations,
for example those associated with escape from the immune response, also impact protein stability. Maintenance of optimal
protein structure therefore represents a major constraining factor to the evolution of HIV-1.
Modular Biological Function Is Most Effectively Captured by Combining Molecular Interaction Data Types
Large-scale molecular interaction data sets have the potential to provide a comprehensive, system-wide understanding of biological function. Although individual molecules can be promiscuous in terms of their contribution to function, molecular functions emerge from the specific interactions of molecules giving rise to modular organisation. As functions often derive from a range of mechanisms, we demonstrate that they are best studied using networks derived from different sources. Implementing a graph partitioning algorithm, we identify subnetworks in yeast protein-protein interaction (PPI), genetic interaction and gene co-regulation networks. Among these subnetworks we identify cohesive subgraphs that we expect to represent functional modules in the different data types. We demonstrate significant overlap between the subgraphs generated from the different data types and show these overlaps can represent related functions as represented by the Gene Ontology (GO). Next, we investigate the correspondence between our subgraphs and the Gene Ontology. This revealed varying degrees of coverage of the biological process, molecular function and cellular component ontologies, dependent on the data type. For example, subgraphs from the PPI show enrichment for 84%, 58% and 93% of annotated GO terms, respectively. Integrating the interaction data into a combined network increases the coverage of GO. Furthermore, the different annotation types of GO are not predominantly associated with one of the interaction data types. Collectively our results demonstrate that successful capture of functional relationships by network data depends on both the specific biological function being characterised and the type of network data being used. We identify functions that require integrated information to be accurately represented, demonstrating the limitations of individual data types.
Combining interaction subnetworks across data types is therefore essential for fully understanding the complex and emergent nature of biological function. JIM was funded by a Biotechnology and Biological Sciences Research Council (BBSRC) CASE studentship with industry partner Pfizer and RMA by a BBSRC studentship. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Gene Duplication and Environmental Adaptation within Yeast Populations
Population-level differences in the number of copies of genes resulting from gene duplication and loss have recently been recognized as an important source of variation in eukaryotes. However, except for a small number of cases, the phenotypic effects of this variation are unknown. Data from the Saccharomyces Genome Resequencing Project permit the study of duplication in genome sequences from a set of individuals within the same population. These sequences can be correlated with available information on the environments from which these yeast strains were isolated. We find that yeast show an abundance of duplicate genes that are lineage specific, leading to a large degree of variation in gene content between individual strains. There is a detectable bias for specific functions, indicating that selection is acting to preferentially retain certain duplicates. Most strikingly, we find that sets of over- and underrepresented duplicates correlate with the environment from which they were isolated. Together, these observations indicate that gene duplication can give rise to substantial phenotypic differences within populations that in turn can offer a shortcut to evolutionary adaptation. This work was funded by BBSRC grant BB/F007620/1.
Evolution of the Gene Lineage Encoding the Carbon Dioxide Receptor in Insects
A heterodimer of the insect chemoreceptors Gr21a and Gr63a has been shown to be the carbon dioxide receptor in Drosophila melanogaster (Meigen) (Diptera: Drosophilidae). Comparison of the genes encoding these two proteins across the 12 available drosophilid fly genomes allows refined definition of their N-termini. These genes are highly conserved, along with a paralog of Gr21a, in the Anopheles gambiae, Aedes aegypti, and Culex pipiens mosquitoes, as well as in the silk moth Bombyx mori and the red flour beetle Tribolium castaneum. In the latter four species we name these three proteins Gr1, Gr2, and Gr3. Intron evolution within this distinctive three-gene lineage is considerable, with at least 13 inferred gains and 39 losses. Surprisingly, this entire ancient gene lineage is absent from all other available more basal insect and related arthropod genomes, specifically the honey bee, parasitoid wasp, human louse, pea aphid, waterflea, and blacklegged tick genomes. At least two of these species can detect carbon dioxide, suggesting that they evolved other means to do so.
The Universal Plausibility Metric (UPM) & Principle (UPP)
Background: Mere possibility is not an adequate basis for asserting scientific plausibility. A precisely defined universal bound is needed beyond which the assertion of plausibility, particularly in life-origin models, can be considered operationally falsified. But can something so seemingly relative and subjective as plausibility ever be quantified? Amazingly, the answer is, "Yes." A method of objectively measuring the plausibility of any chance hypothesis (The Universal Plausibility Metric [UPM]) is presented. A numerical inequality is also provided whereby any chance hypothesis can be definitively falsified when its UPM metric of ξ is < 1 (The Universal Plausibility Principle [UPP]). Both UPM and UPP pre-exist and are independent of any experimental design and data set. Conclusion: No low-probability hypothetical plausibility assertion should survive peer-review without subjection to the UPP inequality standard of formal falsification (ξ < 1).
The elastic constants of MgSiO3 perovskite at pressures and temperatures of the Earth's mantle
The temperature anomalies in the Earth's mantle associated with thermal
convection1 can be inferred from seismic tomography, provided that the elastic
properties of mantle minerals are known as a function of temperature at mantle
pressures. At present, however, such information is difficult to obtain
directly through laboratory experiments. We have therefore taken advantage of
recent advances in computer technology, and have performed finite-temperature
ab initio molecular dynamics simulations of the elastic properties of MgSiO3
perovskite, the major mineral of the lower mantle, at relevant thermodynamic
conditions. When combined with the results from tomographic images of the
mantle, our results indicate that the lower mantle is either significantly
anelastic or compositionally heterogeneous on large scales. We found the
temperature contrast between the coldest and hottest regions of the mantle, at
a given depth, to be about 800 K at 1000 km, 1500 K at 2000 km, and possibly over
2000 K at the core-mantle boundary. Comment: Published in: Nature 411, 934-937 (2001)
An isolate of human immunodeficiency virus type 1 originally classified as subtype I represents a complex mosaic comprising three different group M subtypes (A, G, and I)
Full-length reference clones and sequences are currently available for eight human immunodeficiency virus type 1 (HIV-1) group M subtypes (A through H), but none have been reported for subtypes I and J, which have only been identified in a few individuals. Phylogenetic information for subtype I, in particular, is limited since only about 400 bp of env gene sequences have been determined for just two epidemiologically linked viruses infecting a couple who were heterosexual intravenous drug users from Cyprus. To characterize subtype I in greater detail, we employed long-range PCR to clone a full-length provirus (94CY032.3) from an isolate obtained from one of the individuals originally reported to be infected with this subtype. Phylogenetic analysis of C2-V3 env gene sequences confirmed that 94CY032.3 was closely related to sequences previously classified as subtype I. However, analysis of the remainder of its genome revealed various regions in which 94CY032.3 was significantly clustered with either subtype A or subtype G. Only sequences located in vpr and nef, as well as the middle portions of pol and env, formed independent lineages roughly equidistant from all other known subtypes. Since these latter regions most likely have a common origin, we classify them all as subtype I. These results thus indicate that the originally reported prototypic subtype I isolate 94CY032 represents a triple recombinant (A/G/I) with at least 11 points of recombination crossover. We also screened HIV-1 recombinants with regions of uncertain subtype assignment for the presence of subtype I sequences. This analysis revealed that two of the earliest mosaics from Africa, Z321B (A/G/?) and MAL (A/D/?), contain short segments of sequence which clustered closely with the subtype I domains of 94CY032.3. Since Z321 was isolated in 1976, subtype I as well as subtypes A and G must have existed in Central Africa prior to that date... (From the author's abstract)