Search CORE

52 research outputs found

Methodology for the inference of gene function from phenotype data

Author: A Gómez
C Keller
CE Moase
CL Smith
David P Hill
DP Hill
F Azuaje
Gene Ontology Consortium
JA Blake
JL Ashurst
Joao A Ascensao
Judith A Blake
KR Degenhardt
M Deng
Mary E Dolan
ME Smoot
MM Dickie
OD King
P Groth
P Kemmeren
P Lambrix
P Manda
R Hoehndorf
S Myhre
TE Oliphant
TR Gruber
Y Tao
Z Jiang
Publication venue: 'Springer Science and Business Media LLC'
Publication date
Field of study

Crossref

Methodology for inferring gene function using phenotype data.

Author: Ascensao Joao A
Publication venue: The Mouseion at the JAXlibrary
Publication date: 01/07/2013
Field of study

The Jackson Laboratory: The Mouseion at the JAXlibrary

Quantifying the local adaptive landscape of a nascent bacterial community.

Author: Ascensao Joao A
Publication venue
Publication date: 10/07/2023
Field of study

Ezid

Methodology for the inference of gene function from phenotype data.

Author: Ascensao Joao A
Blake Judith A
Dolan Mary E
Hill David P
Publication venue: The Mouseion at the JAXlibrary
Publication date: 01/01/2014
Field of study

Background: Biomedical ontologies are increasingly instrumental in the advancement of biological research primarily through their use to efficiently consolidate large amounts of data into structured, accessible sets. However, ontology development and usage can be hampered by the segregation of knowledge by domain that occurs due to independent development and use of the ontologies. The ability to infer data associated with one ontology to data associated with another ontology would prove useful in expanding information content and scope. We here focus on relating two ontologies: the Gene Ontology (GO), which encodes canonical gene function, and the Mammalian Phenotype Ontology (MP), which describes non-canonical phenotypes, using statistical methods to suggest GO functional annotations from existing MP phenotype annotations. This work is in contrast to previous studies that have focused on inferring gene function from phenotype primarily through lexical or semantic similarity measures.

Results: We have designed and tested a set of algorithms that represents a novel methodology to define rules for predicting gene function by examining the emergent structure and relationships between the gene functions and phenotypes rather than inspecting the terms semantically. The algorithms inspect relationships among multiple phenotype terms to deduce if there are cases where they all arise from a single gene function. We apply this methodology to data about genes in the laboratory mouse that are formally represented in the Mouse Genome Informatics (MGI) resource. From the data, 7444 rule instances were generated from five generalized rules, resulting in 4818 unique GO functional predictions for 1796 genes.

Conclusions: We show that our method is capable of inferring high-quality functional annotations from curated phenotype data. As well as creating inferred annotations, our method has the potential to allow for the elucidation of unforeseen, biologically significant associations between gene function and phenotypes that would be overlooked by a semantics-based approach. Future work will include the implementation of the described algorithms for a variety of other model organism databases, taking full advantage of the abundance of available high-quality curated data. BMC Bioinformatics 2014; 15:405
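The abstract does not spell out the five generalized rules, so the following is only a toy sketch of the general idea of deriving a phenotype-to-function rule from co-occurrence rather than from term semantics. It is not the authors' algorithm. The gene names, the `MP:`/`GO:` identifiers, the function names, and the `min_support` and `min_confidence` thresholds are all hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical toy annotations; identifiers are placeholders, not real MGI data.
gene_to_phenotypes = {
    "gene1": {"MP:a", "MP:b"},
    "gene2": {"MP:a", "MP:b"},
    "gene3": {"MP:a", "MP:b"},
    "gene4": {"MP:a"},
    "gene5": {"MP:a", "MP:b"},  # has phenotypes but no GO annotation yet
}
gene_to_functions = {
    "gene1": {"GO:x"},
    "gene2": {"GO:x", "GO:y"},
    "gene3": {"GO:x"},
    "gene4": {"GO:y"},
}


def derive_rules(pheno, func, phenotype_set, min_support=3, min_confidence=0.9):
    """Map each GO term to its confidence among annotated genes that carry
    every phenotype in phenotype_set; keep terms above min_confidence."""
    carriers = [g for g, p in pheno.items() if phenotype_set <= p and g in func]
    if len(carriers) < min_support:
        return {}  # too few genes to trust any rule
    counts = defaultdict(int)
    for gene in carriers:
        for term in func[gene]:
            counts[term] += 1
    return {
        term: n / len(carriers)
        for term, n in counts.items()
        if n / len(carriers) >= min_confidence
    }


def predict(pheno, func, phenotype_set, rules):
    """Suggest rule-implied GO terms for genes that match the phenotype set
    but do not yet carry those terms."""
    predictions = {}
    for gene, phenotypes in pheno.items():
        if phenotype_set <= phenotypes:
            missing = set(rules) - func.get(gene, set())
            if missing:
                predictions[gene] = missing
    return predictions


phenotype_set = {"MP:a", "MP:b"}
rules = derive_rules(gene_to_phenotypes, gene_to_functions, phenotype_set)
predictions = predict(gene_to_phenotypes, gene_to_functions, phenotype_set, rules)
```

In this toy data, the three annotated genes that show both phenotypes all share one function term, so that term becomes a candidate annotation for the unannotated gene with the same phenotype pair.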

Crossref

The Jackson Laboratory: The Mouseion at the JAXlibrary

Springer - Publisher Connector

PubMed Central

Quantifying the local adaptive landscape of a nascent bacterial community

Author: Adam P. Arkin
Benjamin H. Good
Joao A. Ascensao
Kelly M. Wetmore
Oskar Hallatschek
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2023
Field of study

Fitness landscapes largely shape the dynamics of evolution, but it is unclear how they shift upon ecological diversification. By engineering genome-wide knockout libraries of a nascent bacterial community, Ascensao et al. show how ecological and epistatic patterns combine to shape adaptive landscapes.

Directory of Open Access Journals

eScholarship - University of California

A Hidden Markov Model with continuous hidden and observed states (a Kalman filter) for inferring genetic drift and measurement noise from lineage frequency time series.

Author: Erik Volz (5013467)
Joao A. Ascensao (18369210)
Olivia Boyd (8338428)
Oskar Hallatschek (227781)
QinQin Yu (4848709)
Takashi Okada (162457)
Publication venue
Publication date: 15/04/2024
Field of study

(a) Illustration of how genetic drift and measurement noise affect the observed frequency time series. Muller plot of lineage frequencies from Wright-Fisher simulations with effective population size 500 and 5000, with and without measurement noise. In simulations with measurement noise, 100 sequences were sampled per week with the measurement noise overdispersion parameter ct = 5 (parameter defined in text). All simulations were initialized with 50 lineages at equal frequency. A lower effective population size leads to larger frequency fluctuations whose variances add over time, whereas measurement noise leads to increased frequency fluctuations whose variances do not add over time. (b) Schematic of Hidden Markov Model describing frequency trajectories. ft is the true frequency at time t (hidden states) and is the observed frequency at time t (observed states). The inferred parameters are , the effective population size scaled by the generation time, and ct, the overdispersion in measurement noise (ct = 1 corresponds to uniform sampling of sequences from the population). (c-f) Validation of method using Wright-Fisher simulations of frequency trajectories with time-varying effective population size and measurement noise. (c) Simulated number of sequences. (d) Simulated lineage frequency trajectories. (e) Inferred scaled effective population size () on simulated data compared to true values. (f) Inferred measurement noise (ct) on simulated data compared to true values. In (e) the shaded region shows the 95% confidence interval calculated using the posterior, and in (f) the shaded region shows the 95% confidence interval calculated using bootstrapping (see Methods).
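The contrast drawn in panel (a), where drift variance accumulates over time while measurement-noise variance does not, can be illustrated with a small neutral Wright-Fisher simulation. This is a minimal sketch, not the authors' simulation code. It borrows 50 lineages, a population size of 500, 100 sampled sequences and an overdispersion of 5 from the caption. The function names, the replicate count, and the approximation of overdispersion by drawing n/c sequences (which matches the variance c·f(1−f)/n) are assumptions made for illustration.

```python
import random


def wright_fisher(n_lineages, pop_size, generations, rng):
    """Neutral Wright-Fisher model: each generation resamples pop_size
    individuals from the previous generation's lineage frequencies."""
    freqs = [1.0 / n_lineages] * n_lineages
    trajectory = [freqs]
    for _ in range(generations):
        draws = rng.choices(range(n_lineages), weights=freqs, k=pop_size)
        counts = [0] * n_lineages
        for lineage in draws:
            counts[lineage] += 1
        freqs = [c / pop_size for c in counts]
        trajectory.append(freqs)
    return trajectory


def observe(freqs, n_sequences, overdispersion, rng):
    """Noisy observation of the frequencies. Assumption: overdispersion c is
    mimicked by sampling only n_sequences / c sequences uniformly."""
    effective_n = max(1, round(n_sequences / overdispersion))
    draws = rng.choices(range(len(freqs)), weights=freqs, k=effective_n)
    counts = [0] * len(freqs)
    for lineage in draws:
        counts[lineage] += 1
    return [c / effective_n for c in counts]


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def run_replicates(n_reps=400, n_lineages=50, pop_size=500, times=(5, 20),
                   n_sequences=100, overdispersion=5.0, seed=0):
    """Variance across replicates of (i) the drift-induced change in the true
    frequency of lineage 0 and (ii) the observation error, at each time."""
    rng = random.Random(seed)
    drift = {t: [] for t in times}
    noise = {t: [] for t in times}
    for _ in range(n_reps):
        traj = wright_fisher(n_lineages, pop_size, max(times), rng)
        for t in times:
            true_f = traj[t][0]
            obs_f = observe(traj[t], n_sequences, overdispersion, rng)[0]
            drift[t].append(true_f - traj[0][0])
            noise[t].append(obs_f - true_f)
    return ({t: variance(v) for t, v in drift.items()},
            {t: variance(v) for t, v in noise.items()})


drift_var, noise_var = run_replicates()
```

Under these assumptions, `drift_var` should grow roughly in proportion to the number of generations, while `noise_var` should stay at about the same level at both time points.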

FigShare

The inferred effective population size when cutting the tree at different depths to test the effect of combining lineages with other more closely related lineages in forming the coarse-grained lineages.

Author: Erik Volz (5013467)
Joao A. Ascensao (18369210)
Olivia Boyd (8338428)
Oskar Hallatschek (227781)
QinQin Yu (4848709)
Takashi Okada (162457)
Publication venue
Publication date: 15/04/2024
Field of study

The inferred effective population size when cutting the tree at different depths to test the effect of combining lineages with other more closely related lineages in forming the coarse-grained lineages.

FigShare

Inferred effective population size in regions of England.

Author: Erik Volz (5013467)
Joao A. Ascensao (18369210)
Olivia Boyd (8338428)
Oskar Hallatschek (227781)
QinQin Yu (4848709)
Takashi Okada (162457)
Publication venue
Publication date: 15/04/2024
Field of study

(Top panels) Inferred of pre-B.1.177 lineages, B.1.177, Alpha, and Delta for each region of England. The inferred for England as a whole is shown for reference. Shaded regions show 95% confidence intervals (see Methods). (Bottom panels) The ratio between the inferred of England and that of the region for each variant. A horizontal dashed line indicates a ratio of 1 (i.e. is the same in that region of England and England as a whole). Shaded regions show the minimum and maximum possible values of the ratio from the combined error intervals of the numerator and denominator (thus, not corresponding to a specific confidence interval range). (PDF)

FigShare

Inferred scaled effective population size compared to the SIR model scaled population size calculated using the observed number of positive individuals in England (see Methods).

Author: Erik Volz (5013467)
Joao A. Ascensao (18369210)
Olivia Boyd (8338428)
Oskar Hallatschek (227781)
QinQin Yu (4848709)
Takashi Okada (162457)
Publication venue
Publication date: 15/04/2024
Field of study

Inferred scaled effective population size compared to the SIR model scaled population size calculated using the observed number of positive individuals in England (see Methods).

FigShare

Potential mechanisms that can generate a low effective population size.

Author: Erik Volz (5013467)
Joao A. Ascensao (18369210)
Olivia Boyd (8338428)
Oskar Hallatschek (227781)
QinQin Yu (4848709)
Takashi Okada (162457)
Publication venue
Publication date: 15/04/2024
Field of study

(a) Superspreading, where the distribution of the number of secondary cases (Z) from a single infected individual is broadly distributed (variance greater than mean). The superspreading individuals are indicated in blue. (b) Deme structure without superspreading, due to heterogeneity in the host network structure, where the distribution of the number of secondary cases is not broadly distributed (variance approximately equal to mean). (c) The ratio between the (the scaled population size calculated from an SIR model using the number of observed positive individuals and the observed effective reproduction number) and the inferred for each variant. Only data where the error in the SIR model is less than 3 times the value are shown, because larger error bars make it challenging to interpret the results. The inferred is lower than the (which assumes well-mixed dynamics and no superspreading) by a factor of 16 to 589, indicating high levels of genetic drift. The variance in offspring number from the literature does not entirely explain the discrepancy between the true and effective population sizes. (d) Simulations of deme structure without superspreading can generate high levels of genetic drift via jackpot events. SEIR dynamics are simulated within demes (with Rt = 10, i.e. deterministic transmission) and Poisson transmission is simulated between demes (Rt ≪ 1, i.e. stochastic transmission) such that the population Rt ∼ 1 (see Methods). Simulation parameters are: mean transition rate from exposed to infected γE = (2.5 days)^−1, mean transition rate from infected to recovered γI = (6.5 days)^−1, total number of demes Dtotal = 5.6 × 10^5. The ratio between the number of infected individuals and the inferred effective population size is found to scale linearly with the deme size and not with the number of infected demes. 
This scaling results because of jackpot events where a lineage that happens to infect a susceptible deme grows rapidly until all susceptible individuals in the deme are infected.
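The distinction in panels (a) and (b) between a broadly distributed number of secondary cases (variance greater than mean) and a narrow one (variance approximately equal to mean) can be made concrete with a short sampling sketch. It assumes, purely for illustration, that superspreading is modelled as a gamma-Poisson mixture (a negative binomial) with mean 1 and a hypothetical dispersion parameter of 0.1, compared against a plain Poisson with the same mean. Neither the parameter values nor the function names come from the source.

```python
import math
import random


def poisson(lam, rng):
    """Poisson sample via Knuth's multiplication algorithm (fine for modest lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def secondary_cases_poisson(mean, rng):
    """No superspreading: every individual has the same expected offspring."""
    return poisson(mean, rng)


def secondary_cases_superspreading(mean, dispersion, rng):
    """Assumed superspreading model: individual infectiousness is gamma
    distributed (shape = dispersion, scale = mean / dispersion), giving a
    negative binomial with variance mean + mean**2 / dispersion."""
    infectiousness = rng.gammavariate(dispersion, mean / dispersion)
    return poisson(infectiousness, rng)


def variance_to_mean(samples):
    mean = sum(samples) / len(samples)
    var = sum((s - mean) ** 2 for s in samples) / (len(samples) - 1)
    return var / mean


rng = random.Random(1)
n_individuals = 20000
plain = [secondary_cases_poisson(1.0, rng) for _ in range(n_individuals)]
spread = [secondary_cases_superspreading(1.0, 0.1, rng) for _ in range(n_individuals)]

poisson_ratio = variance_to_mean(plain)        # expected to be close to 1
superspreading_ratio = variance_to_mean(spread)  # expected to be well above 1
```

With these assumed parameters the theoretical variance-to-mean ratio is 1 for the Poisson case and 11 for the gamma-Poisson case, so the second sample should show a much broader offspring distribution at the same mean.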

FigShare