
    Bayesian Clustering of Transcription Factor Binding Motifs

    Genes are often regulated in living cells by proteins called transcription factors that bind directly to short segments of DNA in close proximity to specific genes. These binding sites have a conserved nucleotide appearance, which is called a motif. Several recent studies of transcriptional regulation require the reduction of a large collection of motifs into clusters based on the similarity of their nucleotide composition. We present a principled approach to this clustering problem based on a Bayesian hierarchical model that accounts for both within- and between-motif variability. We use a Dirichlet process prior distribution that allows the number of clusters to vary, and we also present a novel generalization that allows the core width of each motif to vary. This clustering model is implemented, using a Gibbs sampling strategy, on several collections of transcription factor motif matrices. Our stochastic implementation allows us to examine the variability of our results in addition to focusing on a set of best clusters. Our results identify motif clusters suggesting that several transcription factor protein families are actually mixtures of smaller groups of highly similar motifs, which provide substantially more refined information than the full set of motifs in the family. Our clusters provide a means of organizing transcription factors based on binding motif similarities and can be used to reduce motif redundancy within large databases such as JASPAR and TRANSFAC, which aids the use of these databases for further motif discovery. Finally, our clustering procedure has been used in combination with discovery of evolutionarily conserved motifs to predict co-regulated genes. An alternative to our Dirichlet process prior distribution is presented that differs substantially in terms of a priori clustering characteristics, but shows no substantive difference in the clustering results for our dataset.
Despite our specific application to transcription factor binding motifs, our Bayesian clustering model based on the Dirichlet process has several advantages over traditional clustering methods that could make our procedure appropriate and useful for many clustering applications.
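
To illustrate how a Dirichlet process prior lets the number of clusters vary, the sketch below draws a partition from the Chinese restaurant process, the standard partition representation of that prior. It covers the prior only, not the authors' hierarchical motif model or their Gibbs sampler; the function name `sample_crp_partition` and the concentration parameter `alpha` are assumptions introduced for this example.

```python
import random


def sample_crp_partition(n_items, alpha, seed=0):
    """Draw cluster labels for n_items from a Chinese restaurant process.

    alpha (> 0) is a hypothetical concentration parameter: larger values
    make it more likely that an item opens a new cluster.
    """
    rng = random.Random(seed)
    labels = []
    counts = []  # counts[k] = number of items currently in cluster k
    for _ in range(n_items):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is chosen with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1    # join an existing cluster
        labels.append(k)
    return labels
```

Calling `sample_crp_partition(50, 1.0)` returns 50 labels; the number of distinct labels is not fixed in advance, which is the property the abstract relies on.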

    Enhancing Usability of Information Extraction Results with Textual Data Profiling

    PACLIC 19 / Taipei, Taiwan / December 1-3, 200

    CsrA impacts survival of Yersinia enterocolitica by affecting a myriad of physiological activities.

    Background: A previous study identified a Yersinia enterocolitica transposon mutant, GY448, that was unable to export the flagellar type three secretion system (T3SS)-dependent phospholipase, YplA. This strain was also deficient for motility and unable to form colonies on Luria-Bertani agar medium. Preliminary analysis suggested it carried a mutation in csrA. CsrA in Escherichia coli is an RNA-binding protein that is involved in specific post-transcriptional regulation of a myriad of physiological activities. This study investigated how CsrA affects expression of the flagellar regulatory cascade that controls YplA export and motility. It also explored the effect of csrA mutation on Y. enterocolitica in response to conditions that cue physiological changes important for growth in environments found both in nature and the laboratory. Results: The precise location of the transposon insertion in GMY448 was mapped within csrA. Genetic complementation restored disruptions in motility and the YplA export phenotype (Yex), which confirmed this mutation disrupted CsrA function. Mutation of csrA affected expression of yplA and flagellar genes involved in flagellar T3SS-dependent export and motility by altering expression of the master regulators flhDC. Mutation of csrA also resulted in increased sensitivity of Y. enterocolitica to various osmolytes, temperatures and antibiotics. Conclusions: The results of this study reveal unique aspects of how CsrA functions in Y. enterocolitica to control its physiology. This provides perspective on how the Csr system is susceptible to adaptation to particular environments and bacterial lifestyles.

    Computational Discovery of Gene Regulatory Binding Motifs: A Bayesian Perspective

    The Bayesian approach together with Markov chain Monte Carlo techniques has provided an attractive solution to many important bioinformatics problems such as multiple sequence alignment, microarray analysis and the discovery of gene regulatory binding motifs. The employment of such methods and, more broadly, explicit statistical modeling, has revolutionized the field of computational biology. After reviewing several heuristics-based computational methods, this article presents a systematic account of Bayesian formulations and solutions to the motif discovery problem. Generalizations are made to further enhance the Bayesian approach. Motivated by the need for a fast algorithm, we also provide a perspective of the problem from the viewpoint of optimizing a scoring function. We observe that scoring functions resulting from proper posterior distributions, or approximations to such distributions, show the best performance and can be used to improve upon existing motif-finding programs. Simulation analyses and a real-data example are used to support our observation.

    USING LIVE STUDENT PEER ASSESSMENT WITH AUTOMATED INSTANT FEEDBACK

    Peer assessment and feedback enable students to develop objectivity in relation to standards, which can then be transferred to their own work (Liu & Carless, 2006). However, providing feedback, particularly in large classes, can be labour intensive (e.g. collating scores and comments). As such, it can be challenging to provide effective feedback in a timely manner, even though timely feedback has been shown to promote retention and the correction of inaccurate responses (Epstein et al., 2002). We have recently utilised the online student data and engagement system (SRES, Liu et al., 2017) to run peer assessments of student oral presentations within our undergraduate chemistry laboratories. Students are able to grade their peers’ presentations in real time via mobile devices, and these grades are captured by SRES alongside the academics' grading. The system automatically collates both student and academic scores and immediately posts the grade and feedback to the Learning Management System (LMS) of the presenting student(s). Students have immediate access to this feedback to construct self-reflections or to discuss their performance with their teacher whilst the experience is still “fresh”. We will discuss its implementation and how it addresses topics such as mitigating academic misconduct, improving student engagement and reducing the academic burden in running these assessments.
    REFERENCES
    Epstein, M. L., Lazarus, A. D., Calvano, T. B. et al. (2002). Immediate Feedback Assessment Technique Promotes Learning and Corrects Inaccurate first Responses. The Psychological Record, 52, 187–201.
    Liu, D. Y. T., Bartimote-Aufflick, K., Pardo, A., & Bridgeman, A. J. (2017). Data-Driven Personalization of Student Learning Support in Higher Education. In A. Peña-Ayala (Ed.), Learning Analytics: Fundaments, Applications, and Trends. Springer International Publishing.
    Liu, N. & Carless, D. (2006). Peer feedback: the learning element of peer assessment. Teaching in Higher Education, 11(3), 279–290.

    Avian surface reconstruction in free-flight with application to flight stability analysis of a barn owl and peregrine falcon

    Birds primarily create and control the forces necessary for flight through changing the shape and orientation of their wings and tail. Their wing geometry is characterised by complex variation in parameters such as camber, twist, sweep and dihedral. To characterise this complexity, a multi-stereo photogrammetry setup was developed for accurately measuring surface geometry in high-resolution during free-flight. The natural patterning of the birds was used as the basis for phase correlation-based image matching, allowing indoor or outdoor use while being non-intrusive for the birds. The accuracy of the method was quantified and shown to be sufficient for characterising the geometric parameters of interest, but with a reduction in accuracy close to the wing edge and in some localized regions. To demonstrate the method's utility, surface reconstructions are presented for a barn owl (Tyto alba) and peregrine falcon (Falco peregrinus) during three instants of gliding flight per bird. The barn owl flew with a consistent geometry, with positive wing camber and longitudinal anhedral. Based on flight dynamics theory this suggests it was longitudinally statically unstable during these flights. The peregrine flew with a consistent glide angle, but at a range of airspeeds with varying geometry. Unlike the barn owl, its glide configuration did not provide a clear indication of longitudinal static stability/instability. 
Aspects of the geometries adopted by both birds appeared to be related to control corrections, and this method would be well suited for future investigations in this area, as well as for other quantitative studies into avian flight dynamics.
    Supplementary data: original uncompressed tif images for barn owl flights O1, O2 and O3 (O1_images.zip, O2_images.zip, O3_images.zip) and for peregrine flights P1, P2 and P3 (P1_images.zip, P2_images.zip, P3_images.zip); READM

    Bayesian Hyperbolic Multidimensional Scaling

    Multidimensional scaling (MDS) is a widely used approach to representing high-dimensional, dependent data. MDS works by assigning each observation a location on a low-dimensional geometric manifold, with distance on the manifold representing similarity. We propose a Bayesian approach to multidimensional scaling when the low-dimensional manifold is hyperbolic. Using hyperbolic space facilitates representing tree-like structures common in many settings (e.g. text or genetic data with hierarchical structure). A Bayesian approach provides regularization that minimizes the impact of measurement error in the observed data and assesses uncertainty. We also propose a case-control likelihood approximation that allows for efficient sampling from the posterior distribution in larger data settings, reducing computational complexity from approximately O(n^2) to O(n). We evaluate the proposed method against state-of-the-art alternatives using simulations, canonical reference datasets, Indian village network data, and human gene expression data.
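
As a concrete picture of "distance on the manifold" in the hyperbolic case, the sketch below computes geodesic distance in the Poincaré ball, one common model of hyperbolic space with curvature -1. This is an assumption made for illustration: the abstract does not say which model or curvature the authors use, and the function name `poincare_distance` is hypothetical. The case-control likelihood approximation is not shown.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points inside the unit Poincare ball.

    u and v are equal-length coordinate sequences with Euclidean norm < 1.
    """
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # d(u, v) = arcosh(1 + 2 |u - v|^2 / ((1 - |u|^2) (1 - |v|^2)))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))
```

Distances grow rapidly as points approach the boundary of the ball, which is what gives hyperbolic space room for tree-like structure. Evaluating this for every pair of n observations costs O(n^2) distance computations, the baseline the abstract's approximation is meant to reduce.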