19 research outputs found

    Study of markers for regulatory elements in human genome

    Most genetic traits and diseases in humans, from height to cancer or sudden cardiac death, do not follow Mendelian principles but originate from complex combinatorial effects of multiple genes, possibly with multiple variants. Most of these variants lie within non-coding regions of the genome such as promoters, enhancers, or insulators, which regulate the expression levels of genes. Numerous algorithms predict the likely location of these regulatory regions using biological features such as conservation, transcription factor binding, deoxyribonuclease I (DNaseI) hypersensitivity, and others. The first part of the thesis presents software to compile such annotations and visualize them in a customizable manner. The second part discusses the distribution of one of these features, DNaseI sensitivity, across the human genome. In the first part, we developed the software and used it to study the NOS1AP (NO-synthase adapter protein) gene locus and the beta-globin gene locus. Since single nucleotide polymorphisms (SNPs) at the NOS1AP locus are known to affect the electrocardiographic QT interval, we collected the corresponding data from a genome-wide association study. We plotted the genetic effect and frequency of these SNPs across the length of the NOS1AP locus, along with genes and other functional annotations from various public databases including RefSeq, the University of California Santa Cruz (UCSC) Genome Browser, TRANSFAC, and the Encyclopedia of DNA Elements (ENCODE) project. We also added SNPs from the 1000 Genomes project to increase the number of variants available for analysis. We observed a lack of known annotations at almost all variants, which suggested the following possibility: although particular regions of the human genome may not be significant enough to be designated as regulatory regions, they may still contain weak sites that affect overall gene expression.
This motivated the study of the distribution of DNaseI sensitivity across the human genome, which forms the second part of the thesis. In the second part, we modeled DNaseI sensitivity, a marker for chromatin accessibility and regulatory elements, using data collected by the University of Washington (UW) as part of the ENCODE project. We used a Gamma-weighted Poisson distribution as our model and an ordinary Poisson distribution as noise. Maximum-likelihood fitting over the entire genome as well as over individual chromosomes, across different cell lines, indicated that most of the human genome is inactive and that the remainder generally has very low DNaseI sensitivity. Only a very small fraction of the genome (<1%) is DNaseI hypersensitive.
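
As a rough illustration of the modeling idea in this abstract, the sketch below writes out the log-likelihood of a Gamma-weighted Poisson signal combined with a Poisson noise term. The abstract does not say how the two parts are combined, so the two-component mixture with a `weight` parameter is an assumption, as are all function and parameter names; this is not the thesis's actual implementation. It relies on the standard result that a Poisson rate drawn from a Gamma(shape, scale) distribution gives a negative binomial count distribution.

```python
import math


def poisson_logpmf(k, mu):
    """Log-probability of count k under a Poisson with mean mu."""
    return k * math.log(mu) - mu - math.lgamma(k + 1)


def gamma_poisson_logpmf(k, shape, scale):
    """Log-probability of count k when the Poisson rate is Gamma(shape, scale).

    Integrating out the rate gives the negative binomial distribution.
    """
    return (math.lgamma(k + shape) - math.lgamma(k + 1) - math.lgamma(shape)
            + k * math.log(scale / (1.0 + scale))
            - shape * math.log(1.0 + scale))


def mixture_loglik(counts, weight, shape, scale, noise_mu):
    """Log-likelihood of counts under an ASSUMED two-component mixture:
    weight * Gamma-Poisson (signal) + (1 - weight) * Poisson (noise)."""
    total = 0.0
    for k in counts:
        a = math.log(weight) + gamma_poisson_logpmf(k, shape, scale)
        b = math.log(1.0 - weight) + poisson_logpmf(k, noise_mu)
        m = max(a, b)  # log-sum-exp for numerical stability
        total += m + math.log(math.exp(a - m) + math.exp(b - m))
    return total
```

A maximum-likelihood fit would then search for the parameter values that maximize `mixture_loglik` on the observed per-window counts, for example with a general-purpose numerical optimizer.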

    Do text-free diffusion models learn discriminative visual representations?

    While many unsupervised learning models focus on one family of tasks, either generative or discriminative, we explore the possibility of a unified representation learner: a model that addresses both families of tasks simultaneously. We identify diffusion models, a state-of-the-art method for generative tasks, as a prime candidate. Such models involve training a U-Net to iteratively predict and remove noise, and the resulting model can synthesize high-fidelity, diverse, novel images. We find that the intermediate feature maps of the U-Net are diverse, discriminative feature representations. We propose a novel attention mechanism for pooling feature maps and further leverage this mechanism as DifFormer, a transformer-based fusion of features from different diffusion U-Net blocks and noise steps. We also develop DifFeed, a novel feedback mechanism tailored to diffusion. We find that diffusion models are better than GANs and, with our fusion and feedback mechanisms, can compete with state-of-the-art unsupervised image representation learning methods on discriminative tasks: image classification with full and semi-supervision, transfer for fine-grained classification, object detection and segmentation, and semantic segmentation. Our project website (https://mgwillia.github.io/diffssl/) and code (https://github.com/soumik-kanad/diffssl) are publicly available. Comment: The first two authors contributed equally. 15 pages, 9 figures, 15 tables. Submission under review. (This article supersedes arXiv:2307.08702.)

    PDBalert: automatic, recurrent remote homology tracking and protein structure prediction

    Background: In recent years, methods for remote homology detection have become increasingly sensitive and reliable. Automatic structure prediction servers relying on these methods can generate useful 3D models even below 20% sequence identity between the protein of interest and the known structure (template). When no homologs can be found in the protein structure database (PDB), the user must rerun the same search at regular intervals in order to make timely use of a template once it becomes available. Results: PDBalert is a web-based automatic system that sends an email alert as soon as a structure with homology to a protein in the user's watch list is released to the PDB or appears among the sequences on hold. The email contains links to the search results and to an automatically generated 3D homology model. The sequence search is performed with the same software as used by the very sensitive and reliable remote homology detection server HHpred, which is based on pairwise comparison of hidden Markov models. Conclusion: PDBalert will accelerate the information flow from the PDB to all those who can profit from newly released protein structures for predicting the 3D structure or function of their proteins of interest.

    Percutaneous vertebroplasty: An experience of 31 procedures

    A prospective study of 31 percutaneous vertebroplasty (PVP) procedures in 22 patients treated between January 2000 and December 2001 is presented. PVP was performed using polymethylmethacrylate (PMMA) to treat vertebral collapse due to osteoporosis and vertebral metastasis, with the aim of achieving analgesia and spinal stabilization. We analyze the efficacy of the procedure and its complications. PVP is a safe and effective daycare procedure. It can be performed under local anesthesia and has minimal and manageable complications.

    Analysis of Whole-Brain Resting-State MRI Using Multi-Label Deformable Offset Networks and Segmentations Based Attention with Explorations into the Ethical Implications of Artificial Intelligence in Clinical Psychiatry Settings and Care

    Gemstone Team MIND. Due to the poor understanding of the underlying biological mechanisms of psychiatric disorders, diagnoses rely upon symptomatic criteria and clinicians’ discretion. Reviews of these criteria have revealed issues of heterogeneity, over- and under-specificity, and symptom overlap between disorders. Deep learning provides a method to produce quantifiable diagnostic labels based upon biological markers such as specific features of brain anatomy or functionality. In practice, these methods fail to indicate how a particular result was determined, raising major obstacles for clinical implementation. To improve the efficiency and interpretability of existing deep networks, we have developed a novel atlas-based attention module to more easily capture global information across different areas of brain function. Our model can be extended to symptom-level classification using NIMH data to give clinicians usable information beyond broad disorder classification. We have compared our model against leading 3D deep learning frameworks and have shown that our novel atlas-based attention module achieves 88% F1 and 91% accuracy on the UCLA Consortium for Neuropsychiatric Phenomics dataset. We have equipped our model with elements such as deformable convolutions, gradient activation visualizations, and occlusion testing to show model attention and function. In addition to the lack of explainability, the ethical issues surrounding clinical implementation of artificial intelligence must be addressed before usage can become a reality. We identified a series of regulatory recommendations to address pertinent ethical concerns of equity and bias during both model development and clinical usage. We propose a standardized protocol for developing a clinical reference standard, the development of diversity reports regarding data used by models, and regulation of usage scenarios to reduce contextual bias.

    Classification results with varied parameters.

    A) The KNN classifiers were tested by varying the number of neighbors, k, from 1 to 7. The plot shows the average accuracy for each k; k = 1 and k = 2 gave the best performance. B) PCA-LDA classification results with a varied number of eigenvectors. Our PCA-LDA classifiers were tested with dimensionality reduction to between one and seven eigenvectors. The plot shows that accuracy is highest when using six eigenvectors.
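
The kind of parameter sweep described in panel A can be sketched as follows. This is only an illustration under stated assumptions: the caption does not say how accuracy was averaged, so leave-one-out evaluation is assumed here, and the toy data set and the names `knn_predict`, `loo_accuracy`, and `TOY_DATA` are hypothetical and unrelated to the study's measurements.

```python
from collections import Counter


def knn_predict(train, query, k):
    """Predict a label by majority vote among the k nearest training points.

    train is a list of (features, label) pairs; distance is squared Euclidean.
    On a tied vote, the label seen first among the nearest neighbors wins.
    """
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def loo_accuracy(data, k):
    """Leave-one-out accuracy (an ASSUMED evaluation scheme) for a given k."""
    hits = 0
    for i, (features, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]  # hold out the i-th sample
        hits += knn_predict(rest, features, k) == label
    return hits / len(data)


# Hypothetical toy data: two well-separated clusters of four points each.
TOY_DATA = [
    ((0, 0), "A"), ((0, 1), "A"), ((1, 0), "A"), ((1, 1), "A"),
    ((5, 5), "B"), ((5, 6), "B"), ((6, 5), "B"), ((6, 6), "B"),
]

if __name__ == "__main__":
    for k in range(1, 8):  # k from 1 to 7, as in the caption
        print(k, loo_accuracy(TOY_DATA, k))
```

On small data sets a large k can let the opposite class outvote the true neighbors, which is one reason a sweep over k is informative.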

    Accuracy of different classifiers under different conditions.

    The horizontal axis shows the Na+, K+, and Mg2+ concentrations used to generate the predicted curves. The vertical axis shows accuracy in percent. Each curve, identified in the legend, shows the performance of a different classifier.