
    Image Segmentation by Fuzzy C-Means Clustering Algorithm with a Novel Penalty Term

    To overcome the noise sensitivity of the conventional fuzzy c-means (FCM) clustering algorithm, this paper presents a novel extended FCM algorithm for image segmentation. The algorithm modifies the objective function of the standard FCM algorithm with a penalty term that accounts for the influence of neighboring pixels on the centre pixel. The penalty term acts as a regularizer; it is inspired by the neighborhood expectation maximization algorithm and is modified to satisfy the criterion of the FCM algorithm. The performance of the algorithm is discussed and compared with that of many derivatives of the FCM algorithm. Experimental results on the segmentation of synthetic and real images demonstrate that the proposed algorithm is effective and robust.
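
The abstract above does not give the form of the penalty term, so the following is only a minimal sketch of the standard FCM baseline that the paper modifies, applied to scalar pixel intensities. The function name `fcm_1d`, the sample intensities, and the initial centers are hypothetical and chosen for illustration; no spatial penalty is included.

```python
def fcm_1d(values, centers, m=2.0, iters=50, eps=1e-12):
    """Standard fuzzy c-means on scalar intensities (no neighborhood penalty)."""
    centers = list(centers)
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iters):
        # Membership update: each row sums to 1 and favors the nearest center.
        memberships = []
        for x in values:
            dists = [abs(x - c) + eps for c in centers]
            row = [1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
                   for d_i in dists]
            memberships.append(row)
        # Center update: mean of the values weighted by membership ** m.
        for i in range(len(centers)):
            weights = [row[i] ** m for row in memberships]
            centers[i] = sum(w * x for w, x in zip(weights, values)) / sum(weights)
    return centers, memberships


# Hypothetical intensities: four dark pixels followed by four bright pixels.
pixels = [0.05, 0.10, 0.12, 0.08, 0.90, 0.95, 0.88, 0.93]
centers, memberships = fcm_1d(pixels, centers=[0.0, 1.0])

# Hard labels: index of the cluster with the highest membership.
labels = [max(range(len(row)), key=row.__getitem__) for row in memberships]
```

Because each pixel is clustered from its own intensity alone, a single noisy pixel can be assigned to the wrong cluster; a neighborhood-based penalty such as the one the abstract describes is aimed at that weakness.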

    Sequencing genes in silico using single nucleotide polymorphisms

    Background: The advent of high throughput sequencing technology has enabled the 1000 Genomes Project Pilot 3 to generate complete sequence data for more than 906 genes and 8,140 exons representing 697 subjects. The 1000 Genomes database provides a critical opportunity for further interpreting disease associations with single nucleotide polymorphisms (SNPs) discovered from genetic association studies. Currently, direct sequencing of candidate genes or regions on a large number of subjects remains both cost- and time-prohibitive.
    Results: To accelerate the translation from discovery to functional studies, we propose an in silico gene sequencing method (ISS), which predicts phased sequences of intragenic regions, using SNPs. The key underlying idea of our method is to infer diploid sequences (a pair of phased sequences/alleles) at every functional locus utilizing the deep sequencing data from the 1000 Genomes Project and SNP data from the HapMap Project, and to build prediction models using flanking SNPs. Using this method, we have developed a database of prediction models for 611 known genes. Sequence prediction accuracy for these genes is 96.26% on average (ranges 79%-100%). This database of prediction models can be enhanced and scaled up to include new genes as the 1000 Genomes Project sequences additional genes on additional individuals. Applying our predictive model for the KCNJ11 gene to the Wellcome Trust Case Control Consortium (WTCCC) Type 2 diabetes cohort, we demonstrate how the prediction of phased sequences inferred from GWAS SNP genotype data can be used to facilitate interpretation and identify a probable functional mechanism such as protein changes.
    Conclusions: Prior to the general availability of routine sequencing of all subjects, the ISS method proposed here provides a time- and cost-effective approach to broadening the characterization of disease associated SNPs and regions, and facilitating the prioritization of candidate genes for more detailed functional and mechanistic studies.

    Long Noncoding RNA Can Be a Probable Mechanism and a Novel Target for Diagnosis and Therapy in Fragile X Syndrome

    Fragile X syndrome (FXS) is the most common congenital hereditary cause of intellectual disability after Down syndrome. Its main pathogenic gene is the fragile X mental retardation 1 (FMR1) gene, which is associated with intellectual disability, autism, fragile X-related primary ovarian insufficiency (FXPOI), and fragile X-associated tremor/ataxia syndrome (FXTAS). Silencing of FMR1 gene transcription leads to the absence of fragile X mental retardation protein (FMRP). How to relieve or cure disorders associated with FXS has become a pressing clinical problem. Recent studies have shown that long noncoding RNAs (lncRNAs) contribute to the pathogenesis. Several lncRNAs originating from the FMR1 gene locus, including FMR4, FMR5, and FMR6, have been found to contribute to the development of FXPOI/FXTAS. FMR4 is a product of RNA polymerase II and can regulate the expression of relevant genes during differentiation of human neural precursor cells. FMR5 is a sense-oriented transcript, while FMR6 is an antisense lncRNA transcribed from the 3′ UTR of FMR1. FMR6 is likely to contribute to the development of FXPOI, and it overlaps exons 15–17 of FMR1 as well as two microRNA binding sites. Additionally, BC1 can bind FMRP to form an inhibitory complex, and the lncRNA TUG1 can control axonal development by directly interacting with FMRP through modulating the SnoN–Ccd1 pathway. These lncRNAs therefore provide pharmaceutical targets and novel biomarkers. This review will: (1) describe the clinical manifestations and traditional pathogenesis of FXS and FXTAS/FXPOI; (2) summarize what is known about the role of lncRNAs in the pathogenesis of FXS and FXTAS/FXPOI; and (3) provide an outlook on the potential effects and future directions of lncRNAs in FXS and FXTAS/FXPOI research.

    Embedding-based Retrieval in Facebook Search

    Search in social networks such as Facebook poses different challenges than classical web search: besides the query text, it is important to take into account the searcher's context to provide relevant results. The searcher's social graph is an integral part of this context and is a unique aspect of Facebook search. While embedding-based retrieval (EBR) has been applied in web search engines for years, Facebook search was still mainly based on a Boolean matching model. In this paper, we discuss the techniques for applying EBR to a Facebook Search system. We introduce the unified embedding framework developed to model semantic embeddings for personalized search, and the system to serve embedding-based retrieval in a typical search system based on an inverted index. We discuss various tricks and experiences on end-to-end optimization of the whole system, including ANN parameter tuning and full-stack optimization. Finally, we present our progress on two selected advanced topics about modeling. We evaluated EBR on verticals for Facebook Search, with significant metrics gains observed in online A/B experiments. We believe this paper will provide useful insights and experiences to help people develop embedding-based retrieval systems in search engines.
    Comment: 9 pages, 3 figures, 3 tables, to be published in KDD '2
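
As a rough illustration of the retrieval idea in the abstract above, the sketch below ranks documents by cosine similarity between a query embedding and document embeddings. It is an exact brute-force search, not the ANN-plus-inverted-index serving system the paper describes, and the vectors, document ids, and function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, doc_vecs, k=2):
    """Return the ids of the k documents most similar to the query."""
    ranked = sorted(doc_vecs.items(),
                    key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]


# Hypothetical embeddings; a real system would obtain them from a trained model.
docs = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}
top = retrieve([1.0, 0.0, 0.0], docs, k=2)
```

Scoring every document costs time linear in the corpus size, which is one reason large-scale systems turn to approximate nearest neighbor (ANN) search instead.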

    Integrative Analysis of Genome and Expression Profile Data Reveals the Genetic Mechanism of the Diabetic Pathogenesis in Goto Kakizaki (GK) Rats

    Goto Kakizaki (GK) rats, which spontaneously develop type 2 diabetes (T2D), were generated by repeated inbreeding of Wistar rats with glucose intolerance. The glucose intolerance in GK rats is mainly attributed to impaired glucose-stimulated insulin secretion (GSIS). In addition, GK rats display a decrease in beta cell mass and a change in insulin action. However, the genetic mechanism of these features remains unclear. In the present study, we analyzed the population variants of GK rats and control Wistar rats by whole genome sequencing and identified 1,839 and 1,333 specific amino acid changed (SAAC) genes in GK and Wistar rats, respectively. Using a homogeneity test on the fixed and polymorphic sites between the GK and Wistar populations, we also detected putative artificial selective sweep (PASS) regions in GK rats, which were enriched with GK fixed variants and were under selection during the initial diabetes-driven derivation. Finally, we integrated the SAAC genes, PASS region genes, and differentially expressed genes in GK pancreatic beta cells to reveal the genetic mechanism of the impairment in GSIS, the decrease in beta cell mass, and the change in insulin action in GK rats. The results showed that the Slc2a2 gene was related to impaired glucose transport, and that the Adcy3, Cacna1f, Bmp4, Fam3b, and Ptprn2 genes were related to Ca2+ channel dysfunction, which may be responsible for the impaired GSIS. The genes Hnf4g, Bmp4, and Bad were associated with beta cell development and may be responsible for the decrease in beta cell mass, while the genes Ide, Ppp1r3c, Hdac9, Ghsr, and Gckr may be responsible for the change in insulin action in GK rats. The overexpression or inhibition of Bmp4, Fam3b, Ptprn2, Ide, Hnf4g, and Bad has been reported to change glucose tolerance in rodents. However, the genes Bmp4, Fam3b, and Ptprn2 were found to be associated with diabetes in GK rats for the first time in the present study. Our findings provide a comprehensive genetic map of the abnormalities in the GK genome, which will be helpful in understanding the underlying genetic mechanism of the pathogenesis of diabetes in GK rats.

    Identifying hypermethylated CpG islands using a quantile regression model

    Background: DNA methylation has been shown to play an important role in the silencing of tumor suppressor genes in various tumor types. In order to have a system-wide understanding of the methylation changes that occur in tumors, we have developed a differential methylation hybridization (DMH) protocol that can simultaneously assay the methylation status of all known CpG islands (CGIs) using microarray technologies. A large percentage of signals obtained from microarrays can be attributed to various measurable and unmeasurable confounding factors unrelated to the biological question at hand. In order to correct the bias due to noise, we first implemented a quantile regression model, with a quantile level equal to 75%, to identify hypermethylated CGIs in an earlier work. As a proof of concept, we applied this model to methylation microarray data generated from breast cancer cell lines. However, we were unsure whether 75% was the best quantile level for identifying hypermethylated CGIs. In this paper, we attempt to determine which quantile level should be used to identify hypermethylated CGIs and their associated genes.
    Results: We introduce three statistical measurements to compare the performance of the proposed quantile regression model at different quantile levels (95%, 90%, 85%, 80%, 75%, 70%, 65%, 60%), using known methylated genes and unmethylated housekeeping genes reported in breast cancer cell lines and ovarian cancer patients. Our results show that the quantile levels ranging from 80% to 90% are better at identifying known methylated and unmethylated genes.
    Conclusions: In this paper, we propose to use a quantile regression model to identify hypermethylated CGIs by incorporating probe effects to account for noise due to unmeasurable factors. Our model can efficiently identify hypermethylated CGIs in both breast and ovarian cancer data.
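
To make the role of the quantile level concrete, here is a minimal sketch of the pinball (check) loss that quantile regression minimizes, fitted with an intercept-only model. The paper's model additionally incorporates probe effects, which are not reproduced here; the signal values, the function names, and the rule of flagging values above the fitted quantile are assumptions made for illustration.

```python
def pinball_loss(residual, tau):
    """Check loss: weight tau on positive residuals, (1 - tau) on negative ones."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def fit_constant_quantile(values, tau):
    """Intercept-only quantile fit.

    The total pinball loss is piecewise linear in the constant, so a
    minimizer can always be found among the observed values.
    """
    return min(values,
               key=lambda c: sum(pinball_loss(v - c, tau) for v in values))


# Hypothetical signal intensities with one extreme value.
signals = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 100.0]
threshold = fit_constant_quantile(signals, tau=0.75)

# Illustrative rule only: flag signals above the fitted 75% quantile.
flagged = [v for v in signals if v > threshold]
```

Raising `tau` moves the fitted level upward and flags fewer signals, which is the trade-off the abstract explores across the 60% to 95% levels.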