88 research outputs found

    Identifying interacting genetic variations by fish-swarm logic regression

    Understanding associations between genotypes and complex traits is a fundamental problem in human genetics. A major open problem in mapping phenotypes is identifying a set of interacting genetic variants that might contribute to complex traits. Logic regression (LR) is a powerful multivariate association tool. Several LR-based approaches have been successfully applied to different datasets. However, these approaches are not adequate with regard to accuracy and efficiency. In this paper, we propose a new LR-based approach, called fish-swarm logic regression (FSLR), which improves the logic regression process by incorporating swarm optimization. In our approach, a school of fish agents runs in parallel. Each fish agent holds a regression model, while the school searches for better models through various preset behaviors. The swarm algorithm improves accuracy and efficiency by speeding up convergence and preventing the search from falling into local optima. We apply our approach to a real screening dataset and a series of simulation scenarios. Compared with three existing LR-based approaches, our approach has lower type I and type II error rates, identifies more preset causal sites, and runs faster.
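
To make the idea of "each agent holds a regression model" concrete, here is a minimal sketch of what a logic regression candidate can look like: a Boolean combination of binary SNP indicators, scored against a binary trait. This is not the FSLR algorithm. The two candidate expressions, the toy data, and the names `candidate_a`, `candidate_b`, and `misclassification_rate` are all assumptions made up for illustration.

```python
import numpy as np

def candidate_a(snps: np.ndarray) -> np.ndarray:
    """Hypothetical logic model: (SNP0 AND SNP1) OR (NOT SNP2)."""
    return (snps[:, 0] & snps[:, 1]) | ~snps[:, 2]

def candidate_b(snps: np.ndarray) -> np.ndarray:
    """Hypothetical logic model: SNP0 OR SNP2."""
    return snps[:, 0] | snps[:, 2]

def misclassification_rate(pred: np.ndarray, trait: np.ndarray) -> float:
    """Fraction of individuals whose predicted trait differs from the observed one."""
    return float(np.mean(pred != trait))

# Invented toy data: rows are individuals, columns are binary SNP indicators.
snps = np.array(
    [[1, 1, 1],
     [1, 0, 1],
     [0, 0, 0],
     [0, 1, 1]],
    dtype=bool,
)
trait = np.array([True, False, True, False])

# Score every candidate and keep the one with the lowest error,
# which is the comparison a search procedure would repeat many times.
scores = {
    "candidate_a": misclassification_rate(candidate_a(snps), trait),
    "candidate_b": misclassification_rate(candidate_b(snps), trait),
}
best = min(scores, key=scores.get)
print(best, scores)
```

A swarm-based search would repeatedly propose and score such candidates. The sketch only shows the scoring step, since the abstract does not specify the agents' behaviors.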

    The Missed Patient With Diabetes: How access to health care affects the detection of diabetes

    OBJECTIVE: This study examined the association between access to health care and three classifications of diabetes status: diagnosed, undiagnosed, and no diabetes.

    Query-dominant User Interest Network for Large-Scale Search Ranking

    Historical behaviors have shown great effect and potential in various prediction tasks, including recommendation and information retrieval. Overall historical behaviors are diverse but noisy, while search behaviors are always sparse. Most existing approaches to personalized search ranking use the sparse search behaviors to learn representations with a bottleneck, and so do not sufficiently exploit the crucial long-term interest. In fact, user long-term interest is undoubtedly diverse but noisy for instant search, and how to exploit it well remains an open problem. To tackle this problem, we propose a novel model named Query-dominant user Interest Network (QIN), which includes two cascaded units that filter the raw user behaviors and reweight the behavior subsequences. Specifically, we propose a relevance search unit (RSU), which first searches for a subsequence relevant to the query and then searches for the sub-subsequences relevant to the target item. These items are then fed into an attention unit called the Fused Attention Unit (FAU). It is designed to calculate attention scores from the ID field and the attribute field separately, and then adaptively fuse the item embedding and content embedding based on user engagement over a past period. Extensive experiments and ablation studies on real-world datasets demonstrate the superiority of our model over state-of-the-art methods. QIN has now been successfully deployed on Kuaishou search, an online video search platform, where it obtained a 7.6% improvement in CTR. Comment: 10 pages

    Cancer LncRNA Census reveals evidence for deep functional conservation of long noncoding RNAs in tumorigenesis.

    Long non-coding RNAs (lncRNAs) are a growing focus of cancer genomics studies, creating the need for a resource of lncRNAs with validated cancer roles. Furthermore, it remains debated whether mutated lncRNAs can drive tumorigenesis, and whether such functions could be conserved during evolution. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, we introduce the Cancer LncRNA Census (CLC), a compilation of 122 GENCODE lncRNAs with causal roles in cancer phenotypes. In contrast to existing databases, CLC requires strong functional or genetic evidence. CLC genes are enriched amongst driver genes predicted from somatic mutations, and display characteristic genomic features. Strikingly, CLC genes are enriched for driver mutations from unbiased, genome-wide transposon-mutagenesis screens in mice. We identified 10 tumour-causing mutations in orthologues of 8 lncRNAs, including LINC-PINT and NEAT1, but not MALAT1. Thus CLC represents a dataset of high-confidence cancer lncRNAs. Mutagenesis maps are a novel means for identifying deeply-conserved roles of lncRNAs in tumorigenesis.

    Retrospective evaluation of whole exome and genome mutation calls in 746 cancer samples

    Funder: NCI U24CA211006. Abstract: The Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC) curated consensus somatic mutation calls using whole exome sequencing (WES) and whole genome sequencing (WGS), respectively. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2,658 cancers across 38 tumour types, we compare WES and WGS side-by-side from 746 TCGA samples, finding that ~80% of mutations overlap in covered exonic regions. We estimate that low variant allele fraction (VAF < 15%) and clonal heterogeneity contribute up to 68% of private WGS mutations and 71% of private WES mutations. We observe that ~30% of private WGS mutations trace to mutations identified by a single variant caller in WES consensus efforts. WGS captures both ~50% more variation in exonic regions and un-observed mutations in loci with variable GC-content. Together, our analysis highlights technological divergences between two reproducible somatic variant detection efforts.
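
The "VAF < 15%" criterion above can be illustrated with a small sketch, assuming the usual definition of variant allele fraction as alternate-allele reads divided by total reads at a site. The site labels and read counts below are invented, and `variant_allele_fraction` is a hypothetical helper, not part of any pipeline named in the abstract.

```python
LOW_VAF_THRESHOLD = 0.15  # the 15% cutoff quoted in the abstract

def variant_allele_fraction(alt_reads: int, total_reads: int) -> float:
    """VAF = reads supporting the variant allele / all reads covering the site."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads

# Invented example calls: (site label, alt-supporting reads, total depth).
calls = [
    ("site_1", 3, 40),   # VAF 0.075
    ("site_2", 12, 40),  # VAF 0.30
    ("site_3", 8, 40),   # VAF 0.20
]

# Keep only the calls that fall under the low-VAF cutoff.
low_vaf_sites = [
    site
    for site, alt, depth in calls
    if variant_allele_fraction(alt, depth) < LOW_VAF_THRESHOLD
]
print(low_vaf_sites)
```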

    Auditory Cryptography Security Algorithm With Audio Shelters

    Abstract: In this paper, an auditory cryptography security algorithm with audio shelters is proposed. The meaningful audio watermark is preprocessed into high-fidelity binary audio, and the binary audio is encrypted into n cryptographic audios by a (k, n) threshold scheme. Fewer than k of the cryptographic audios give no information; only when k or more of the audios are played synchronously can the original be heard directly. The n cryptographic audios are embedded in the corresponding n shelter audios, which are preprocessed by a high-dimensional matrix transformation. Experiments show that the proposed algorithm has strong practicability, high security, and robustness against common attacks.
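
The abstract does not say which (k, n) threshold construction the authors use. As a general illustration of the threshold property only, the sketch below uses Shamir's polynomial secret sharing on a single integer, which is a different technique from playing audio shares together. The names `split_secret` and `recover_secret` and the choice of modulus are assumptions for this example.

```python
import random

PRIME = 2**31 - 1  # a Mersenne prime used as the field modulus (assumed choice)

def split_secret(secret: int, k: int, n: int, rng: random.Random):
    """Create n shares such that any k of them determine the secret."""
    # Random polynomial of degree k-1 whose constant term is the secret.
    coeffs = [secret] + [rng.randrange(PRIME) for _ in range(k - 1)]

    def poly(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, poly(x)) for x in range(1, n + 1)]

def recover_secret(shares) -> int:
    """Lagrange-interpolate the polynomial at x = 0 from the given shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

# random.Random is used for a repeatable demo; it is not cryptographically secure.
shares = split_secret(123456, k=3, n=5, rng=random.Random(0))
recovered = recover_secret(shares[1:4])  # any 3 of the 5 shares suffice
print(recovered)
```

With fewer than k shares the interpolation no longer pins down the constant term, which is the property the abstract describes as "give no information".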

    Adaptive PSO for Online Identification of Time-Varying Systems


    An Efficient Algorithm for Sensitively Detecting Circular RNA from RNA-seq Data

    Circular RNA (circRNA) is an important member of the non-coding RNA family. Numerous computational methods for detecting circRNAs from RNA-seq data have been developed in the past few years, but the algorithms differ dramatically in how they balance the sensitivity and precision of detection and in their filtering strategies. To further improve sensitivity while maintaining an acceptable precision of circRNA detection, a novel and efficient de novo detection algorithm, CIRCPlus, is proposed in this paper. CIRCPlus accurately locates circRNA candidates by identifying a set of back-spliced junction reads through comparison of the locally similar sequences of each pair of spanning junction reads. This strategy thus uses the important information provided by unbalanced spanning reads, which facilitates detection, especially when circRNA expression levels are low. The performance of CIRCPlus was tested and compared with existing de novo methods on real datasets as well as a series of simulation datasets with different configurations. The experimental results demonstrated that the sensitivity of CIRCPlus reached 90% in common simulation settings, while CIRCPlus maintained balanced sensitivity and reliability on the real datasets according to an objective assessment criterion based on RNase R-treated samples. The software tool is available for academic use only.