3,568 research outputs found

    Toward Entity-Aware Search

    As the Web has evolved into a data-rich repository, current search engines, with their standard "page view," are becoming increasingly inadequate for a wide range of query tasks. While we often search for various data "entities" (e.g., phone number, paper PDF, date), today's engines only take us indirectly to pages. In this Ph.D. study, we focus on a novel type of Web search that is aware of data entities inside pages, a significant departure from traditional document retrieval. We study the essential aspects of supporting entity-aware Web search. To begin with, we tackle the core challenge of ranking entities by distilling its underlying conceptual model, the Impression Model, and developing a probabilistic ranking framework, EntityRank, that seamlessly integrates both local and global information in ranking. We also report a prototype system built to show the initial promise of the proposal. Then, we aim at distilling and abstracting the essential computation requirements of entity search. From the dual views of reasoning (entity as input and entity as output), we propose a dual-inversion framework, with two indexing and partition schemes, towards efficient and scalable query processing. Further, to recognize more entity instances, we study the problem of entity synonym discovery through mining query log data. The results obtained so far show clear promise for entity-aware search in its usefulness, effectiveness, efficiency, and scalability.

    Integrated Proteotranscriptomics of Breast Cancer Reveals Globally Increased Protein-mRNA Concordance Associated with Subtypes and Survival

    BACKGROUND: Transcriptome analysis of breast cancer discovered distinct disease subtypes of clinical significance. However, it remains a challenge to define disease biology solely based on gene expression because tumor biology is often the result of protein function. Here, we measured global proteome and transcriptome expression in human breast tumors and adjacent non-cancerous tissue and performed an integrated proteotranscriptomic analysis. METHODS: We applied a quantitative liquid chromatography/mass spectrometry-based proteome analysis using an untargeted approach and analyzed protein extracts from 65 breast tumors and 53 adjacent non-cancerous tissues. Additional gene expression data from Affymetrix Gene Chip Human Gene ST Arrays were available for 59 tumors and 38 non-cancerous tissues in our study. We then applied an integrated analysis of the proteomic and transcriptomic data to examine relationships between them, disease characteristics, and patient survival. Findings were validated in a second dataset using proteome and transcriptome data from The Cancer Genome Atlas and the Clinical Proteomic Tumor Analysis Consortium. RESULTS: We found that the proteome describes differences between cancerous and non-cancerous tissues that are not revealed by the transcriptome. The proteome, but not the transcriptome, revealed an activation of infection-related signal pathways in basal-like and triple-negative tumors. We also observed that proteins rather than mRNAs are increased in tumors and show that this observation could be related to shortening of the 3' untranslated region of mRNAs in tumors. The integrated analysis of the two technologies further revealed a global increase in protein-mRNA concordance in tumors. Highly correlated protein-gene pairs were enriched in protein processing and disease metabolic pathways. The increased concordance between transcript and protein levels was additionally associated with aggressive disease, including basal-like/triple-negative tumors, and decreased patient survival. We also uncovered a strong positive association between protein-mRNA concordance and proliferation of tumors. Finally, we observed that protein expression profiles co-segregate with a Myc activation signature and separate breast tumors into two subgroups with different survival outcomes. CONCLUSIONS: Our study provides new insights into the relationship between protein and mRNA expression in breast cancer and shows that an integrated analysis of the proteome and transcriptome has the potential of uncovering novel disease characteristics.

    Prediction Methods for Structured Data: Graphs, Orderings, and Time Series

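    Kyoto University, doctorate by coursework (new system), Doctor of Informatics. Degree No. Kō 23439; Informatics Doctorate No. 769; call number 新制||情||131 (University Library). Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University. Examiners: Professor Hisashi Kashima (chief examiner), Professor Akihiro Yamamoto, Professor Tatsuya Akutsu. Conferred under Article 4, Paragraph 1 of the Degree Regulations. DFA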

    Identification of Novel Cancer-Related Genes with a Prognostic Role Using Gene Expression and Protein-Protein Interaction Network Data

    Early cancer diagnosis and prognosis prediction are necessary for cancer patients. Effective identification of cancer-related genes and biomarkers, together with survival prediction, would facilitate personalized treatment of cancer patients. This study aimed to investigate a method for integrating gene expression data and protein-protein interaction networks to identify cancer-related prognostic genes via the random walk with restart algorithm and survival analysis. Known cancer-related genes in protein-protein interaction networks were considered seed genes, and the random walk algorithm was used to identify candidate cancer-related genes. Thereafter, using the univariate Cox regression model, gene expression data were screened to identify survival-related genes. Furthermore, candidate genes and survival-related genes were screened to identify cancer-related prognostic genes. Finally, the effectiveness of the method was verified through gene function analysis and survival prediction. The results indicate that the cancer-related genes can be considered prognostic cancer biomarkers and provide a basis for cancer diagnosis.
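The seed-gene step in the abstract above can be illustrated with a minimal sketch of random walk with restart on a toy network. This is not the authors' implementation: the gene names `A` to `E`, the restart probability of 0.3, the uniform restart vector over seeds, and the function name `random_walk_with_restart` are all assumptions chosen for illustration.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Score every node by its proximity to the seed nodes.

    adjacency: dict mapping each node to a list of its neighbours
               (hypothetical toy network, undirected and unweighted).
    seeds:     set of seed nodes (e.g. known cancer-related genes).
    restart:   probability of jumping back to a seed at each step (assumed value).
    """
    nodes = sorted(adjacency)
    seeds_in_graph = [n for n in nodes if n in seeds]
    if not seeds_in_graph:
        raise ValueError("at least one seed must be present in the network")

    # Restart vector: uniform probability over the seed nodes.
    p0 = {n: (1.0 / len(seeds_in_graph) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)

    for _ in range(max_iter):
        # Every node first receives its share of the restart mass.
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            neighbours = adjacency[u]
            if not neighbours:
                # Isolated node: send its walking mass back to the seeds.
                for n in nodes:
                    nxt[n] += (1.0 - restart) * p[u] * p0[n]
                continue
            # Otherwise spread the walking mass evenly over the neighbours.
            share = (1.0 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Toy path network A - B - C - D - E with A as the only seed gene.
toy_network = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
scores = random_walk_with_restart(toy_network, seeds={"A"})
candidates = sorted((g for g in scores if g != "A"), key=scores.get, reverse=True)
print(candidates)
```

Non-seed genes with the highest scores would be the candidate cancer-related genes. In the study's pipeline, these would then be intersected with the genes retained by the univariate Cox regression screen.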