Mining latent entity structures from massive unstructured and interconnected data

Abstract

The “big data” era is characterized by an explosion of information in the form of digital data collections, ranging from scientific knowledge to social media, news, and everyone’s daily life. Valuable knowledge about multi-typed entities is often hidden in unstructured or loosely structured but interconnected data. Mining latent structured information around entities uncovers semantic structures from massive unstructured data and hence enables many high-impact applications, including taxonomy or knowledge base construction, multi-dimensional data analysis, and information or social network analysis. A mining framework is proposed to solve and integrate a chain of tasks: hierarchical topic discovery, topical phrase mining, entity role analysis, and entity relation mining. It reveals two main forms of structure: topical and relational. The topical structure summarizes the topics associated with entities at various granularities, such as the research areas in computer science. The framework enables recursive construction of a phrase-represented and entity-enriched topic hierarchy from text-attached information networks. It achieves a breakthrough in both quality and computational efficiency. The relational structure recovers the hidden relationships among entities, such as advisor-advisee relationships. A probabilistic graphical modeling approach is proposed for this task. The method can utilize heterogeneous attributes and links to capture all kinds of semantic signals, including constraints and dependencies, to recover the hierarchical relationship with the best known accuracy.

Item in collection: University of Illinois Theses & Dissertations. Made available in DSpace on 2015-01-21. Access was restricted to the University of Illinois at the author's request (open access after two years); the restriction was lifted on 2017-01-22.

Illinois Digital Environment for Access to Learning and Scholarship Repository

Last time updated on 26/05/2015
