6,327 research outputs found

    Mapping Big Data into Knowledge Space with Cognitive Cyber-Infrastructure

    Full text link
    Big data research has attracted great attention in science, technology, industry and society. It is developing with the evolving scientific paradigm, the fourth industrial revolution, and the transformational innovation of technologies. However, its nature and fundamental challenge have not been recognized, and its own methodology has not been formed. This paper explores and answers the following questions: What is big data? What are the basic methods for representing, managing and analyzing big data? What is the relationship between big data and knowledge? Can we find a mapping from big data into knowledge space? What kind of infrastructure is required to support not only big data management and analysis but also knowledge discovery, sharing and management? What is the relationship between big data and science paradigm? What is the nature and fundamental challenge of big data computing? A multi-dimensional perspective is presented toward a methodology of big data computing.Comment: 59 page

    Generating indicative-informative summaries with SumUM

    Get PDF
    We present and evaluate SumUM, a text summarization system that takes a raw technical text as input and produces an indicative informative summary. The indicative part of the summary identifies the topics of the document, and the informative part elaborates on some of these topics according to the reader's interest. SumUM motivates the topics, describes entities, and defines concepts. It is a first step for exploring the issue of dynamic summarization. This is accomplished through a process of shallow syntactic and semantic analysis, concept identification, and text regeneration. Our method was developed through the study of a corpus of abstracts written by professional abstractors. Relying on human judgment, we have evaluated indicativeness, informativeness, and text acceptability of the automatic summaries. The results thus far indicate good performance when compared with other summarization technologies

    Method for Aspect-Based Sentiment Annotation Using Rhetorical Analysis

    Full text link
    This paper fills a gap in aspect-based sentiment analysis and aims to present a new method for preparing and analysing texts concerning opinion and generating user-friendly descriptive reports in natural language. We present a comprehensive set of techniques derived from Rhetorical Structure Theory and sentiment analysis to extract aspects from textual opinions and then build an abstractive summary of a set of opinions. Moreover, we propose aspect-aspect graphs to evaluate the importance of aspects and to filter out unimportant ones from the summary. Additionally, the paper presents a prototype solution of data flow with interesting and valuable results. The proposed method's results proved the high accuracy of aspect detection when applied to the gold standard dataset

    Automatic Document Summarization Using Knowledge Based System

    Get PDF
    This dissertation describes a knowledge-based system to create abstractive summaries of documents by generalizing new concepts, detecting main topics and creating new sentences. The proposed system is built on the Cyc development platform that consists of the world’s largest knowledge base and one of the most powerful inference engines. The system is unsupervised and domain independent. Its domain knowledge is provided by the comprehensive ontology of common sense knowledge contained in the Cyc knowledge base. The system described in this dissertation generates coherent and topically related new sentences as a summary for a given document. It uses syntactic structure and semantic features of the given documents to fuse information. It makes use of the knowledge base as a source of domain knowledge. Furthermore, it uses the reasoning engine to generalize novel information. The proposed system consists of three main parts: knowledge acquisition, knowledge discovery, and knowledge representation. Knowledge acquisition derives syntactic structure of each sentence in the document and maps words and their syntactic relationships into Cyc knowledge base. Knowledge discovery abstracts novel concepts, not explicitly mentioned in the document by exploring the ontology of mapped concepts and derives main topics described in the document by clustering the concepts. Knowledge representation creates new English sentences to summarize main concepts and their relationships. The syntactic structure of the newly created sentences is extended beyond simple subject-predicate-object triplets by incorporating adjective and adverb modifiers. This structure allows the system to create sentences that are more complex. The proposed system was implemented and tested. Test results show that the system is capable of creating new sentences that include abstracted concepts not mentioned in the original document and is capable of combining information from different parts of the document text to compose a summary

    Automated construction and analysis of political networks via open government and media sources

    Get PDF
    We present a tool to generate real world political networks from user provided lists of politicians and news sites. Additional output includes visualizations, interactive tools and maps that allow a user to better understand the politicians and their surrounding environments as portrayed by the media. As a case study, we construct a comprehensive list of current Texas politicians, select news sites that convey a spectrum of political viewpoints covering Texas politics, and examine the results. We propose a ”Combined” co-occurrence distance metric to better reflect the relationship between two entities. A topic modeling technique is also proposed as a novel, automated way of labeling communities that exist within a politician’s ”extended” network.Peer ReviewedPostprint (author's final draft
    • 

    corecore