DivGraphPointer: A Graph Pointer Network for Extracting Diverse Keyphrases
Keyphrase extraction from documents is useful to a variety of applications
such as information retrieval and document summarization. This paper presents
an end-to-end method called DivGraphPointer for extracting a set of diversified
keyphrases from a document. DivGraphPointer combines the advantages of
traditional graph-based ranking methods and recent neural network-based
approaches. Specifically, given a document, a word graph is constructed from
the document based on word proximity and is encoded with graph convolutional
networks, which effectively capture document-level word salience by modeling
long-range dependency between words in the document and aggregating multiple
appearances of identical words into one node. Furthermore, we propose a
diversified point network to generate a set of diverse keyphrases out of the
word graph in the decoding process. Experimental results on five benchmark data
sets show that our proposed method significantly outperforms the existing
state-of-the-art approaches.
Comment: Accepted to SIGIR 201
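The graph-construction step described in this abstract (linking words by proximity and merging repeated words into one node) can be illustrated with a minimal sketch. The function name `build_word_graph`, the `window` parameter, and the use of raw co-occurrence counts as edge weights are assumptions made for illustration; the abstract does not specify them, and the graph convolutional encoder and pointer decoder are not shown.

```python
from collections import defaultdict

def build_word_graph(tokens, window=2):
    """Build an undirected word-proximity graph as {word: {neighbor: weight}}.

    Every distinct word becomes a single node, so repeated appearances
    of the same word are aggregated. Edge weights count how often two
    words occur within `window` positions of each other (an assumption).
    """
    graph = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(tokens):
        graph[word]  # make sure isolated words still get a node
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            other = tokens[j]
            if other == word:
                continue  # skip self-loops
            graph[word][other] += 1
            graph[other][word] += 1
    return {w: dict(neighbors) for w, neighbors in graph.items()}

# Hypothetical toy document: "graph" appears twice but yields one node.
tokens = "graph networks encode word graph structure".split()
word_graph = build_word_graph(tokens, window=2)
```

In this toy input the two occurrences of "graph" share one node, so its neighborhood collects context from both positions, which is the aggregation effect the abstract refers to.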
Hardware Acceleration for Similarity Measurement in Natural Language Processing
Abstract: The continuation of Moore's law scaling, but in the absence of Dennard scaling, motivates an emphasis on energy-efficient accelerator-based designs for future applications. In natural language processing, the conventional approach to automatically analyze vast text collections, using scale-out processing, incurs high energy and hardware costs since the central compute-intensive step of similarity measurement often entails pair-wise, all-to-all comparisons. We propose a custom hardware accelerator for similarity measures that leverages data streaming, memory latency hiding, and parallel computation across variable-length threads. We evaluate our design through a combination of architectural simulation and RTL synthesis. When executing the dominant kernel in a semantic indexing application for documents, we demonstrate throughput gains of up to 42× and 58× lower energy per similarity computation compared to an optimized software implementation, while requiring less than 1.3% of the area of a conventional core.
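To make the cost of pair-wise, all-to-all comparison concrete, here is a minimal software sketch. Cosine similarity is an assumption chosen for illustration, since the abstract does not name the similarity measure, and the helper names `cosine` and `all_pairs_similarity` are hypothetical. This is not the accelerator design, only the kind of kernel such a design would target.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def all_pairs_similarity(vectors):
    """Compare every unordered pair of vectors: n * (n - 1) / 2 comparisons."""
    return {
        (i, j): cosine(vectors[i], vectors[j])
        for i, j in combinations(range(len(vectors)), 2)
    }

# Hypothetical document vectors (for example, term weights).
docs = [[1.0, 0.0, 2.0], [2.0, 0.0, 4.0], [0.0, 3.0, 0.0]]
sims = all_pairs_similarity(docs)
```

The number of comparisons grows quadratically with the number of documents, which is why this step dominates the workload for large collections.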
Using Collective Discourse to Generate Surveys of Scientific Paradigms.
This thesis is focused on understanding collective discourse and employing its properties to build better decision support systems.
We first define collective discourse as a collective human behavior in content generation. In social media, collective discourse is often a collective
reaction to an event. A collective reaction to a well-defined subject emerges in response to an event (a movie release, a breaking story, a newly published paper) in the form of independent writings (movie reviews, news headlines, citation sentences) by many individuals.
In order to understand collective discourse, we perform our analysis on a wide range of real-world datasets from citations to movie reviews.
We show that all these datasets exhibit diversity of perspective, a property seen in other collective systems and a criterion in wise crowds. Our experiments also confirm that the network of different perspective co-occurrences exhibits the small-world property with high clustering of different perspectives. Finally, we show that non-expert contributions in collective discourse can be used to answer simple questions that are otherwise hard to answer.
As a concrete example of collective discourse, we discuss citations to scholarly work. We show how they contain important
information that conveys the key features and basic underpinnings of a
particular field, early and late developments, important
contributions, and basic definitions and examples that enable rapid
understanding of a field by non-experts. We then present
C-LexRank, a system that exploits scientific collective discourse to
produce automatically generated, readily consumable technical surveys.
Finally, we further extend our experiments to summarize an entire scientific topic.
We generate extractive surveys of a set of Question Answering (QA) and Dependency Parsing (DP) papers, their abstracts, and their citation sentences and show that citations have unique survey-worthy information.
PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/95960/1/vahed_1.pd
Abstract
Blogs form a large social network, and their analysis is becoming an important research area. Blogs are growing rapidly on the Internet because bloggers can quickly change their content and linking patterns. Visitors may comment on the postings of a blog, and this leads to complex interaction between groups of bloggers. One of the interesting phenomena in blogspace is "blogger failure": a blogger stops writing after a certain amount of time and does not return to blogspace for a long time, or a blogger does not get any comments from her audience. In this paper we present our observations on blogger failure in a unique blogspace. First, we introduce the PersianBlog blogspace and dataset, and then we describe our observations on the commenting behavior of bloggers. Finally, we provide our definition of failure and outline a broad path for future research toward a model of this phenomenon.