Summarisation of weighted networks
Networks often contain implicit structure. We introduce novel problems and methods that look for structure in networks by grouping nodes into supernodes and edges into superedges, and then make this structure visible to the user in a smaller generalised network. This task of finding generalisations of nodes and edges is formulated as 'network summarisation'. We propose models and algorithms for networks that have weights on edges, on nodes or on both, and study three new variants of the network summarisation problem. In edge-based weighted network summarisation, the summarised network should preserve edge weights as well as possible. A wider class of settings is considered in path-based weighted network summarisation, where the resulting summarised network should preserve longer-range connectivities between nodes. Node-based weighted network summarisation in turn allows weights on nodes as well, and summarisation aims to preserve more information related to high-weight nodes. We study theoretical properties of these problems and show them to be NP-hard. We propose a range of heuristic generalisation algorithms with different trade-offs between complexity and quality of the result. Comprehensive experiments on real data show that weighted networks can be summarised efficiently with relatively little error. Peer reviewed
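A minimal sketch of the edge-based variant, assuming a node-to-supernode partition is already given; the grouping strategy and the mean-weight superedge choice are illustrative stand-ins, not the paper's algorithms:

```python
def summarise_edges(edges, groups):
    """Edge-based summarisation sketch: given an (assumed, externally
    chosen) partition of nodes into supernodes, set each superedge weight
    to the mean weight of the original edges it covers."""
    totals, counts = {}, {}
    for (u, v), w in edges.items():
        key = tuple(sorted((groups[u], groups[v])))
        totals[key] = totals.get(key, 0.0) + w
        counts[key] = counts.get(key, 0) + 1
    return {key: totals[key] / counts[key] for key in totals}

def reconstruction_error(edges, groups, superedges):
    """Sum of squared differences between original and summarised edge
    weights -- one way to quantify how well edge weights are preserved."""
    return sum((w - superedges[tuple(sorted((groups[u], groups[v])))]) ** 2
               for (u, v), w in edges.items())
```

The heuristic algorithms studied in the paper search for groupings that keep this kind of error low while shrinking the network.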
A Supervised Approach to Extractive Summarisation of Scientific Papers
Automatic summarisation is a popular approach to reduce a document to its
main arguments. Recent research in the area has focused on neural approaches to
summarisation, which can be very data-hungry. However, few large datasets exist
and none for the traditionally popular domain of scientific publications, which
opens up challenging research avenues centered on encoding large, complex
documents. In this paper, we introduce a new dataset for summarisation of
computer science publications by exploiting a large resource of author-provided
summaries and show straightforward ways of extending it further. We develop
models on the dataset making use of both neural sentence encoding and
traditionally used summarisation features and show that models which encode
sentences as well as their local and global context perform best, significantly
outperforming well-established baseline methods. Comment: 11 pages, 6 figures
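To make "traditionally used summarisation features" concrete, here is an illustrative feature extractor; these three features are classic choices in the extractive-summarisation literature and are not claimed to be the paper's exact set:

```python
def sentence_features(sentence, position, doc_len, title):
    """An illustrative trio of classic extractive-summarisation features:
    relative position in the document, sentence length, and overlap with
    the title (the paper's exact feature set is not reproduced here)."""
    words = set(sentence.lower().split())
    title_words = set(title.lower().split())
    return [
        position / doc_len,                                   # relative position
        len(words),                                           # distinct words
        len(words & title_words) / max(len(title_words), 1),  # title overlap
    ]
```

In a supervised setup such features would be fed, alongside neural sentence encodings, to a classifier that predicts whether each sentence belongs in the summary.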
Comparison of the language networks from literature and blogs
In this paper we present the comparison of the linguistic networks from
literature and blog texts. The linguistic networks are constructed from texts
as directed and weighted co-occurrence networks of words. Words are nodes and
links are established between two nodes if they are directly co-occurring
within the sentence. The comparison of the network structure is performed at the
global level (network) in terms of average node degree, average shortest path
length, diameter, clustering coefficient, density and number of components.
Furthermore, we perform analysis at the local level (node) by comparing the
rank plots of in- and out-degree, strength and selectivity. The
selectivity-based results point out that there are differences between the
structures of the networks constructed from literature and from blogs.
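The construction described above can be sketched as follows; whitespace tokenisation is a simplifying assumption:

```python
def cooccurrence_network(sentences):
    """Directed, weighted co-occurrence network: words are nodes, and an
    edge u -> v is added for each pair of directly adjacent words within
    a sentence, weighted by how often that pair co-occurs."""
    weights = {}
    for sentence in sentences:
        words = sentence.lower().split()
        for u, v in zip(words, words[1:]):
            weights[(u, v)] = weights.get((u, v), 0) + 1
    return weights

def out_selectivity(weights, node):
    """Out-selectivity = out-strength / out-degree, i.e. the average
    weight of the node's outgoing links."""
    out = [w for (u, _), w in weights.items() if u == node]
    return sum(out) / len(out) if out else 0.0
```

Selectivity distinguishes nodes that repeat a few collocations heavily from nodes that link broadly but weakly, which is why it surfaces differences between the two text genres.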
Modelling, Visualising and Summarising Documents with a Single Convolutional Neural Network
Capturing the compositional process which maps the meaning of words to that
of documents is a central challenge for researchers in Natural Language
Processing and Information Retrieval. We introduce a model that is able to
represent the meaning of documents by embedding them in a low dimensional
vector space, while preserving distinctions of word and sentence order crucial
for capturing nuanced semantics. Our model is based on an extended Dynamic
Convolutional Neural Network, which learns convolution filters at both the
sentence and document level, hierarchically learning to capture and compose low
level lexical features into high level semantic concepts. We demonstrate the
effectiveness of this model on a range of document modelling tasks, achieving
strong results with no feature engineering and with a more compact model.
Inspired by recent advances in visualising deep convolutional networks for
computer vision, we present a novel visualisation technique for our document
networks which not only provides insight into their learning process, but also
can be interpreted to produce a compelling automatic summarisation system for
texts.
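A toy analogue of the hierarchical sentence-then-document convolution, where plain number lists and a fixed kernel stand in for learned filters over word embeddings (an illustration of the idea, not the model's architecture):

```python
def conv1d(seq, kernel):
    """Valid 1-D convolution (cross-correlation) of a sequence of numbers."""
    k = len(kernel)
    return [sum(seq[i + j] * kernel[j] for j in range(k))
            for i in range(len(seq) - k + 1)]

def hierarchical_encode(sentences, kernel):
    """Convolve within each sentence and max-pool to a sentence score,
    then convolve over the sentence scores -- a toy analogue of learning
    filters at both the sentence and the document level."""
    sentence_scores = [max(conv1d(s, kernel)) for s in sentences]
    return conv1d(sentence_scores, kernel)
```

The two-level composition is what lets low-level lexical features feed into document-level representations, mirroring the hierarchy described above.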
Attentive monitoring of multiple video streams driven by a Bayesian foraging strategy
In this paper we shall consider the problem of deploying attention to subsets
of the video streams for collating the most relevant data and information of
interest related to a given task. We formalize this monitoring problem as a
foraging problem. We propose a probabilistic framework to model an observer's
attentive behavior as the behavior of a forager. The forager, moment to moment,
focuses its attention on the most informative stream/camera, detects
interesting objects or activities, or switches to a more profitable stream. The
approach proposed here is suitable to be exploited for multi-stream video
summarization. Meanwhile, it can serve as a preliminary step for more
sophisticated video surveillance, e.g. activity and behavior analysis.
Experimental results achieved on the UCR Videoweb Activities Dataset, a
publicly available dataset, are presented to illustrate the utility of the
proposed technique. Comment: Accepted to IEEE Transactions on Image Processing
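The forage-and-switch behaviour can be caricatured with two rules; the threshold rule below is a marginal-value-style stand-in for the paper's Bayesian decision, and the function names are hypothetical:

```python
def choose_stream(expected_gain):
    """Attend to the stream currently believed most informative
    (expected_gain maps a stream id to an estimated information gain)."""
    return max(expected_gain, key=expected_gain.get)

def should_switch(recent_gains, mean_gain_all_streams):
    """Leave the current stream ('patch') once its recent information
    yield drops below the average yield across all streams -- a
    marginal-value-theorem-style rule standing in for the paper's
    Bayesian foraging decision."""
    return sum(recent_gains) / len(recent_gains) < mean_gain_all_streams
```

The moment-to-moment loop alternates these two decisions: exploit the attended stream while it stays profitable, then relocate attention to the next most promising one.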
Extractive text summarisation using graph triangle counting approach: proposed method
Currently, with a growing quantity of automated text data, the necessity for the construction of summarisation systems turns out to be vital. Summarisation systems confine and condense the most vital ideas of documents and help the user to find and understand the foremost facts of the text more quickly and easily amid the flood of information. A compelling set of such systems are those that create summaries from extracts. This type of summary, which is called extractive summarisation, is created by choosing significant fragments of the text without making any amendment to the original. One methodology for generating this type of summary uses graph theory. In graph theory there is a field called graph pruning/reduction, which aims to find the best representation of the main graph with a smaller number of nodes and edges. In this paper, a graph reduction technique called the triangle counting approach is presented to choose the most vital sentences of the text. The first phase is to represent the text as a graph, where nodes are the sentences and edges are the similarities between the sentences. The second phase is to construct the triangles, followed by a bit vector representation, and the final phase is to retrieve the sentences based on the values of the bit vector.
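The three phases can be sketched as follows; the Jaccard similarity measure and the threshold value are illustrative choices, since the abstract does not fix a specific similarity:

```python
from itertools import combinations

def jaccard(a, b):
    """Word-overlap similarity between two sentences (one simple choice
    of edge criterion; the method does not mandate this measure)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

def triangle_summarise(sentences, threshold=0.2, k=2):
    """Phase 1: sentences become nodes; edges link similar sentences.
    Phase 2: count the triangles each sentence lies in and build a bit
    vector marking sentences that lie in at least one triangle.
    Phase 3: retrieve the top-k triangle-rich sentences in document order."""
    n = len(sentences)
    adj = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if jaccard(sentences[i], sentences[j]) >= threshold:
            adj[i].add(j)
            adj[j].add(i)
    tri = [0] * n
    for i, j, m in combinations(range(n), 3):
        if j in adj[i] and m in adj[i] and m in adj[j]:
            tri[i] += 1
            tri[j] += 1
            tri[m] += 1
    bits = [1 if t else 0 for t in tri]
    ranked = sorted(range(n), key=lambda i: tri[i], reverse=True)
    chosen = sorted(i for i in ranked[:k] if bits[i])
    return [sentences[i] for i in chosen], bits
```

Sentences that participate in many triangles sit in densely inter-similar clusters, which is the intuition behind treating them as the most representative of the text.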
Automatic summarisation of Instagram social network posts: Combining semantic and statistical approaches
The proliferation of data and text documents such as articles, web pages,
books, social network posts, etc. on the Internet has created a fundamental
challenge in various fields of text processing under the title of "automatic
text summarisation". Manual processing and summarisation of large volumes of
textual data is a difficult, expensive and time-consuming process that is
practically impossible for human users. Text summarisation systems are divided
into extractive and abstractive categories. In the extractive summarisation method, the final
summary of a text document is extracted from the important sentences of the
same document without any modification. In this method, sentences may be
repeated and pronoun references may be left unresolved. However, in the
abstractive summarisation method, the final summary of a textual document is
derived from the meaning and significance of the sentences and words of the
same document or of other documents. Many of the existing works have used
extractive or abstractive methods to summarise collections of web documents,
each of which has advantages and disadvantages in the results obtained in terms
of similarity or size. In this work, a crawler has been developed to extract
popular text posts from the Instagram social network with appropriate
preprocessing, and a set of extractive and abstractive algorithms has been
combined to show how each of the abstractive algorithms can be used.
Observations made on 820 popular text posts on the Instagram social network
show an accuracy of 80% for the proposed system.
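As a minimal illustration of the extractive stage only (word-frequency scoring is an assumption here; the system described above combines several extractive and abstractive algorithms):

```python
from collections import Counter

def extractive_summary(sentences, k=1):
    """Score each sentence by the mean corpus frequency of its words and
    keep the top-k sentences in document order -- a minimal stand-in for
    the extractive stage of a pipeline like the one described above."""
    freq = Counter(w for s in sentences for w in s.lower().split())

    def score(s):
        words = s.lower().split()
        return sum(freq[w] for w in words) / len(words)

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]
```

Short, noisy social-network posts make the preprocessing step (cleaning, tokenising, filtering) at least as important as the scoring itself.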