
    Summarisation of weighted networks

    Networks often contain implicit structure. We introduce novel problems and methods that look for structure in networks by grouping nodes into supernodes and edges into superedges, and then make this structure visible to the user in a smaller generalised network. This task of finding generalisations of nodes and edges is formulated as network summarisation. We propose models and algorithms for networks that have weights on edges, on nodes, or on both, and study three new variants of the network summarisation problem. In edge-based weighted network summarisation, the summarised network should preserve edge weights as well as possible. A wider class of settings is considered in path-based weighted network summarisation, where the resulting summarised network should preserve longer-range connectivities between nodes. Node-based weighted network summarisation in turn allows weights on nodes as well, and summarisation aims to preserve more of the information related to high-weight nodes. We study theoretical properties of these problems and show them to be NP-hard. We propose a range of heuristic generalisation algorithms with different trade-offs between complexity and quality of the result. Comprehensive experiments on real data show that weighted networks can be summarised efficiently with relatively little error.
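    The edge-based variant has a natural minimal illustration. The sketch below is a hypothetical simplification rather than the paper's algorithms: given a fixed grouping of nodes into supernodes, each superedge weight is set to the mean of the edge weights it covers, and the summary's quality is measured as the squared reconstruction error; `summarise` and `reconstruction_error` are illustrative names.

```python
# Toy edge-based weighted network summarisation (a sketch, not the
# paper's method): superedge weight = mean of covered edge weights,
# error = squared difference between original and reconstructed weights.
from collections import defaultdict

def summarise(edges, grouping):
    """edges: {(u, v): weight}; grouping: {node: supernode label}."""
    sums, counts = defaultdict(float), defaultdict(int)
    for (u, v), w in edges.items():
        key = tuple(sorted((grouping[u], grouping[v])))
        sums[key] += w
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}

def reconstruction_error(edges, grouping, superedges):
    """Squared error when each edge weight is read off its superedge."""
    return sum(
        (w - superedges[tuple(sorted((grouping[u], grouping[v])))]) ** 2
        for (u, v), w in edges.items()
    )

edges = {("a", "b"): 1.0, ("a", "c"): 3.0, ("b", "c"): 2.0, ("c", "d"): 5.0}
grouping = {"a": "S1", "b": "S1", "c": "S2", "d": "S2"}
superedges = summarise(edges, grouping)
print(superedges, reconstruction_error(edges, grouping, superedges))
```

    Using the mean as the superedge weight minimises the squared error over the edges each superedge covers, which is why it is the natural baseline aggregation here.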

    A Supervised Approach to Extractive Summarisation of Scientific Papers

    Automatic summarisation is a popular approach to reducing a document to its main arguments. Recent research in the area has focused on neural approaches to summarisation, which can be very data-hungry. However, few large datasets exist, and none for the traditionally popular domain of scientific publications, which opens up challenging research avenues centred on encoding large, complex documents. In this paper, we introduce a new dataset for summarisation of computer science publications by exploiting a large resource of author-provided summaries, and we show straightforward ways of extending it further. We develop models on the dataset making use of both neural sentence encoding and traditionally used summarisation features, and show that models which encode sentences as well as their local and global context perform best, significantly outperforming well-established baseline methods.
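    As a rough illustration of this supervised extractive setup (not the paper's model or feature set), the sketch below scores sentences with a few classic surface features and a logistic-regression classifier, then keeps the top-scoring sentences; the features, toy training labels, and function names are assumptions.

```python
# A minimal supervised extractive summariser: surface features per
# sentence, a logistic-regression scorer, top-k selection.
import numpy as np
from sklearn.linear_model import LogisticRegression

def features(sentences, title_words):
    feats = []
    for i, sent in enumerate(sentences):
        words = set(sent.lower().split())
        feats.append([
            1.0 / (i + 1),              # position: earlier sentences score higher
            len(sent.split()),          # sentence length in tokens
            len(words & title_words),   # overlap with the title
        ])
    return np.array(feats)

# Toy training data: labels mark sentences chosen for a gold summary.
train_sents = ["We propose a new model.", "Figures follow.", "Results beat baselines."]
title = set("a new summarisation model".split())
clf = LogisticRegression().fit(features(train_sents, title), np.array([1, 0, 1]))

def summarise(sentences, title_words, k=2):
    scores = clf.predict_proba(features(sentences, title_words))[:, 1]
    top = sorted(np.argsort(scores)[-k:])   # keep original sentence order
    return [sentences[i] for i in top]
```

    In the paper's setting the feature vector would additionally include neural sentence encodings of the sentence and its context; the selection step stays the same.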

    Comparison of the language networks from literature and blogs

    In this paper we present a comparison of the linguistic networks constructed from literature and from blog texts. The linguistic networks are constructed from the texts as directed and weighted co-occurrence networks of words: words are nodes, and a link is established between two nodes if the words directly co-occur within a sentence. The comparison of network structure is performed at the global (network) level in terms of average node degree, average shortest path length, diameter, clustering coefficient, density, and number of components. Furthermore, we perform an analysis at the local (node) level by comparing the rank plots of in- and out-degree, strength, and selectivity. The selectivity-based results point out that there are differences between the structure of the networks constructed from literature and those constructed from blogs.
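    The construction lends itself to a compact sketch. The following, assuming adjacency within a sentence counts as direct co-occurrence, builds such a directed weighted network with networkx and computes the out-selectivity (strength divided by degree) used in the local-level analysis.

```python
# Directed, weighted word co-occurrence network: consecutive words in a
# sentence are linked, edge weight counts co-occurrences.
import networkx as nx

def cooccurrence_network(sentences):
    G = nx.DiGraph()
    for sentence in sentences:
        words = sentence.lower().split()
        for u, v in zip(words, words[1:]):   # directly co-occurring pairs
            w = G[u][v]["weight"] + 1 if G.has_edge(u, v) else 1
            G.add_edge(u, v, weight=w)
    return G

G = cooccurrence_network(["the cat sat", "the cat ran", "a dog sat"])
print(nx.density(G), nx.number_weakly_connected_components(G))
for n in G:
    k = G.out_degree(n)
    s = G.out_degree(n, weight="weight")     # out-strength
    if k:
        print(n, s / k)                      # out-selectivity of node n
```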

    Modelling, Visualising and Summarising Documents with a Single Convolutional Neural Network

    Capturing the compositional process which maps the meaning of words to that of documents is a central challenge for researchers in Natural Language Processing and Information Retrieval. We introduce a model that is able to represent the meaning of documents by embedding them in a low-dimensional vector space, while preserving distinctions of word and sentence order crucial for capturing nuanced semantics. Our model is based on an extended Dynamic Convolutional Neural Network, which learns convolution filters at both the sentence and document level, hierarchically learning to capture and compose low-level lexical features into high-level semantic concepts. We demonstrate the effectiveness of this model on a range of document modelling tasks, achieving strong results with no feature engineering and with a more compact model. Inspired by recent advances in visualising deep convolutional networks for computer vision, we present a novel visualisation technique for our document networks which not only provides insight into their learning process, but can also be interpreted to produce a compelling automatic summarisation system for texts.
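    A compressed sketch of the hierarchical idea follows, with plain max pooling standing in for the paper's dynamic k-max pooling and all dimensions chosen arbitrarily; it is an illustration of the two-level composition, not the paper's exact architecture.

```python
# Two-level convolutional composition: one convolution turns word
# vectors into sentence vectors, a second turns sentence vectors into a
# document vector.
import torch
import torch.nn as nn

class HierarchicalDocModel(nn.Module):
    def __init__(self, emb_dim=32, sent_dim=64, doc_dim=128):
        super().__init__()
        self.word_conv = nn.Conv1d(emb_dim, sent_dim, kernel_size=3, padding=1)
        self.sent_conv = nn.Conv1d(sent_dim, doc_dim, kernel_size=3, padding=1)

    def forward(self, doc):                  # doc: (sentences, words, emb_dim)
        x = doc.transpose(1, 2)              # (sentences, emb_dim, words)
        sent_vecs = self.word_conv(x).relu().max(dim=2).values
        y = sent_vecs.t().unsqueeze(0)       # (1, sent_dim, sentences)
        return self.sent_conv(y).relu().max(dim=2).values.squeeze(0)

doc = torch.randn(5, 12, 32)                 # 5 sentences of 12 words each
print(HierarchicalDocModel()(doc).shape)     # torch.Size([128])
```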

    Long-term behavioural change detection through pervasive sensing


    Attentive monitoring of multiple video streams driven by a Bayesian foraging strategy

    In this paper we consider the problem of deploying attention to subsets of video streams in order to collate the most relevant data and information of interest for a given task. We formalise this monitoring problem as a foraging problem and propose a probabilistic framework that models the observer's attentive behaviour as the behaviour of a forager. The forager, moment to moment, focuses its attention on the most informative stream/camera, detects interesting objects or activities, or switches to a more profitable stream. The approach proposed here is suitable for multi-stream video summarisation, and can also serve as a preliminary step for more sophisticated video surveillance, e.g. activity and behaviour analysis. Experimental results on the publicly available UCR Videoweb Activities Dataset are presented to illustrate the utility of the proposed technique.
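    The switching behaviour can be caricatured in a few lines. The sketch below is a loose analogy, not the paper's Bayesian model: the observer stays on a stream while its estimated information yield is above the average over streams, and otherwise jumps to the currently most informative one, in the spirit of foraging patch-leaving rules; `yield_estimate` is a hypothetical stand-in for the framework's informativeness measure.

```python
# Foraging-style stream selection: stay while the current stream's
# yield beats the cross-stream average, otherwise switch to the best.
import random

def yield_estimate(stream_id, t):
    """Hypothetical per-step information yield of a stream."""
    return random.random()

def forage(n_streams, horizon):
    current, schedule = 0, []
    for t in range(horizon):
        yields = [yield_estimate(s, t) for s in range(n_streams)]
        if yields[current] < sum(yields) / n_streams:   # patch depleted
            current = max(range(n_streams), key=yields.__getitem__)
        schedule.append(current)
    return schedule

print(forage(n_streams=4, horizon=10))
```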

    Extractive text summarisation using graph triangle counting approach: proposed method

    With the growing quantity of automatically produced text data, the construction of summarisation systems has become vital. Summarisation systems confine and condense the most vital ideas of papers and help the user find and understand the foremost facts of a text more quickly and easily. A compelling class of such systems are those that create summaries from extracts. This type of summary, called extractive summarisation, is created by choosing significant fragments of the text without making any amendment to the original. One methodology for generating this type of summary uses graph theory, specifically the field of graph pruning/reduction, which seeks the best representation of a graph with a smaller number of nodes and edges. In this paper, a graph reduction technique called the triangle counting approach is presented to choose the most vital sentences of a text. The first phase represents the text as a graph, where nodes are the sentences and edges carry the similarity between sentences. The second phase constructs the triangles and then builds a bit-vector representation, and the final phase retrieves the sentences based on the values of the bit vectors.
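    The phases map directly onto a short sketch: sentences become nodes, Jaccard word overlap above a threshold becomes an edge, and triangle counts rank the sentences. The 0.2 threshold, and using per-node triangle counts in place of the bit-vector step, are assumptions for illustration.

```python
# Sentence similarity graph + triangle-based sentence ranking.
import networkx as nx

def similarity_graph(sentences, threshold=0.2):
    G = nx.Graph()
    G.add_nodes_from(range(len(sentences)))
    tokens = [set(s.lower().split()) for s in sentences]
    for i in range(len(sentences)):
        for j in range(i + 1, len(sentences)):
            union = tokens[i] | tokens[j]
            sim = len(tokens[i] & tokens[j]) / len(union) if union else 0.0
            if sim >= threshold:             # keep only sufficiently similar pairs
                G.add_edge(i, j, weight=sim)
    return G

def summarise(sentences, k=2):
    G = similarity_graph(sentences)
    tri = nx.triangles(G)                    # triangles through each sentence node
    top = sorted(sorted(tri, key=tri.get, reverse=True)[:k])
    return [sentences[i] for i in top]       # top sentences, in original order
```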

    Automatic summarisation of Instagram social network posts: combining semantic and statistical approaches

    The proliferation of data and text documents such as articles, web pages, books, and social network posts on the Internet has created a fundamental challenge in various fields of text processing, under the title of "automatic text summarisation". Manual processing and summarisation of large volumes of textual data is a very difficult, expensive, and time-consuming process for human users, and at scale an impossible one. Text summarisation systems are divided into extractive and abstractive categories. In the extractive summarisation method, the final summary of a text document is extracted from the important sentences of the same document without any modification; this method can repeat near-identical sentences and leave pronoun references unresolved. In the abstractive summarisation method, by contrast, the final summary of a textual document is generated from the meaning and significance of the sentences and words of the same document or of other documents. Many existing works have used extractive or abstractive methods to summarise collections of web documents, each with advantages and disadvantages in the results obtained in terms of similarity or size. In this work, a crawler has been developed to extract popular text posts from the Instagram social network with appropriate preprocessing, and a set of extractive and abstractive algorithms has been combined to show how each can best be used. Observations made on 820 popular text posts from the Instagram social network show an accuracy of 80% for the proposed system.
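    As a schematic of such a combined pipeline (not the system evaluated here), the sketch below runs a frequency-based extractive stage and hands the result to an abstractive stage; `abstractive_rewrite` is an explicitly hypothetical placeholder for whatever abstractive model is plugged in.

```python
# Hybrid extractive-then-abstractive summarisation pipeline (sketch).
from collections import Counter

def extractive_stage(sentences, k=3):
    """Keep the k sentences with the highest average word frequency."""
    freq = Counter(w for s in sentences for w in s.lower().split())
    def score(s):
        words = s.lower().split()
        return sum(freq[w] for w in words) / max(len(words), 1)
    return sorted(sentences, key=score, reverse=True)[:k]

def abstractive_rewrite(text):
    """Hypothetical stand-in for an abstractive model (e.g. a seq2seq)."""
    return text                              # identity placeholder

def summarise_post(post_sentences):
    extracted = extractive_stage(post_sentences)
    return abstractive_rewrite(" ".join(extracted))
```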