
    Graph Summarization

    The continuous and rapid growth of highly interconnected datasets, which are both voluminous and complex, calls for the development of adequate processing and analytical techniques. One method for condensing and simplifying such datasets is graph summarization. It denotes a series of application-specific algorithms designed to transform graphs into more compact representations while preserving structural patterns, query answers, or specific property distributions. As this problem is common to several areas studying graph topologies, different approaches, such as clustering, compression, sampling, or influence detection, have been proposed, primarily based on statistical and optimization methods. The aim of our chapter is to pinpoint the main graph summarization methods, with particular attention to the most recent approaches and novel research trends on this topic not yet covered by previous surveys. Comment: To appear in the Encyclopedia of Big Data Technologies
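To make the idea of a "more compact representation" concrete, the following is a minimal sketch of one simple grouping-based summarization: nodes with identical neighbour sets are merged into a supernode. This is a generic illustration, not a method taken from the chapter; the function name `summarize_by_neighborhood` and the adjacency-dictionary input format are assumptions made for the example.

```python
from collections import defaultdict

def summarize_by_neighborhood(adj):
    """Merge nodes that share exactly the same neighbour set.

    `adj` is an undirected graph given as {node: set_of_neighbours};
    every neighbour must also appear as a key (assumed input format).
    """
    # Group nodes by their (hashable) neighbour set.
    groups = defaultdict(list)
    for node, nbrs in adj.items():
        groups[frozenset(nbrs)].append(node)

    # Each group becomes one supernode, represented as a sorted tuple.
    supernodes = sorted(tuple(sorted(g)) for g in groups.values())
    member_of = {n: s for s in supernodes for n in s}

    # A superedge exists wherever any original edge connects two groups.
    superedges = set()
    for node, nbrs in adj.items():
        for nb in nbrs:
            a, b = member_of[node], member_of[nb]
            superedges.add((min(a, b), max(a, b)))
    return supernodes, sorted(superedges)

# A star graph: hub "h" linked to three leaves with identical neighbourhoods.
star = {"h": {"a", "b", "c"}, "a": {"h"}, "b": {"h"}, "c": {"h"}}
supernodes, superedges = summarize_by_neighborhood(star)
```

Here the four-node star collapses to two supernodes joined by one superedge, and the original edges can be recovered exactly because all merged nodes had the same neighbours.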

    A behavioural model of the adoption and use of new telecommunications media: the effects of communication scenarios and media product/service attributes

    Recent years have seen the dramatic growth of new modes of communication. Beyond using land lines and mobile phones for real-time voice communication, people spend increasing amounts of time receiving and sending messages through social networks (e.g. Myspace or Facebook) and through real-time communication software (e.g. Skype or MSN). Given the significant decline in land-line and mobile call volumes from 2000 to 2006 in the UK and in Taiwan, we conjecture that consumers are moving to these new forms of communication to satisfy their communication needs, diminishing the demand for established channels. The purpose of this research is to develop a behavioural model to analyse the perceived value and weight of the specific media attributes that drive people to adopt or use these new communication channels. Seven telecommunications media available in 2010 are categorised in this research: land-line, mobile phone, short message service (SMS), e-mail, Internet telephony, instant messaging and social networking. Various media product/service attributes that might affect consumers' media choice, such as synchronicity, multi-tasking, price, quality, mobility, privacy and video, were first identified. Importantly, this research designed six types of communication scenarios for an online survey, which drew 894 valid responses, to clarify the effects of different communication aims and to distinguish consumers' intended behaviours toward these telecommunications media. Keywords: Multi-attribute choice model, Telecommunications media, Communication scenario, New product adoption, Substitution effect, ICT forecasting

    Multi-Source Spatial Entity Linkage

    Besides the traditional cartographic data sources, spatial information can also be derived from location-based sources. However, even though different location-based sources refer to the same physical world, each one has only partial coverage of the spatial entities, describes them with different attributes, and sometimes provides contradicting information. Hence, we introduce the spatial entity linkage problem, which finds the pairs of spatial entities that belong to the same physical spatial entity. Our proposed solution (QuadSky) starts with a time-efficient spatial blocking technique (QuadFlex), compares pairwise the spatial entities in the same block, ranks the pairs using Pareto optimality with the SkyRank algorithm, and finally classifies the pairs with our novel SkyEx-* family of algorithms, which yield 0.85 precision and 0.85 recall for a manually labeled dataset of 1,500 pairs and 0.87 precision and 0.6 recall for a semi-manually labeled dataset of 777,452 pairs. Moreover, we provide a theoretical guarantee and formalize the SkyEx-FES algorithm, which explores only 27% of the skylines without any loss in F-measure. Furthermore, our fully unsupervised algorithm SkyEx-D approximates the optimal result with an F-measure loss of just 0.01. Finally, QuadSky provides the best trade-off between precision and recall and the best F-measure compared to the existing baselines and clustering techniques, and approximates the results of supervised learning solutions.
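The abstract mentions ranking candidate pairs by Pareto optimality. As a rough illustration of that general idea only, the sketch below peels successive skylines (Pareto fronts) off a set of candidate pairs scored on several similarity dimensions. It is not the SkyRank or SkyEx algorithm; the function names, the pair identifiers, and the two-dimensional similarity scores are all hypothetical.

```python
from typing import Dict, List, Tuple

Scores = Tuple[float, ...]

def dominates(a: Scores, b: Scores) -> bool:
    """True if `a` is at least as good as `b` in every dimension
    and strictly better in at least one (higher is better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def skyline_layers(pairs: Dict[str, Scores]) -> List[List[str]]:
    """Repeatedly extract the non-dominated pairs; layer 0 is the best."""
    remaining = dict(pairs)
    layers = []
    while remaining:
        # A pair is on the current skyline if no remaining pair dominates it.
        front = sorted(
            key for key, score in remaining.items()
            if not any(dominates(other, score) for other in remaining.values())
        )
        layers.append(front)
        for key in front:
            del remaining[key]
    return layers

# Hypothetical candidate pairs scored on (name similarity, address similarity).
candidates = {
    "p1": (0.9, 0.8),
    "p2": (0.7, 0.9),
    "p3": (0.6, 0.6),
    "p4": (0.2, 0.1),
}
layers = skyline_layers(candidates)
```

In this toy input, `p1` and `p2` share the first skyline because neither beats the other on both dimensions, while `p3` and `p4` fall into later layers. A classifier could then place a cut-off between layers, which is the kind of decision the abstract attributes to the SkyEx-* family.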

    An Algorithm for Data Reorganization in a Multi-dimensional Index

    In spatial databases, data are associated with spatial coordinates and are retrieved based on spatial proximity. A spatial database uses spatial indexes to optimize spatial queries. Essential ingredients for efficient spatial query processing are the spatial clustering and reorganization of data. Traditional clustering algorithms and reorganization utilities fall short in performance and execution. To solve this problem, we have developed an algorithm that converts a two-dimensional spatial index into a single-dimensional value, after which the spatial data are reorganized. This report describes the algorithm as well as various experiments to validate its effectiveness.
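The abstract does not say which mapping the report uses to turn two dimensions into one. One common choice is a Z-order (Morton) code, which interleaves the bits of the two coordinates; the sketch below assumes that choice purely for illustration, along with non-negative integer coordinates and the hypothetical helper name `interleave_bits`.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Return the Z-order (Morton) code of non-negative integers x and y.

    Bit i of x goes to position 2*i, bit i of y to position 2*i + 1.
    """
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code

# Reorganize records by sorting on the one-dimensional key, so that
# points close together in 2D tend to be stored near each other.
points = [(3, 5), (0, 0), (1, 1), (2, 2)]
reorganized = sorted(points, key=lambda p: interleave_bits(p[0], p[1]))
```

Sorting on the interleaved key is what makes the data physically clustered: a range scan over the one-dimensional key then touches mostly nearby points, although a Z-order curve does not preserve proximity perfectly.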