
    Hierarchical Multiresolution Feature- and Prior-based Graphs for Classification

    To incorporate spatial (neighborhood) and bidirectional hierarchical relationships, as well as the features and priors of the samples, into their classification, we formulated the classification problem on three variants of multiresolution neighborhood graphs and on the graph of a hierarchical conditional random field. Each of these graphs was weighted and undirected and could thus incorporate the spatial or hierarchical relationships in all directions. In addition, each variant of the proposed neighborhood graphs was composed of a spatial feature-based subgraph and an aspatial prior-based subgraph. Each variant expanded on a random walker graph by using novel mechanisms to derive the edge weights of its spatial feature-based subgraph. These mechanisms included implicit and explicit edge detection to enhance the detection of weak boundaries between different classes in the spatial domain. The implicit edge detection relied on the outlier detection capability of Tukey's function and on the classification reliabilities of the samples estimated by a hierarchical random forest classifier. A similar mechanism was used to derive the edge weights, and thus the energy function, of the hierarchical conditional random field. In this way, the classification problem reduced to solving a system of linear equations and minimizing the energy function, both of which could be done with fast and efficient techniques.
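    The abstract does not state which form of Tukey's function is used or how it maps to edge weights. As a rough sketch only, assuming the standard Tukey biweight weight function with the conventional tuning constant `c = 4.685`, and assuming (hypothetically) that the input is a standardized feature difference between two neighboring samples, large differences are treated as outliers and receive zero weight:

    ```python
    def tukey_biweight(residual, c=4.685):
        """Standard Tukey biweight weight: (1 - (r/c)^2)^2 inside [-c, c], else 0.

        Using it as an edge weight is an illustrative assumption, not the
        authors' stated formula.
        """
        if abs(residual) > c:
            return 0.0  # treated as an outlier, e.g. a likely class boundary
        u = residual / c
        return (1.0 - u * u) ** 2

    same_region = tukey_biweight(0.0)     # identical features: full weight
    across_boundary = tukey_biweight(6.0) # beyond the cutoff: no weight
    ```

    Under this assumption, an edge crossing a strong feature discontinuity contributes nothing to the graph, which is one way a weak boundary could be made explicit.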

    Describing and Understanding Neighborhood Characteristics through Online Social Media

    Geotagged data can be used to describe regions in the world and discover local themes. However, not all data produced within a region is necessarily descriptive of that area. To surface the content that is characteristic of a region, we present the geographical hierarchy model (GHM), a probabilistic model based on the assumption that data observed in a region is a random mixture of content that pertains to different levels of a hierarchy. We apply the GHM to a dataset of 8 million Flickr photos in order to discriminate between content (i.e., tags) that specifically characterizes a region (e.g., a neighborhood) and content that characterizes surrounding areas or more general themes. Knowledge of the discriminative and non-discriminative terms used throughout the hierarchy enables us to quantify the uniqueness of a given region and to compare similar but distant regions. Our evaluation demonstrates that our model improves upon traditional Naive Bayes classification by 47% and hierarchical TF-IDF by 27%. We further highlight the differences and commonalities with human reasoning about what is locally characteristic of a neighborhood, distilled from ten interviews and a survey that covered themes such as time, events, and prior regional knowledge. Comment: Accepted in WWW 2015, Florence, Italy.

    Investigating Extensions to Random Walk Based Graph Embedding

    Graph embedding has recently gained momentum in the research community, in particular after the introduction of random walk and neural network based approaches. However, most embedding approaches focus on representing the local neighborhood of nodes and fail to capture the global graph structure, i.e., to retain the relations to distant nodes. To counter that problem, we propose a novel extension to random walk based graph embedding, which removes a percentage of the least frequent nodes from the walks at different levels. Through this removal, more distant nodes come to reside in the close neighborhood of a node, so their connection is represented explicitly. Besides the common evaluation tasks for graph embeddings, such as node classification and link prediction, we evaluate and compare our approach against related methods on shortest path approximation. The results indicate that extensions to random walk based methods (including our own) improve the predictive performance only slightly, if at all.
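    The node-removal step described in this abstract can be illustrated with a minimal sketch. The function name `prune_walks`, the tie-breaking rule, and the toy walks are assumptions made for illustration; the abstract does not specify how frequencies are counted or how the levels are chosen.

    ```python
    from collections import Counter

    def prune_walks(walks, fraction):
        """Remove the given fraction of least frequent nodes from every walk."""
        counts = Counter(node for walk in walks for node in walk)
        # Rank nodes from least to most frequent; ties are broken by node
        # label so the result is deterministic (an assumption of this sketch).
        ranked = sorted(counts, key=lambda n: (counts[n], n))
        removed = set(ranked[:int(len(ranked) * fraction)])
        return [[n for n in walk if n not in removed] for walk in walks]

    walks = [["a", "b", "c", "a"], ["a", "c", "d", "c"], ["c", "a", "e", "a"]]
    pruned = prune_walks(walks, 0.4)
    # In the first walk, "a" and "c" were separated by "b"; after pruning
    # they are adjacent, so a context-window model sees them as neighbors.
    ```

    Applying the function with several increasing fractions would give one set of walks per level, which is one possible reading of "at different levels".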

    TPM: Transition probability matrix - Graph structural feature based embedding

    In this work, the Transition Probability Matrix (TPM) is proposed as a new method for extracting the features of nodes in a graph. The proposed method uses random walks to capture the connectivity structure of a node's close neighborhood. The information obtained from random walks is converted to anonymous walks to extract the topological features of nodes. Anonymous walks are used in the node embedding process because they capture topological similarities in connectivity better than random walks do. The resulting embedding vectors therefore carry richer information about the underlying connectivity structure. The method is applied to node classification and link prediction tasks. The performance of the proposed algorithm is superior to that of state-of-the-art algorithms in the recent literature. Moreover, the information extracted about the connectivity structure of similar networks is used for link prediction and node classification on a completely new graph.
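    The conversion from a random walk to an anonymous walk can be sketched as follows. This uses the usual definition, in which each node is replaced by the index of its first appearance in the walk; the function name and the zero-based indexing are choices of this sketch, and the abstract does not describe how TPM builds its matrix from these walks.

    ```python
    def to_anonymous_walk(walk):
        """Replace each node by the order in which it first appears in the walk."""
        first_seen = {}
        anonymous = []
        for node in walk:
            if node not in first_seen:
                first_seen[node] = len(first_seen)  # next unused index
            anonymous.append(first_seen[node])
        return tuple(anonymous)

    walk_1 = to_anonymous_walk(["u", "v", "u", "w"])
    walk_2 = to_anonymous_walk(["x", "y", "x", "z"])
    # Both walks visit different nodes but share the same pattern
    # (step out, step back, step to a new node), so they map to one tuple.
    ```

    Because node identities are discarded, two nodes in different graphs with similar local structure produce similar anonymous walks, which is consistent with the transfer to a completely new graph mentioned above.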