212,368 research outputs found

    Large-scale structure of time evolving citation networks

    Full text link
    In this paper we examine a number of methods for probing and understanding the large-scale structure of networks that evolve over time. We focus in particular on citation networks, networks of references between documents such as papers, patents, or court cases. We describe three different methods of analysis, one based on an expectation-maximization algorithm, one based on modularity optimization, and one based on eigenvector centrality. Using the network of citations between opinions of the United States Supreme Court as an example, we demonstrate how each of these methods can reveal significant structural divisions in the network, and how, ultimately, the combination of all three can help us develop a coherent overall picture of the network's shape.Comment: 10 pages, 6 figures; journal names for 4 references fixe

    Search Efficient Binary Network Embedding

    Full text link
    Traditional network embedding primarily focuses on learning a dense vector representation for each node, which encodes network structure and/or node content information, such that off-the-shelf machine learning algorithms can be easily applied to the vector-format node representations for network analysis. However, the learned dense vector representations are inefficient for large-scale similarity search, which requires to find the nearest neighbor measured by Euclidean distance in a continuous vector space. In this paper, we propose a search efficient binary network embedding algorithm called BinaryNE to learn a sparse binary code for each node, by simultaneously modeling node context relations and node attribute relations through a three-layer neural network. BinaryNE learns binary node representations efficiently through a stochastic gradient descent based online learning algorithm. The learned binary encoding not only reduces memory usage to represent each node, but also allows fast bit-wise comparisons to support much quicker network node search compared to Euclidean distance or other distance measures. Our experiments and comparisons show that BinaryNE not only delivers more than 23 times faster search speed, but also provides comparable or better search quality than traditional continuous vector based network embedding methods

    Prediction of Emerging Technologies Based on Analysis of the U.S. Patent Citation Network

    Full text link
    The network of patents connected by citations is an evolving graph, which provides a representation of the innovation process. A patent citing another implies that the cited patent reflects a piece of previously existing knowledge that the citing patent builds upon. A methodology presented here (i) identifies actual clusters of patents: i.e. technological branches, and (ii) gives predictions about the temporal changes of the structure of the clusters. A predictor, called the {citation vector}, is defined for characterizing technological development to show how a patent cited by other patents belongs to various industrial fields. The clustering technique adopted is able to detect the new emerging recombinations, and predicts emerging new technology clusters. The predictive ability of our new method is illustrated on the example of USPTO subcategory 11, Agriculture, Food, Textiles. A cluster of patents is determined based on citation data up to 1991, which shows significant overlap of the class 442 formed at the beginning of 1997. These new tools of predictive analytics could support policy decision making processes in science and technology, and help formulate recommendations for action

    Communities, Knowledge Creation, and Information Diffusion

    Get PDF
    In this paper, we examine how patterns of scientific collaboration contribute to knowledge creation. Recent studies have shown that scientists can benefit from their position within collaborative networks by being able to receive more information of better quality in a timely fashion, and by presiding over communication between collaborators. Here we focus on the tendency of scientists to cluster into tightly-knit communities, and discuss the implications of this tendency for scientific performance. We begin by reviewing a new method for finding communities, and we then assess its benefits in terms of computation time and accuracy. While communities often serve as a taxonomic scheme to map knowledge domains, they also affect how successfully scientists engage in the creation of new knowledge. By drawing on the longstanding debate on the relative benefits of social cohesion and brokerage, we discuss the conditions that facilitate collaborations among scientists within or across communities. We show that successful scientific production occurs within communities when scientists have cohesive collaborations with others from the same knowledge domain, and across communities when scientists intermediate among otherwise disconnected collaborators from different knowledge domains. We also discuss the implications of communities for information diffusion, and show how traditional epidemiological approaches need to be refined to take knowledge heterogeneity into account and preserve the system's ability to promote creative processes of novel recombinations of idea

    Development of Computer Science Disciplines - A Social Network Analysis Approach

    Full text link
    In contrast to many other scientific disciplines, computer science considers conference publications. Conferences have the advantage of providing fast publication of papers and of bringing researchers together to present and discuss the paper with peers. Previous work on knowledge mapping focused on the map of all sciences or a particular domain based on ISI published JCR (Journal Citation Report). Although this data covers most of important journals, it lacks computer science conference and workshop proceedings. That results in an imprecise and incomplete analysis of the computer science knowledge. This paper presents an analysis on the computer science knowledge network constructed from all types of publications, aiming at providing a complete view of computer science research. Based on the combination of two important digital libraries (DBLP and CiteSeerX), we study the knowledge network created at journal/conference level using citation linkage, to identify the development of sub-disciplines. We investigate the collaborative and citation behavior of journals/conferences by analyzing the properties of their co-authorship and citation subgraphs. The paper draws several important conclusions. First, conferences constitute social structures that shape the computer science knowledge. Second, computer science is becoming more interdisciplinary. Third, experts are the key success factor for sustainability of journals/conferences
    • …
    corecore