Record-Linkage from a Technical Point of View

Author: Rainer Schnell

Record linkage is used for preparing sampling frames, deduplication of lists and combining information on the same object from two different databases. If the identifiers of the same objects in two different databases have error free unique common identifiers like personal identification numbers (PID), record linkage is a simple file merge operation. If the identifiers contain errors, record linkage is a challenging task. In many applications, the files have widely different numbers of observations, for example a few thousand records of a sample survey and a few million records of an administrative database of social security numbers. Available software, privacy issues and future research topics are discussed.

Keywords: Record-Linkage, Data-mining, Privacy preserving protocols
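The two cases the abstract contrasts, error-free identifiers versus identifiers with errors, can be illustrated with a minimal sketch. It is not taken from the paper: the field names `pid` and `name`, the helper names `exact_link` and `fuzzy_link`, the sample records, and the 0.85 similarity threshold are all assumptions made for illustration, and the string similarity comes from Python's standard `difflib` rather than from any dedicated record-linkage software.

```python
from difflib import SequenceMatcher


def exact_link(left, right, key="pid"):
    """Link records whose identifier matches exactly (a plain file merge)."""
    index = {rec[key]: rec for rec in right}
    return [(rec, index[rec[key]]) for rec in left if rec[key] in index]


def fuzzy_link(left, right, field="name", threshold=0.85):
    """Link each left record to its most similar right record.

    Similarity is difflib's ratio on lower-cased strings; a pair is kept
    only if the best score reaches the (assumed) threshold.
    """
    links = []
    for l_rec in left:
        best, best_score = None, 0.0
        for r_rec in right:
            score = SequenceMatcher(
                None, l_rec[field].lower(), r_rec[field].lower()
            ).ratio()
            if score > best_score:
                best, best_score = r_rec, score
        if best is not None and best_score >= threshold:
            links.append((l_rec, best, best_score))
    return links


# Hypothetical sample data: a small survey file and a larger administrative file.
survey = [
    {"pid": "A1", "name": "Rainer Schnell"},
    {"pid": "B2", "name": "Jon Smith"},
]
admin = [
    {"pid": "A1", "name": "Rainer Schnell"},
    {"pid": "B9", "name": "John Smith"},
    {"pid": "C3", "name": "Maria Lopez"},
]

exact_pairs = exact_link(survey, admin)
fuzzy_pairs = fuzzy_link(survey, admin)
```

The exact merge finds only the record whose `pid` agrees in both files, while the similarity-based pass also pairs "Jon Smith" with "John Smith". The nested loop compares every survey record with every administrative record, so its cost grows with the product of the two file sizes, which matters when one file holds millions of records.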

Research Papers in Economics

The relation between Pearson's correlation coefficient r and Salton's cosine measure

Author: Egghe Leo
Leydesdorff Loet
Publication venue: 'Elsevier BV'
Publication date: 01/01/2009

The relation between Pearson's correlation coefficient and Salton's cosine measure is revealed based on the different possible values of the division of the L1-norm and the L2-norm of a vector. These different values yield a sheaf of increasingly straight lines which together form a cloud of points, being the investigated relation. The theoretical results are tested against the author co-citation relations among 24 informetricians for whom two matrices can be constructed, based on co-citations: the asymmetric occurrence matrix and the symmetric co-citation matrix. Both examples completely confirm the theoretical results. The results enable us to specify an algorithm which provides a threshold value for the cosine above which none of the corresponding Pearson correlations would be negative. Using this threshold value can be expected to optimize the visualization of the vector space.
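A small sketch may help make the link between the two measures concrete. It relies on the standard identity that Pearson's r equals the cosine of the mean-centered vectors; it does not reproduce the threshold algorithm from the paper. The function names and the example vectors are assumptions chosen for illustration.

```python
import math


def cosine(x, y):
    """Salton's cosine: dot product divided by the product of L2-norms."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def pearson(x, y):
    """Pearson's r, computed as the cosine of the mean-centered vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine([a - mean_x for a in x], [b - mean_y for b in y])


# Hypothetical non-negative count vectors with opposite trends.
x = [1, 2, 3]
y = [3, 2, 1]

cos_xy = cosine(x, y)   # 10/14: non-negative vectors never give a negative cosine
r_xy = pearson(x, y)    # -1: centering exposes the opposite trend
```

For non-negative vectors the cosine cannot be negative while Pearson's r can, which is why a cosine threshold that excludes negative correlations is of interest.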

arXiv.org e-Print Archive

International Migration, Integration and Social Cohesion online publications

GraphMaps: Browsing Large Graphs as Interactive Maps

Author: Chen Xiaoji
Holroyd Alexander E.
Lee Bongshin
Nachmanson Lev
Prutkin Roman
Riche Nathalie Henry
Publication date: 01/01/2015

Algorithms for laying out large graphs have seen significant progress in the past decade. However, browsing large graphs remains a challenge. Rendering thousands of graphical elements at once often results in a cluttered image, and navigating these elements naively can cause disorientation. To address this challenge we propose a method called GraphMaps, mimicking the browsing experience of online geographic maps. GraphMaps creates a sequence of layers, where each layer refines the previous one. During graph browsing, GraphMaps chooses the layer corresponding to the zoom level, and renders only those entities of the layer that intersect the current viewport. The result is that, regardless of the graph size, the number of entities rendered at each view does not exceed a predefined threshold, yet all graph elements can be explored by the standard zoom and pan operations. GraphMaps preprocesses a graph in such a way that during browsing, the geometry of the entities is stable, and the viewer is responsive. Our case studies indicate that GraphMaps is useful in gaining an overview of a large graph, and also in exploring a graph on a finer level of detail.

Comment: submitted to GD 201
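The browsing step described in the abstract (pick the layer for the current zoom level, then draw only what intersects the viewport) can be sketched roughly as follows. This is not the GraphMaps implementation: the `Node` type, the function names, the rule of one layer per doubling of the zoom factor, and the treatment of entities as points are all assumptions, and the preprocessing that builds the layers and enforces the entity threshold is not shown.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    x: float
    y: float


def layer_for_zoom(zoom, layer_count):
    """Map a zoom factor to a layer index (assumed: one layer per doubling)."""
    level = int(math.log2(max(zoom, 1.0)))
    return min(level, layer_count - 1)


def visible_nodes(layers, zoom, viewport):
    """Return the nodes of the zoom-selected layer that lie inside the viewport.

    viewport is (xmin, ymin, xmax, ymax); each layer is assumed to contain
    every node of the previous layer plus additional detail.
    """
    xmin, ymin, xmax, ymax = viewport
    layer = layers[layer_for_zoom(zoom, len(layers))]
    return [n for n in layer if xmin <= n.x <= xmax and ymin <= n.y <= ymax]


# Hypothetical two-layer graph: the coarse layer shows only the hub.
hub, a, b = Node("hub", 0, 0), Node("a", 1, 1), Node("b", 5, 5)
layers = [[hub], [hub, a, b]]

overview = visible_nodes(layers, 1.0, (-10, -10, 10, 10))
detail = visible_nodes(layers, 2.0, (-2, -2, 2, 2))
```

At zoom factor 1 the whole area is in view but only the coarse layer is drawn; at zoom factor 2 the finer layer is used, yet the smaller viewport excludes node `b`, so the number of drawn entities stays small in both views.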

arXiv.org e-Print Archive

Graph Drawing E-print Archive

Record-linkage from a technical point of view

Author: Schnell Rainer
Publication venue: 'Botanic Garden & Botanical Museum Berlin-Dahlem BGBM'
Publication date: 14/01/2014

"Record linkage is used for preparing sampling frames, deduplication of lists and combining information on the same object from two different databases. If the identifiers of the same objects in two different databases have error free unique common identifiers like personal identification numbers (PID), record linkage is a simple file merge operation. If the identifiers contain errors, record linkage is a challenging task. In many applications, the files have widely different numbers of observations, for example a few thousand records of a sample survey and a few million records of an administrative database of social security numbers. Available software, privacy issues and future research topics are discussed." [author's abstract]

SSOAR - Social Science Open Access Repository

Animating the development of Social Networks over time using a dynamic extension of multidimensional scaling

Author: De Nooy Wouter
Leydesdorff Loet
Schank Thomas
Scharnhorst Andrea
Publication date: 01/01/2008

The animation of network visualizations poses technical and theoretical challenges. Rather stable patterns are required before the mental map enables a user to make inferences over time. In order to enhance stability, we developed an extension of stress-minimization with developments over time. This dynamic layouter is no longer based on linear interpolation between independent static visualizations, but change over time is used as a parameter in the optimization. Because of our focus on structural change versus stability the attention is shifted from the relational graph to the latent eigenvectors of matrices. The approach is illustrated with animations for the journal citation environments of Social Networks, the (co-)author networks in the carrying community of this journal, and the topical development using relations among its title words. Our results are also compared with animations based on PajekToSVGAnim and SoNIA.

arXiv.org e-Print Archive

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Scipedia

International Migration, Integration and Social Cohesion online publications

UvA-DARE