
Analyzing the Family Tree

By Michael P. Jones and E North Temple St

Abstract

FamilySearch holds one of the largest collections of linked family history data in the world. Nearly one billion records of individuals, both deceased and living, have been recorded and placed together into a common tree (“The Family Tree”). This ancestral relationship graph is the largest family history network analyzed to date. Using common graph analysis techniques, we have found a number of interesting properties in the network. We examine the topology of the graph by calculating its connected components. The network consists of one giant component containing many millions of records, plus millions of very small components. We also describe how this topology has changed over time. The paper further describes how an analysis of the strongly connected components and the graph’s diameter can be used to assess the quality of the data. Finally, we describe a heuristic algorithm to determine the “connectedness” of our patrons and find that those who have logged into the system are significantly more connected than those who have not: one third of the potential users are connected to the giant component, compared with 80% of the active users. We discuss how this analysis could potentially be used to partition the graph to support scaling or distributing the system.
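The component analysis described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the edge list, the identifiers, and the function names below are hypothetical, and the reading that a strongly connected component with more than one person signals a data error (someone recorded as their own ancestor) is an assumption about how such a check might work. Union-find groups records into connected components, and Kosaraju's algorithm finds the strongly connected components of the directed parent-to-child graph.

```python
from collections import defaultdict


def connected_components(edges):
    """Group nodes into (weakly) connected components using union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        root = x
        while parent[root] != root:
            root = parent[root]
        while parent[x] != root:  # path compression
            parent[x], x = root, parent[x]
        return root

    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return sorted(groups.values(), key=len, reverse=True)


def strongly_connected_components(edges):
    """Kosaraju's algorithm (iterative) on directed parent -> child edges."""
    graph, reverse = defaultdict(list), defaultdict(list)
    nodes = set()
    for parent_id, child_id in edges:
        graph[parent_id].append(child_id)
        reverse[child_id].append(parent_id)
        nodes.update((parent_id, child_id))

    # Pass 1: record nodes in order of DFS finish time.
    order, seen = [], set()
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(graph[start]))]
        while stack:
            node, children = stack[-1]
            for child in children:
                if child not in seen:
                    seen.add(child)
                    stack.append((child, iter(graph[child])))
                    break
            else:
                order.append(node)
                stack.pop()

    # Pass 2: DFS on the reversed graph, latest finish time first.
    components, assigned = [], set()
    for start in reversed(order):
        if start in assigned:
            continue
        assigned.add(start)
        component, stack = {start}, [start]
        while stack:
            node = stack.pop()
            for prev in reverse[node]:
                if prev not in assigned:
                    assigned.add(prev)
                    component.add(prev)
                    stack.append(prev)
        components.append(component)
    return components


# Hypothetical toy data: (parent, child) pairs. G and H are each
# recorded as the other's parent, which is impossible in a real tree.
edges = [("A", "C"), ("B", "C"), ("C", "D"),
         ("E", "F"), ("G", "H"), ("H", "G")]

components = connected_components(edges)
total = sum(len(c) for c in components)
giant_share = len(components[0]) / total  # fraction in largest component
suspicious = [c for c in strongly_connected_components(edges) if len(c) > 1]
```

On this toy input the largest component holds four of the eight people, and the only strongly connected component with more than one member is the G/H pair. At the scale the abstract mentions, a single-machine sketch like this would likely need to be replaced by a distributed or external-memory approach.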

Topics: Family History, Relationship Graph
Year: 2013
OAI identifier: oai:CiteSeerX.psu:10.1.1.352.4948
Provided by: CiteSeerX