The use of networks to integrate different genetic, proteomic, and metabolic
datasets has been proposed as a viable path toward elucidating the origins of
specific diseases. Here we introduce a new phenotypic database summarizing
correlations obtained from the disease history of more than 30 million patients
in a Phenotypic Disease Network (PDN). We present evidence that the structure
of the PDN is relevant to the understanding of illness progression by showing
that (1) patients develop diseases close in the network to those they already
have; (2) the progression of disease along the links of the network is
different for patients of different genders and ethnicities; (3) patients
diagnosed with diseases that are more highly connected in the PDN tend to die
sooner than those affected by less connected diseases; and (4) diseases that
tend to be preceded by others in the PDN are typically more connected than
diseases that precede other illnesses, and are associated with higher
mortality. Our findings show that disease progression can be represented and
studied using network methods, offering the potential to enhance our
understanding of the origin and evolution of human diseases. The dataset
introduced here, released concurrently with this publication, represents the
largest relational phenotypic resource publicly available to the research
community.

Comment: 28 pages (double-spaced), 6 figures