Data Mining of Online Genealogy Datasets for Revealing Lifespan Patterns in Human Population
Online genealogy datasets contain extensive information about millions of
people and their past and present family connections. This vast amount of data
can assist in identifying various patterns in human population. In this study,
we present methods and algorithms which can assist in identifying variations in
lifespan distributions of human population in the past centuries, in detecting
social and genetic features which correlate with human lifespan, and in
constructing predictive models of human lifespan based on various features
which can easily be extracted from genealogy datasets.
We have evaluated the presented methods and algorithms on a large online
genealogy dataset with over a million profiles and over 9 million connections,
all of which were collected from the WikiTree website. Our findings indicate
that significant but small positive correlations exist between the parents'
lifespan and their children's lifespan. Additionally, we found slightly higher
and significant correlations between the lifespans of spouses. We also
discovered a very small positive and significant correlation between longevity
and reproductive success in males, and a small and significant negative
correlation between longevity and reproductive success in females. Moreover,
our machine learning algorithms achieved better-than-random classification
results in predicting which people who outlive the age of 50 will also outlive
the age of 80.
We believe that this study will be the first of many studies which utilize
the wealth of data on human populations, existing in online genealogy datasets,
to better understand factors which influence human lifespan. Understanding
these factors can assist scientists in providing solutions for successful
aging.
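The parent-child and spousal findings above are correlation coefficients between paired lifespans. As a minimal sketch of that kind of measurement, assuming a Pearson coefficient (the abstract does not name the statistic) and using hypothetical lifespan pairs rather than WikiTree data:

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical (parent lifespan, child lifespan) pairs in years.
# These values are made up for illustration; they are not from the study.
pairs = [(62, 70), (75, 68), (81, 79), (55, 61), (90, 72), (68, 83)]
parent_ages = [p for p, _ in pairs]
child_ages = [c for _, c in pairs]

r = pearson(parent_ages, child_ages)
print(f"parent-child lifespan correlation: {r:.3f}")
```

On a real dataset the same function would be applied to each relationship type (parent-child, spouse-spouse) separately, with a significance test added on top.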
Quantitative Analysis of Genealogy Using Digitised Family Trees
Driven by the popularity of television shows such as "Who Do You Think You
Are?", many millions of users have uploaded their family trees to web projects
such as WikiTree. Analysis of this corpus enables us to investigate genealogy
computationally. The study of heritage in the social sciences has led to an
increased understanding of ancestry and descent, but such efforts are hampered
by difficult-to-access data. Genealogical research is typically a tedious
process involving trawling through sources such as birth and death
certificates, wills, letters and land deeds. Decades of research have developed
and examined hypotheses on population sex ratios, marriage trends, fertility,
lifespan, and the frequency of twins and triplets. These can now be tested on
vast datasets containing many billions of entries using machine learning tools.
Here we survey the use of genealogy data mining using family trees dating back
centuries and featuring profiles on nearly 7 million individuals based in over
160 countries. These data are not typically created by trained genealogists and
so we verify them with reference to third party censuses. We present results on
a range of aspects of population dynamics. Our approach extends the boundaries
of genealogical inquiry to the precise measurement of underlying human phenomena.
Prediction of Life Expectancy for Asian Population Using Machine Learning Algorithms
Predicting life expectancy has become more important as life has become more vulnerable to many factors, including social, economic, environmental, educational, lifestyle, and health conditions. Many studies on life expectancy have been carried out; however, studies focusing on the Asian population are limited. This study presents machine learning algorithms for predicting life expectancy based on an Asian population dataset. Comparisons are made between tree classifier models, namely J48, Random Tree, and Random Forest. Cross-validation with 10 and 20 folds is used. Results show that the highest accuracy, 84%, is obtained with Random Forest under 10-fold cross-validation. This study further identifies the most significant factors that influence life expectancy prediction, which include socioeconomic factors and educational status, as well as health conditions and infectious disease.
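The evaluation protocol described above, k-fold cross-validation with 10 and 20 folds, can be sketched as follows. This is an illustration under stated assumptions, not the study's setup: a one-feature decision stump stands in for the J48, Random Tree, and Random Forest classifiers, and the feature values and labels are hypothetical toy data.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_stump(xs, ys):
    """Pick the threshold t that best fits the rule: predict 1 when x >= t."""
    best_t, best_correct = None, -1
    for t in sorted(set(xs)):
        correct = sum((x >= t) == bool(y) for x, y in zip(xs, ys))
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t


def cross_val_accuracy(xs, ys, k):
    """Mean held-out accuracy of the stump over k folds."""
    scores = []
    for test_idx in k_fold_indices(len(xs), k):
        held_out = set(test_idx)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        t = fit_stump(train_x, train_y)
        hits = sum((xs[i] >= t) == bool(ys[i]) for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / k


# Hypothetical toy data: one numeric feature and a binary class label
# (1 = "high life expectancy"). Not taken from the study's dataset.
features = list(range(40))
labels = [1 if x >= 20 else 0 for x in features]

acc_10 = cross_val_accuracy(features, labels, k=10)
acc_20 = cross_val_accuracy(features, labels, k=20)
print(f"10-fold accuracy: {acc_10:.2f}, 20-fold accuracy: {acc_20:.2f}")
```

Each sample is held out exactly once, so the reported figure is an average over predictions made on data the model did not see during fitting.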
Lifespans of the European elite, 800–1800
I analyze the adult age at death of 115,650 European nobles from 800 to 1800. Longevity began increasing long before 1800 and the Industrial Revolution, with marked increases around 1400 and again around 1650. Declines in violent deaths from battle contributed to some of this increase, but the majority must reflect other changes in individual behavior. There are historic spatial contours to European elite mortality; Northwest Europe achieved greater adult lifespans than the rest of Europe even by 1000 AD.
Genealogy as an Auxiliary Discipline of Genetics
Beyond their historiographical interest, genealogical studies have long attracted the attention of geneticists, since they can provide empirical support for, or help falsify, purely mathematical models of population dynamics.
We briefly retrace the history of the interactions between genealogy and genetics from the end of the 19th century onward, focusing in particular on studies of marital isonymy and of the distribution and extinction of surnames. We critically analyze the theoretical models adopted, pointing out their conceptual limits, which historiography has made it possible to identify.
We then analyze some recent research experiences and proposals, in particular studies on the repetition of ancestors (inbreeding) and on the most recent common ancestor (MRCA), carried out in vivo by quantitatively exploring the genealogically well-documented body of the European nobility in the early modern period.