UCL (University College London), Centre for Advanced Spatial Analysis
Abstract
There is a growing need to understand the nature and detailed composition of ethnic groups in today's increasingly multicultural societies. Ethnicity classifications are often hotly contested, but still greater problems arise from the quality and availability of classifications, with knock-on consequences for our ability meaningfully to subdivide populations. Name analysis and classification has been proposed as one efficient method of achieving such subdivisions in the absence of ethnicity data, and may be especially pertinent to public health and demographic applications. However, previous approaches to name analysis have been designed to identify one or a small number of ethnic minorities, and not complete populations.

This working paper presents a new methodology to classify the UK population and neighbourhoods into groups of common origin using surnames and forenames. It proposes a new ontology of ethnicity that combines some of its multidimensional facets: language, religion, geographical region, and culture. It uses data collected at very fine temporal and spatial scales, and made available, subject to safeguards, at the level of the individual. Such individuals are classified into 185 independently assigned categories of Cultural, Ethnic and Linguistic (CEL) groups, based on the probable origins of names. We include a justification for the need to classify ethnicity, a proposed CEL taxonomy, a description of how the CEL classification was built and applied, a preliminary external validation, and some examples of current and potential applications.