483 research outputs found

    Inaccurate age and sex data in the Census PUMS files: evidence and implications

    We discover and document errors in public use microdata samples ("PUMS files") of the 2000 Census, the 2003-2006 American Community Survey, and the 2004-2009 Current Population Survey. For women and men ages 65 and older, age- and sex-specific population estimates generated from the PUMS files differ by as much as 15% from counts in published data tables. Moreover, an analysis of labor force participation and marriage rates suggests the PUMS samples are not representative of the population at individual ages for those ages 65 and over. PUMS files substantially underestimate labor force participation of those near retirement ages and overestimate labor force participation rates of those at older ages. These problems were an unintentional by-product of the misapplication of a newer generation of disclosure avoidance procedures carried out on the data. The resulting errors in the public use data could significantly impact studies of people ages 65 and older, particularly analyses of variables that are expected to change by age.
    Keywords: Census; Population; Labor supply

    Inaccurate Age and Sex Data in the Census PUMS Files: Evidence and Implications

    We discover and document errors in public use microdata samples ("PUMS files") of the 2000 Census, the 2003-2006 American Community Survey, and the 2004-2009 Current Population Survey. For women and men ages 65 and older, age- and sex-specific population estimates generated from the PUMS files differ by as much as 15% from counts in published data tables. Moreover, an analysis of labor force participation and marriage rates suggests the PUMS samples are not representative of the population at individual ages for those ages 65 and over. PUMS files substantially underestimate labor force participation of those near retirement ages and overestimate labor force participation rates of those at older ages. These problems were an unintentional by-product of the misapplication of a newer generation of disclosure avoidance procedures carried out on the data. The resulting errors in the public use data could significantly impact studies of people ages 65 and older, particularly analyses of variables that are expected to change by age.
    Keywords: Current Population Survey, American Community Survey, Census, disclosure avoidance, aging, data, sex, labor force participation, marriage

    Discovery of a Visual T-Dwarf Triple System and Binarity at the L/T Transition

    We present new high contrast imaging of 8 L/T transition brown dwarfs using the NIRC2 camera on the Keck II telescope. One of our targets, the T3.5 dwarf 2MASS J08381155+1511155, was resolved into a hierarchical triple with projected separations of 2.5+/-0.5 AU and 27+/-5 AU for the BC and A(BC) components respectively. Resolved OSIRIS spectroscopy of the A(BC) components confirms that all system members are T dwarfs. The system therefore constitutes the first triple T-dwarf system ever reported. Using resolved photometry to model the integrated-light spectrum, we infer spectral types of T3, T3, and T4.5 for the A, B, and C components respectively. The uniformly brighter primary has a bluer J-Ks color than the next faintest component, which may reflect a sensitive dependence of the L/T transition temperature on gravity, or alternatively divergent cloud properties amongst components. Relying on empirical trends and evolutionary models we infer a total system mass of 0.034-0.104 Msun for the BC components at ages of 0.3-3 Gyr, which would imply a period of 12-21 yr assuming the system semi-major axis to be similar to its projection. We also infer differences in effective temperatures and surface gravities between components of no more than ~150 K and ~0.1 dex. Given the similar physical properties of the components, the 2M0838+15 system provides a controlled sample for constraining the relative roles of effective temperature, surface gravity, and dust clouds in the poorly understood L/T transition regime. Combining our imaging survey results with previous work we find an observed binary fraction of 4/18 or 22_{-8}^{+10}% for unresolved spectral types of L9-T4 at separations >~0.1 arcsec. This translates into a volume-corrected frequency of 13_{-6}^{+7}%, which is similar to values of ~9-12% reported outside the transition. (Abridged)
    Comment: Accepted for publication in the Astrophysical Journal. 23 pages, 12 figures
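
The 12-21 yr period range quoted in the abstract above is consistent with Kepler's third law in solar units, P[yr] = sqrt(a[AU]^3 / M[Msun]). The following is a minimal sketch of that arithmetic, assuming (as the abstract does) that the semi-major axis of the BC pair equals its projected separation of 2.5 AU. The function name is hypothetical and is not taken from the paper.

```python
import math


def orbital_period_years(a_au: float, total_mass_msun: float) -> float:
    """Kepler's third law in solar units: P^2 = a^3 / M.

    a_au            -- semi-major axis in astronomical units
    total_mass_msun -- total system mass in solar masses
    """
    if a_au <= 0 or total_mass_msun <= 0:
        raise ValueError("semi-major axis and mass must be positive")
    return math.sqrt(a_au ** 3 / total_mass_msun)


# Values from the abstract: a ~ 2.5 AU, total BC mass 0.034-0.104 Msun.
# The heavier system orbits faster, so it gives the shorter period.
p_short = orbital_period_years(2.5, 0.104)
p_long = orbital_period_years(2.5, 0.034)
print(f"{p_short:.1f} to {p_long:.1f} yr")  # prints "12.3 to 21.4 yr"
```

The sketch ignores the quoted +/-0.5 AU uncertainty on the separation and any projection or eccentricity effects, so it reproduces only the central range.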

    Inaccurate age and sex data in the Census PUMS files: Evidence and Implications

    We discover and document errors in public use microdata samples ("PUMS files") of the 2000 Census, the 2003-2006 American Community Survey, and the 2004-2009 Current Population Survey. For women and men ages 65 and older, age- and sex-specific population estimates generated from the PUMS files differ by as much as 15% from counts in published data tables. Moreover, an analysis of labor force participation and marriage rates suggests the PUMS samples are not representative of the population at individual ages for those ages 65 and over. PUMS files substantially underestimate labor force participation of those near retirement ages and overestimate labor force participation rates of those at older ages. These problems were an unintentional by-product of the misapplication of a newer generation of disclosure avoidance procedures carried out on the data. The resulting errors in the public use data could significantly impact studies of people ages 65 and older, particularly analyses of variables that are expected to change by age.

    Building a repository for record linkage

    ICPSR is building LinkageLibrary, a repository and community space for researchers involved in linking and combining datasets, as a collaboration between social, statistical, and computer scientists. Unlike surveys or experiments where causal and outcome variables are measured in tandem, it is often necessary when working with organic, non-design data to link to other measures. This makes linkage methodologies particularly important when conducting analyses using administrative data. A common benchmarking repository of linkage methodologies will propel the field to the next level of rigor by facilitating comparison of different algorithms, understanding which types of algorithms work best under different conditions and problem domains, promoting transparency and replicability of research, and encouraging proper citation of methodological contributions and their resulting datasets. It will bring together the diverse scholarly communities (e.g., computer scientists, statisticians, and social, behavioral, economic, and health (SBEH) scientists) who are currently addressing these challenges in disparate ways that do not build on one another’s work. Improving linkage methodologies is critical to the production of representative samples, and thus to unbiased estimates of a wide variety of social and economic phenomena. The repository will accelerate the development of new record linkage algorithms and evaluation methods, improve the reproducibility of analyses conducted on integrated data, allow comparisons on the same and different data, and move forward the provision of privacy-aware integrated data. The presentation will focus on lessons learned while building the repository and the community, and introduce the LinkageLibrary website.

    Resolving a One-Year Ecesis Interval for Alaska Paper Birch: Dating a Rockfall Event, Wishbone Hill, Southcentral Alaska

    Numerous large boulders at the base of Wishbone Hill, northeast of Anchorage, Alaska, suggest a historic rockfall event and potential for future surface instability, putting lives and property at risk. The source of the rockfall boulders is an exposed syncline with a cliff face composed of conglomerate. The age of trees growing atop boulders provides a minimum exposure age of those boulders and, thus, the rockfall event. To determine when the rockfall occurred, we dated trees growing atop the boulders using tree-ring samples collected from 30 Alaska paper birch trees. After mounting and polishing, each tree-ring sample was dot-counted, and tree-ring widths were measured using Measure J2X software to generate a master chronology (1938-2017). To estimate the youngest age for the rockfall event, we recorded pith-year for each sample. For samples lacking a pith (n = 21), we used pith indicators to match existing rings to diagrams of corresponding ring widths, projecting approximate pith for each sample. All samples were corrected for sampling height (mean = 0.8 m) using a low-estimate growth rate (0.6 m/yr). The oldest birch tree sampled included pith and, with height correction, we estimate a germination year of 1936. When using first-year growth as an event’s temporal marker, accounting for the ecesis interval, the time between the availability of a new surface (i.e., boulders) and germination, provides a more representative date of the event than using the pith/germination date alone. Considering birch ecesis and primary observations recorded in 1935, we propose that the rockfall event most likely occurred in 1934-1935. This finding suggests an ecesis interval as low as one year for Alaska paper birch in fresh rockfall areas. The risk of another destabilizing event may prompt those utilizing this area for recreational and residential purposes to reconsider future use.
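
The sampling-height correction described in the abstract above amounts to dividing the sampling height by an assumed vertical growth rate and subtracting the result from the pith year. The following is a minimal sketch using the abstract's mean height (0.8 m) and low-estimate growth rate (0.6 m/yr). The function names are hypothetical, and the pith year of 1940 is an invented illustration, not a value from the study.

```python
def years_to_sampling_height(height_m: float, growth_m_per_yr: float) -> float:
    """Years a seedling needed to grow from the ground to the sampling height."""
    if height_m < 0 or growth_m_per_yr <= 0:
        raise ValueError("height must be >= 0 and growth rate must be > 0")
    return height_m / growth_m_per_yr


def estimated_germination_year(pith_year: int, height_m: float,
                               growth_m_per_yr: float) -> float:
    """Pith year at sampling height, shifted back by the height correction."""
    return pith_year - years_to_sampling_height(height_m, growth_m_per_yr)


# Mean sampling height and growth rate as stated in the abstract.
correction = years_to_sampling_height(0.8, 0.6)
print(f"height correction: {correction:.2f} yr")  # prints "height correction: 1.33 yr"

# Hypothetical pith year, for illustration only (not a sample from the study).
example = estimated_germination_year(1940, 0.8, 0.6)
print(f"estimated germination: {example:.1f}")
```

How the fractional result is rounded to a calendar year is not stated in the abstract, so the sketch leaves it unrounded.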

    Reconciling Parent-Child Relationships across US Administrative Datasets

    Introduction: Population data capture children, parents, relatives, and others moving in and out of households. The U.S. has seen falling marriage rates, and increases in multigenerational households and complex families, young children living with grandparents, and adult children living with parents. Robust parent-child linkages are critical to understand these demographic shifts.
    Objectives and Approach: We construct and validate parent-child linkages over a century to observe how U.S. households are changing over time. The three largest person-based datafiles in the U.S. are the decennial censuses, the Social Security Administration transaction file, and individual tax returns from the Internal Revenue Service. These sources operationalize relationships differently, capture data at various frequencies, and gather the data for unique purposes. We use probabilistic matching to observe and reconcile parent-child relationships across these sources. The data include a variety of personal identifiers including name, date of birth, parents’ names, address, and place of birth that support matching and validation.
    Results: We find that understanding the content, consistency, and coverage of the files before matching is critical for high quality linkages. The representativeness of the parent-child relationship file improves over time, with the weakest coverage for the Greatest Generation and the strongest coverage for Millennials. Coverage varies by source: tax data underrepresent non-white children and have duplicate records for SSNs, while names and dates of birth are missing from Census data. Multiple match rates differ among demographic groups and over time. In the matching process, the blocking variables rely on common variables across the population datasets. Our approach provides robust entity resolution for women, despite married-maiden name changes. We describe challenges due to data problems in old census records and validation changes in social security data.
    Conclusion/Implications: We conduct a successful reconciliation of parent-child relationships in U.S. population level files. The project supports operational and research uses, such as the 2020 Census. We will extend this work using graph matching and will expand the method to validate other relationship links including spouses and siblings.
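
The abstract above mentions probabilistic matching with blocking variables but does not give the algorithm. As an illustration only, here is a minimal sketch of one common approach: block on a shared variable, then score candidate pairs with Fellegi-Sunter-style agreement weights. The field names, the m/u probabilities, the threshold, and the records are all invented for this example and are not taken from the study.

```python
import math
from collections import defaultdict

# Hypothetical per-field probabilities (assumptions, not estimates from any data):
# m = P(fields agree | records are a true match)
# u = P(fields agree | records are not a match)
M_U = {
    "last_name": (0.95, 0.01),
    "first_name": (0.90, 0.05),
    "birth_place": (0.85, 0.10),
}


def match_weight(a: dict, b: dict) -> float:
    """Sum of log2 likelihood ratios over the compared fields."""
    total = 0.0
    for field, (m, u) in M_U.items():
        if a[field] == b[field]:
            total += math.log2(m / u)              # agreement adds evidence
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement subtracts it
    return total


def link(file_a, file_b, block_key="birth_year", threshold=5.0):
    """Compare only records that share the blocking key; keep pairs above threshold."""
    blocks = defaultdict(list)
    for rec in file_b:
        blocks[rec[block_key]].append(rec)
    links = []
    for a in file_a:
        for b in blocks.get(a[block_key], []):
            w = match_weight(a, b)
            if w >= threshold:
                links.append((a["id"], b["id"], w))
    return links


# Synthetic records standing in for two administrative files.
census = [
    {"id": "c1", "first_name": "ann", "last_name": "lee", "birth_year": 1950, "birth_place": "ohio"},
    {"id": "c2", "first_name": "bob", "last_name": "ray", "birth_year": 1962, "birth_place": "iowa"},
]
tax = [
    {"id": "t1", "first_name": "ann", "last_name": "lee", "birth_year": 1950, "birth_place": "ohio"},
    {"id": "t2", "first_name": "bob", "last_name": "kay", "birth_year": 1962, "birth_place": "utah"},
    {"id": "t3", "first_name": "ann", "last_name": "lee", "birth_year": 1971, "birth_place": "ohio"},
]

links = link(census, tax)
for a_id, b_id, w in links:
    print(a_id, b_id, round(w, 2))
```

Blocking keeps the number of comparisons manageable, at the cost of missing true matches whose blocking variable disagrees (here, t3 is never compared with c1). Exact string equality on names would also miss the married-maiden name changes the abstract highlights, so a real pipeline would need a softer comparison for that field.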