Breast cancer rates by stage and household income percentile, New York State, 2006-2015
This file contains measured and modeled breast cancer rates by stage and median household income percentile in New York State, 2006-2015. It accompanies the book chapter, "Spatial and Contextual Analyses of Stage at Diagnosis" by Francis Boscoe and Lindsey Hutchison, in <i>Geospatial
Approaches to Energy Balance and Breast Cancer</i>. D Berrigan, NA Berger, eds.
Berlin: Springer, 2018.<br><div>4,835 census tracts in New York State were divided into percentiles based on median household income, using data from the 2006-2010 and 2011-2015 editions of American Community Survey Table S1903. Census tracts are defined here:</div><div><br></div><div>https://figshare.com/articles/Population_Estimates_by_Census_Tract_New_York_State_by_Age_and_Sex_1990-2016_/6813029</div><div><br></div><div>58 of the 4,893 census tracts in that file did not have households (primarily college campuses, prisons, and military bases) and thus had no reported median household income; they were excluded, leaving 4,835.</div><div><div><br></div><div>200,022 cases of breast cancer diagnosed among New York State residents from 2006-2015 were assigned an income percentile. Cases diagnosed from 2006-2010 were assigned based on the 2006-2010 edition of ACS Table S1903, and cases diagnosed from 2011-2015 were assigned based on the 2011-2015 edition.</div><div><br></div><div>Directly-adjusted incidence rates were calculated for all cancers and for those diagnosed at in situ, local, regional, and distant stage, using the SEER Summary Stage 2000 staging system.</div></div><div><br></div><div>The file contains the following fields: income percentile; rates for all cancers, in situ, local, regional, and distant stage; and modeled rates for all cancers, in situ, local, regional, and distant stage. The modeled rates used a polynomial of order 3. 
The equations of the best-fit lines and r-squared values, to 4 decimal places or significant figures, are as follows:</div><div><br></div><div>All cancers: y = 0.0001986x<sup>3</sup> - 0.02035x<sup>2</sup> + 1.0691x + 133.7353, r<sup>2</sup> = 0.96</div><div><br></div><div>In situ: y = 0.00008906x<sup>3</sup> - 0.007555x<sup>2</sup> + 0.3169x + 27.5728, r<sup>2</sup> = 0.96</div><div><br></div><div>Local: y = 0.0001436x<sup>3</sup> - 0.01919x<sup>2</sup> + 1.0526x + 58.4627, r<sup>2</sup> = 0.94</div><div><br></div><div>Regional: y = -0.00001676x<sup>3</sup> + 0.003410x<sup>2</sup> - 0.1389x + 37.6709, r<sup>2</sup> = 0.41</div><div><br></div><div>Distant: y = -0.00001724x<sup>3</sup> + 0.002989x<sup>2</sup> - 0.1615x + 10.0288, r<sup>2</sup> = 0.32</div><div><br></div>
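The fitted cubics listed for this dataset can be evaluated directly. The following is a minimal sketch, assuming x is the income percentile as stored in the file and y is the modeled rate in whatever units the file uses (neither the percentile range nor the rate units are stated in the description). The dictionary keys and the function name `modeled_rate` are hypothetical.

```python
# Coefficients (cubic, quadratic, linear, constant) copied from the
# best-fit equations in the dataset description.
COEFFS = {
    "all": (0.0001986, -0.02035, 1.0691, 133.7353),
    "in_situ": (0.00008906, -0.007555, 0.3169, 27.5728),
    "local": (0.0001436, -0.01919, 1.0526, 58.4627),
    "regional": (-0.00001676, 0.003410, -0.1389, 37.6709),
    "distant": (-0.00001724, 0.002989, -0.1615, 10.0288),
}


def modeled_rate(stage, x):
    """Evaluate the cubic for one stage at income percentile x (Horner form)."""
    a3, a2, a1, a0 = COEFFS[stage]
    return ((a3 * x + a2) * x + a1) * x + a0


if __name__ == "__main__":
    for stage in COEFFS:
        print(stage, round(modeled_rate(stage, 50), 2))
```

Because the regional and distant fits have low r-squared values (0.41 and 0.32), their modeled rates should be read as rough trend lines rather than precise predictions.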
Toward an all-purpose base map for the display of North American cancer data
This poster describes the process by which a base map was developed for displaying cancer data for U.S. states, selected sub-state areas, territories, and Canadian provinces. Emphasis was placed on line work that did not obscure Washington DC, Prince Edward Island, or similarly small geographic areas.
Implications of Subdividing the 85 and Over Age Category in Cancer Surveillance
<p>Age-specific mortality above age 85 reveals a variety of compelling patterns that have not been well studied, including a leveling or decline in risk for certain cancer types. Taken in combination with the rapid growth of this population, the need for greater age specificity is clear.</p>
Trends in Reporting Delays to United States Central Cancer Registries
<p>An evaluation of reporting delay to 10 US cancer registries (the SEER 9 registries plus New York) from 1999-2010 reveals solid downward trends in the amount of delay, which equates to better-quality data. Leukemia and myeloma continue to lag as these sites can be diagnosed and treated outside a hospital setting.</p>
Census Tract population estimates by age and sex, New York State, by year, 1990-2014, using 2010 census tract definitions
<p>Very often in my work I am asked to calculate disease
rates for small areas for time periods that fall between censuses or span
multiple censuses, such as 1995-2013. In the past I have used population
estimates from private vendors, but these have had two important limitations:
one, they are proprietary and cannot be shared, and two, they often contain significant
omissions and errors. I decided instead to calculate my own populations using
publicly available data and established interpolation methods.</p>
<p>To generate the data here, I began with the census tract
populations by age (5-year age groups) and sex published in the 1990, 2000, and
2010 federal censuses (citations to exact tables to be added). These were
converted to 2010 census definitions using the Longitudinal Tract Data Base
(LTDB), available here: <a href="http://www.s4.brown.edu/us2010/Researcher/Bridging.htm">http://www.s4.brown.edu/us2010/Researcher/Bridging.htm</a>.
The LTDB provides precise conversions between different censuses. For example, 45.4%
of the population of 1990 Bronx census tract 50 is assigned to 2010 tract 50.01,
while 54.6% is assigned to tract 50.02. Census tracts with zero population in
all three decades, consisting of water and certain parks and cemeteries in New
York City, were omitted. The resulting file has data for 4,893 tracts.</p>
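The crosswalk step can be illustrated with the Bronx tract 50 example. This is a minimal sketch, assuming the crosswalk is available as rows of (1990 tract, 2010 tract, weight); that layout, the function name `to_2010_tracts`, and the 1990 population of 1,000 are all hypothetical and are not the LTDB's actual file format.

```python
from collections import defaultdict

# Hypothetical crosswalk rows: (1990 tract, 2010 tract, share of 1990 population).
# The two weights are the ones quoted in the text for Bronx census tract 50.
CROSSWALK = [
    ("50", "50.01", 0.454),
    ("50", "50.02", 0.546),
]


def to_2010_tracts(pop_1990, crosswalk):
    """Reallocate 1990 tract populations to 2010 tract definitions."""
    out = defaultdict(float)
    for old_tract, new_tract, weight in crosswalk:
        out[new_tract] += pop_1990.get(old_tract, 0.0) * weight
    return dict(out)


if __name__ == "__main__":
    # Made-up 1990 population, for illustration only.
    print(to_2010_tracts({"50": 1000.0}, CROSSWALK))
```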
<p>Each age-sex group was summed to the county total, and
compared with the county total as published by the National Cancer Institute’s
SEER program. The SEER counts incorporate adjustments by race and
ethnicity, are shifted to reflect totals as of July rather than April, and include other small
enhancements, all of which are documented on their web page, <a href="http://seer.cancer.gov/popdata/">http://seer.cancer.gov/popdata/</a>. The
census tract counts were then proportionally adjusted to match the SEER totals.
For example, if the census tracts in a particular county added to 127 males
aged 5-9, and the SEER total for this county was 131, then the count in each
tract was multiplied by 131/127. This resulted in fractional populations, which
were retained. Users who do not want fractional populations can simply round
the values given here.</p>
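The proportional adjustment described above amounts to multiplying every tract count in a county by one common factor. A minimal sketch follows, using the 131/127 example from the text; the tract identifiers, the split of the 127 across tracts, and the function name `scale_to_control` are hypothetical.

```python
def scale_to_control(tract_counts, control_total):
    """Scale tract counts so they sum to the county control total.

    Fractional results are kept, as in the text.
    """
    total = sum(tract_counts.values())
    if total == 0:
        # Nothing to scale; return the counts unchanged (an assumption
        # about how an empty age-sex group would be handled).
        return dict(tract_counts)
    factor = control_total / total
    return {tract: count * factor for tract, count in tract_counts.items()}


if __name__ == "__main__":
    # Made-up tract counts of males aged 5-9 that sum to 127.
    tracts = {"A": 60, "B": 40, "C": 27}
    adjusted = scale_to_control(tracts, 131)  # SEER county total of 131
    print(adjusted)
```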
<p>Next, geometric interpolation between census years was used
to estimate tract-level counts for all of the non-census years, using the Das
Gupta method that has been used extensively by the Census Bureau and described
here: <a href="https://www.census.gov/popest/methodology/intercensal_nat_meth.pdf">https://www.census.gov/popest/methodology/intercensal_nat_meth.pdf</a>.
For census tracts that are growing in population, this method results in more
of the growth occurring later in the period. For census tracts that are
shrinking, it results in more of the shrinkage occurring earlier in the period.
For the relatively small numbers seen in individual census tracts by age and
sex, the results are not very different from those that would have been obtained
from linear interpolation. (For the years after 2010, this step was skipped
because the 2020 census does not yet exist.) These interpolated
counts were then proportionally adjusted to match the SEER totals by year and
county, using the same procedure as above.</p>
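The behavior described for growing and shrinking tracts follows from plain geometric interpolation, sketched below. This is an illustration only and is not the full Das Gupta procedure documented by the Census Bureau; the function name and the linear fallback for zero populations are assumptions.

```python
def geometric_interpolate(p_start, p_end, year, start_year, end_year):
    """Interpolate a population between two census years at a constant growth ratio."""
    frac = (year - start_year) / (end_year - start_year)
    if p_start <= 0 or p_end <= 0:
        # A geometric path is undefined when either endpoint is zero;
        # falling back to linear interpolation is an assumption made here.
        return p_start + (p_end - p_start) * frac
    return p_start * (p_end / p_start) ** frac


if __name__ == "__main__":
    # Growing tract: the midpoint lies below the linear midpoint of 250,
    # so more of the growth happens later in the decade.
    print(geometric_interpolate(100, 400, 1995, 1990, 2000))
    # Shrinking tract: the midpoint is also below 250,
    # so more of the shrinkage happens earlier.
    print(geometric_interpolate(400, 100, 1995, 1990, 2000))
```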
<p><b>Data dictionary</b></p>
<p>The data file is a comma-separated file containing the
following variables:</p>
<p>Year</p>
<p>Geoid10 – 11-digit code consisting of state FIPS code (36
for New York), county FIPS code (001-123 for New York), and census tract (6
digits, with leading and trailing zeroes as needed). These values are identical
to those used in many Census tables.</p>
<p>M0 – male population aged 0</p>
<p>M1 – male population aged 1-4</p>
<p>M2 – male population aged 5-9</p>
<p>…</p>
<p>M17 – male population aged 80-84</p>
<p>M18 – male population aged 85+</p>
<p>F0 – female population aged 0</p>
<p>…</p>
<p>F18 – female population aged 85+</p>
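The Geoid10 field can be split into its parts with simple string slicing. The sketch below assumes the value is read as an 11-character string; the function names are hypothetical, and the example code 36005005001 is built from the Bronx County FIPS code (005) and tract 50.01 mentioned earlier.

```python
def split_geoid10(geoid):
    """Split an 11-digit Geoid10 into state, county, and tract codes."""
    geoid = str(geoid)
    if len(geoid) != 11 or not geoid.isdigit():
        raise ValueError(f"expected an 11-digit code, got {geoid!r}")
    return {"state": geoid[:2], "county": geoid[2:5], "tract": geoid[5:]}


def tract_label(tract_code):
    """Convert a 6-digit tract code to its usual display form, e.g. 005001 -> 50.01."""
    base = int(tract_code[:4])   # 4-digit base number, leading zeroes dropped
    suffix = tract_code[4:]      # 2-digit suffix; 00 means no suffix
    return str(base) if suffix == "00" else f"{base}.{suffix}"


if __name__ == "__main__":
    parts = split_geoid10("36005005001")
    print(parts, tract_label(parts["tract"]))
```

Reading the column as text rather than as a number is the safer choice in general, since leading zeroes matter for states with FIPS codes below 10.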
<p><b>Future work</b></p>
<p>Future versions of these data may add some or all of the
following:</p>
<p>- Additional states</p>
<p>- Counts by race and ethnicity</p>
<p>- Incorporation of a method to capture abrupt
changes in census tract populations, such as when a new retirement community is
constructed. The idea is to use American Community Survey population estimates
to identify such instances.</p>
<p>- Incorporation of post-censal corrections. Here,
I have used the official tables published after each census. They do not
incorporate the various small corrections that were made as a result of appeals
and identification of errors. These corrections are mainly given in narrative form
rather than in tables, and so incorporating them may be somewhat involved.</p>
<p>- Francis Boscoe</p><p>University at Albany</p>
<p>Department of Epidemiology and
Biostatistics</p>
<p>Send questions and comments to
[email protected]</p>
Population Estimates by Census Tract, New York State, by Age and Sex, 1990-2016.
This file contains population estimates by age, sex, and single year for census tracts in New York State, from 1990-2016.<div><br></div><div>Iterative proportional fitting was used to develop populations that are consistent with official Census Bureau tract-level populations from 1990, 2000, and 2010 and single-year county-level population estimates published by the SEER program of the National Cancer Institute (https://seer.cancer.gov/popdata/).</div><div><br></div><div>The Longitudinal Tract Database (LTDB) (https://s4.ad.brown.edu/projects/diversity/researcher/bridging.htm) was used to report populations using 2010 census tract boundaries.</div><div><br></div><div>In effect, the approach assumes that population growth or reduction at the tract level tracks what is happening at the county level. This is an improvement over linear or geometric interpolation between census years, but is still far from perfect. Census tracts can undergo rapid year-to-year population change, such as when new housing is constructed or, less frequently, demolished. An extreme example is census tract 1.04 in Westchester County, New York, which had a population of 0 in all 3 census years, as it was located entirely within an industrial area. Since 2010, multiple large high-rise condominiums have been constructed here, so the population in 2018 is probably in the thousands, though any estimation or projection method tied to the 2010 census will still count 0 people here.</div><div><br></div><div>It is conceivable that address files from the United States Postal Service or other sources could be used to capture these kinds of changes; I am unaware of any attempts to do this.</div><div><br></div><div>The file contains data for 4,893 census tracts. It has been restricted to census tracts with nonzero populations in at least one of the census years. 
Other census tracts, consisting entirely of water, parkland, or non-residential areas as in the example above, have been omitted.</div><div><br></div><div>These data are used for the calculation of small-area cancer rates in New York State.</div>
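Iterative proportional fitting alternately rescales the rows and columns of a table until both sets of totals match their targets. The sketch below shows the generic two-dimensional form only. Which margins the author fitted (for example tracts against county-year totals) is not spelled out in the description, so the seed table, targets, and function name here are hypothetical.

```python
def ipf(seed, row_targets, col_targets, max_iter=100, tol=1e-9):
    """Fit a 2-D table to row and column targets by alternate rescaling."""
    table = [list(map(float, row)) for row in seed]
    for _ in range(max_iter):
        # Rescale each row to its target total.
        for i, target in enumerate(row_targets):
            row_sum = sum(table[i])
            if row_sum > 0:
                table[i] = [v * target / row_sum for v in table[i]]
        # Rescale each column to its target total.
        for j, target in enumerate(col_targets):
            col_sum = sum(row[j] for row in table)
            if col_sum > 0:
                for row in table:
                    row[j] *= target / col_sum
        # Columns now match; stop once the rows also match within tolerance.
        if all(abs(sum(table[i]) - t) <= tol for i, t in enumerate(row_targets)):
            break
    return table


if __name__ == "__main__":
    # Made-up example: two rows and two columns, with totals that both sum to 100.
    fitted = ipf([[1, 1], [1, 1]], row_targets=[30, 70], col_targets=[40, 60])
    print(fitted)
```

Cells that start at zero stay at zero under this procedure, which matches the limitation noted above for tracts with no population in any census year.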
The foreign-born population in New York City and environs, 2012-2016
<p>The most frequent foreign country of birth
was mapped using data from tables B05006 and B05002 (for Puerto Rico) of the
American Community Survey, using the 2012-2016 5-year estimates. Census
tracts were shaded only when the share of the most frequent country of birth
was more than 10% of the total population and the total population of the
census tract was at least 100. All birth countries that were the most frequent
in at least 10 census tracts in New York and New Jersey appear in the legend;
30 other countries appearing fewer than 10 times were grouped as “other”.
The color scheme used was that of Trubetskoy, who developed a palette of
20 distinct and nameable colors.</p>
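The shading rule for a single census tract can be written as a short function. This is a minimal sketch of the two thresholds stated in the description; the function name, the input layout, and the example counts are hypothetical.

```python
def shaded_country(births_by_country, total_population):
    """Return the country to shade a tract by, or None if the tract stays unshaded."""
    if total_population < 100 or not births_by_country:
        return None  # tract too small, or no foreign-born residents recorded
    country, count = max(births_by_country.items(), key=lambda kv: kv[1])
    # Shade only if the leading country exceeds 10% of the total population.
    if count / total_population > 0.10:
        return country
    return None


if __name__ == "__main__":
    # Made-up counts for illustration.
    print(shaded_country({"Dominican Republic": 300, "Mexico": 100}, 2000))
    print(shaded_country({"China": 50}, 1000))
```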
A Medicare-associated impact on cancer survival at age 65 in the United States, 2004-2013.
For low-survival cancers such as lung and liver, cancer survival among 64-year-olds (the age before Medicare eligibility) appears slightly but significantly worse than that among 65-year-olds (the age of Medicare eligibility). This could be explained by reduced access to treatment in the period before Medicare eligibility.
Persistent outliers among state-level causes of death, 1999-2013
<p>This table identifies all state-level causes of death with rates at least twice the national rate in each of the periods 1999-2003, 2004-2008, and 2009-2013. Data use the 113 Cause of Death list and are drawn from the CDC's Underlying Cause of Death file, accessible at: http://wonder.cdc.gov/ucd-icd10.html.</p>
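The selection rule for this table can be expressed as a check across the three periods. The sketch below assumes rates are held in dictionaries keyed by period; the function name and the example rates are hypothetical and are not taken from the CDC file.

```python
PERIODS = ("1999-2003", "2004-2008", "2009-2013")


def is_persistent_outlier(state_rates, national_rates, ratio=2.0):
    """True if the state rate is at least `ratio` times the national rate in every period."""
    return all(state_rates[p] >= ratio * national_rates[p] for p in PERIODS)


if __name__ == "__main__":
    # Made-up rates per 100,000 for illustration.
    national = {"1999-2003": 1.0, "2004-2008": 1.2, "2009-2013": 1.1}
    state = {"1999-2003": 2.5, "2004-2008": 2.4, "2009-2013": 3.0}
    print(is_persistent_outlier(state, national))
```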
Evaluation of LexisNexis Batch Solutions in the New York State Cancer Registry
<p>Using LexisNexis Batch Solutions, the New York State Cancer Registry was able to identify substantial numbers of missing addresses, birth dates, and Social Security numbers for persons diagnosed as far back as 1976.</p>