31 research outputs found

    Intergenerational mobility in a mid-Atlantic economy: Canada, 1871-1901

    This paper uses new linked full-count census data for Canada to document intergenerational occupational mobility from 1871 to 1901. We find significant differences between Canadian regions and language groups, with linguistic minorities experiencing notably lower rates of intergenerational mobility. International comparisons place Canada midway between other economies in the Americas and the most mobile European societies. Decompositions of overall mobility show that the Canadian experience shared the New World feature of high mobility from manual occupations, but also the Old World feature of greater persistence in white-collar jobs.

    Population Analysis of the Settlement Movement in Western Canada

    Introduction: The Canadian settlement of the west, via the granting of free homesteads, is perhaps one of the largest public policy undertakings in the nation's history. However, little is known about the homesteaders themselves: where they came from, how long they stayed, and the settlement environment that was created at the time. Objectives and Approach: This research adopts a detailed population analysis of the settlement movement in Western Canada. In addition to outlining the social and economic characteristics of the homesteaders, the project answers the following central question: did it create a stable society of settlers, or did it create a field for speculative investment? The data consist of machine-readable individual-level databases containing detailed information on, and stories from, circa 170,000 Alberta homesteaders. These homesteaders will be individually linked to three twentieth-century Canadian censuses and the Canadian Pacific Railway's land records to provide an unprecedented holistic analysis of Alberta's early European population. Results: We report on the linkage methodology used to integrate all these data sources. In addition, we discuss particular issues we encountered given the nature of the historical data. We describe the data cleaning and standardization that was undertaken to facilitate the linkage process. We present and discuss the linkage results obtained, how much of the population was linked, and the characteristics of those we could not link. We expect that this research will shed new light on persistence rates, trajectories of family composition, the nature of labour market adjustment, degrees of social and gender inequality, and impacts on regional development. The results will challenge many myths concerning homesteaders and their impact on western Canada and, in the process, provoke renewed discussion of western Canadian history. Conclusion/Implications: The research will inform and be informed by work currently being undertaken on migration patterns at the international level. Finally, the research has implications for understanding the legacies of rapid population movements, state formation, public policy and national identities in the present.

    Bias, accuracy and sample size in the systematic linking of historical records

    Introduction: Linking distinct historical sources on an automated basis directs attention to the quality and representativeness of the linked data created by these systems. Linking with time-invariant personal characteristics arguably minimizes bias or departures from representativeness, even though a wider set of features might generate more links. Objectives and Approach: The objective of this research is to compare, evaluate and understand bias when two linking methodologies are employed on the same data sources. We compare linked records from Canadian censuses (linking 1871 to 1881) generated by two different linking strategies. The first method is a classification model based on a support vector machine and on time-invariant individual characteristics. This method generates a large number of multiple matches, because records look similar on a small number of time-invariant individual characteristics. The second method adds a second stage that disambiguates multiple matches using family information. Results: We compare the links produced by the two methods in terms of the number of links produced, their quality (false positive rate) and the bias of the linked data. A complication is that there are many dimensions of bias. Even time-invariant criteria typically generate some bias. As expected, the two-step process produces a larger linked sample. Interestingly, it also produces a lower error rate and different patterns of bias. Both methods under-represent the Quebec-born, those of French ethnicity, the unmarried and adolescents. Unexpectedly, the bias in favour of married people is larger using individual information (first method) than family information (second method). However, family-based linking does over-represent young children. Conclusion/Implications: Results suggest that neither method will be universally preferable. Rather, the choice of research question may affect the preferred balance of biases and link rate. Fortunately, the advance of computational capacity allows a researcher to select a method that generates the links most appropriate for the problem at hand.
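The two linking strategies described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a simple name-similarity threshold stands in for the support vector machine, and the record fields (`name`, `birth_year`, `birthplace`, `household`), the thresholds and the sample records are all hypothetical.

```python
from difflib import SequenceMatcher

def name_similarity(a, b):
    # String similarity in [0, 1]; a stand-in for a trained classifier score.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def stage_one(rec, candidates, threshold=0.85, year_tolerance=2):
    """Candidate links using time-invariant traits only (assumed fields)."""
    matches = []
    for cand in candidates:
        if cand["birthplace"] != rec["birthplace"]:
            continue
        if abs(cand["birth_year"] - rec["birth_year"]) > year_tolerance:
            continue
        if name_similarity(rec["name"], cand["name"]) >= threshold:
            matches.append(cand)
    return matches

def family_overlap(rec, cand):
    # Number of household first names shared by the two records.
    a = {n.lower() for n in rec["household"]}
    b = {n.lower() for n in cand["household"]}
    return len(a & b)

def stage_two(rec, matches):
    """Resolve multiple matches with family information; None if still ambiguous."""
    if len(matches) <= 1:
        return matches[0] if matches else None
    ranked = sorted(matches, key=lambda c: family_overlap(rec, c), reverse=True)
    if family_overlap(rec, ranked[0]) > family_overlap(rec, ranked[1]):
        return ranked[0]
    return None

def link(rec, candidates):
    return stage_two(rec, stage_one(rec, candidates))

# Hypothetical records, for illustration only.
rec_1871 = {"id": "a1", "name": "Jean Tremblay", "birth_year": 1850,
            "birthplace": "Quebec", "household": ["Marie", "Pierre"]}
census_1881 = [
    {"id": "b1", "name": "Jean Tremblay", "birth_year": 1851,
     "birthplace": "Quebec", "household": ["Marie", "Pierre", "Louise"]},
    {"id": "b2", "name": "Jean Tremblay", "birth_year": 1849,
     "birthplace": "Quebec", "household": ["Anne"]},
    {"id": "b3", "name": "John Smith", "birth_year": 1850,
     "birthplace": "Ontario", "household": []},
]

first_stage = stage_one(rec_1871, census_1881)   # two indistinguishable candidates
resolved = link(rec_1871, census_1881)           # family information picks one
```

In this toy case the first stage alone leaves two equally plausible candidates, while the family stage selects the record whose household shares names with the 1871 household. Records living with relatives are easier to resolve this way, which is one way a family-based stage could shift the pattern of bias.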

    Establishing an International Data Linkage Repository Workgroup Toward a Benchmarking Repository

    Introduction: Access to real data with diverse attributes is critical for effective development of any data analytic algorithm. Benchmarking data repositories have been vital to the development of research communities focused on algorithm development. This work reports on the development of such a data repository for record linkage. Objectives and Approach: Establishing a common benchmarking repository of real data can propel a field to the next level of rigor by facilitating comparison of different algorithms, clarifying which types of algorithm work best under particular real data conditions and problem domains, promoting transparency and replicability of research, and creating incentives for proper citation of contributions. In addition, benchmarking repositories can bring together diverse stakeholders (e.g., computer scientists, statisticians, data custodians, and data users including social, behavioural, economic, and health (SBEH) scientists) who can advance the field more effectively than could researchers from any single discipline. Results: In Fall 2016, international leaders in record linkage formed a Data Linkage Repository workgroup (DLRep) to establish a benchmarking data repository for record linkage. The workgroup is working in collaboration with the Inter-university Consortium for Political and Social Research (ICPSR) to host the data repository, which is planned for release in Summer 2018. The repository for record linkage research will house various types of real data that require linking, together with metadata, unique handles for citations, proposed algorithms for evaluation criteria, and a platform for posting, sharing, and comparing results as well as citations of relevant papers. Some datasets will have a published gold standard against which researchers can evaluate their results. Other datasets will gather results to build the gold standard as a community. Conclusion/Implications: Record linkage methodology is important to domains where data needs to be integrated from multiple sources, including diverse disciplines. Establishing an international interdisciplinary research community around a benchmark data linkage repository to validate and compare linkage algorithms is crucial to fully realizing the social benefits of data about people.
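Evaluating a linkage result against a published gold standard might look like the sketch below. The abstract does not name the repository's evaluation criteria, so precision, recall and F1 are an assumption here, and the link pairs are invented for illustration.

```python
def evaluate_links(predicted, gold):
    """Compare predicted record pairs with a gold-standard set of true pairs.

    The metrics (precision, recall, F1) are assumed, commonly used criteria;
    they are not taken from the repository's own specification.
    """
    predicted, gold = set(predicted), set(gold)
    true_pos = len(predicted & gold)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

# Hypothetical (source_id, target_id) pairs.
gold = {("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a4", "b4")}
predicted = {("a1", "b1"), ("a2", "b2"), ("a3", "b9")}

scores = evaluate_links(predicted, gold)
```

Here two of the three predicted links are correct and two of the four true links are found, so precision is 2/3 and recall is 1/2. Scoring every submitted algorithm with the same function on the same gold standard is what makes results comparable across research groups.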

    Pharynx Reconstruction and Quality of Life

    Patients diagnosed with squamous cell carcinoma of the pharynx typically present late, with advanced stages of the disease. Therefore, they frequently require a multimodal approach combining surgery, radiotherapy, and chemotherapy. Due to anatomic spatial limits and particularities, therapy can imply large organ resection with difficulties in reconstruction. Nowadays, there is a paradigm shift in the management of this pathology, with many patients first referred to oncology departments and radiotherapy or radiochemotherapy initiated as the first line of treatment. As a consequence, salvage surgery may be mandatory in some selected cases. The proposed chapter will address the oncological particularities of the pharynx, with a focus on the oro- and hypopharynx, ways of reconstruction after oncological ablative surgery of these segments, and the impact on the quality of life (QoL) index. Speech, respiratory, and deglutition rehabilitation of these patients is essential and will be a distinct topic. This paper will have the structure of a literature review with clinical examples of reconstruction from the ENT and Head and Neck Surgery Department of Coltea Clinical Hospital, Bucharest. The reconstruction methods used in our clinic are regional flaps and, in advanced stages, biocompatible prostheses. The QoL index in our clinic is assessed with questionnaires developed by the European Organization for Research and Treatment of Cancer (EORTC QLQ-C30).

    Classifying microarray data with association rules

    Peer reviewed

    Learning to use a learned model: A two-stage approach to classification

    Association rule-based classifiers have recently emerged as competitive classification systems. However, there are still deficiencies that hinder their performance. One deficiency is the use of rules in the classification stage. Current systems assign classes to new objects based on the best rule applied or on some predefined scoring of multiple rules. In this paper we propose a new technique where the system automatically learns how to use the rules. We achieve this by developing a two-stage classification model. First, we use association rule mining to discover classification rules. Second, we employ another learning algorithm to learn how to use these rules in the prediction process. Our two-stage approach outperforms C4.5 and RIPPER on the UCI datasets in our study, and outperforms other rule-learning methods on more than half the datasets. The versatility of our method is also demonstrated by applying it to text classification, where it equals the performance of the best-known systems for this task, SVMs.
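The two-stage idea in this abstract can be sketched as follows. The abstract does not specify the rule miner or the second-stage learner, so both are assumptions: brute-force enumeration of small itemsets stands in for a real association rule miner, and a perceptron stands in for the second learning algorithm. The toy data and thresholds are hypothetical.

```python
from itertools import combinations

def mine_rules(rows, labels, min_support=0.2, min_confidence=0.6, max_len=2):
    """Stage 1: keep rules (itemset -> class) that meet both thresholds."""
    n = len(rows)
    items = sorted(set().union(*rows))
    rules = []
    for k in range(1, max_len + 1):
        for itemset in combinations(items, k):
            antecedent = frozenset(itemset)
            covered = [lab for row, lab in zip(rows, labels) if antecedent <= row]
            if not covered:
                continue
            for cls in sorted(set(covered)):
                hits = covered.count(cls)
                if hits / n >= min_support and hits / len(covered) >= min_confidence:
                    rules.append((antecedent, cls))
    return rules

def rule_features(row, rules):
    # One binary feature per rule: 1 if the rule's antecedent is contained in the row.
    return [1 if antecedent <= row else 0 for antecedent, _ in rules]

def train_perceptron(X, y, epochs=20):
    """Stage 2: learn one weight per rule plus a bias; y holds +1 or -1."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        for x, target in zip(X, y):
            xb = x + [1]  # append the bias input
            score = sum(wi * xi for wi, xi in zip(w, xb))
            if target * score <= 0:  # misclassified or on the boundary
                w = [wi + target * xi for wi, xi in zip(w, xb)]
    return w

def predict(row, rules, w, pos, neg):
    xb = rule_features(row, rules) + [1]
    return pos if sum(wi * xi for wi, xi in zip(w, xb)) > 0 else neg

# Hypothetical toy data: each row is a set of observed items.
rows = [{"fever", "cough"}, {"fever", "rash"}, {"cough"},
        {"rash"}, {"fever", "cough", "rash"}, {"cough", "rash"}]
labels = ["flu", "flu", "other", "other", "flu", "other"]

rules = mine_rules(rows, labels)
X = [rule_features(r, rules) for r in rows]
y = [1 if lab == "flu" else -1 for lab in labels]
weights = train_perceptron(X, y)
prediction = predict({"fever"}, rules, weights, "flu", "other")
```

The point of the second stage is that the weight given to each rule is learned from data rather than fixed in advance by a "best rule" or hand-set scoring scheme.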