
    Copula based prediction models: an application to an aortic regurgitation study

    Background: An important issue in prediction modeling of multivariate data is how to measure the dependence structure. Pearson's correlation has several pitfalls as a dependence measure, so regression prediction models based on it may not be appropriate. As an alternative, a copula-based methodology for prediction modeling and an algorithm to simulate data are proposed. Methods: Copulas are introduced as an alternative to the correlation coefficient commonly used as a measure of dependence. An algorithm based on the marginal distributions of random variables is applied to construct the Archimedean copulas. Monte Carlo simulations are carried out to replicate datasets, estimate prediction model parameters and validate them using Lin's concordance measure. Results: We carried out a correlation-based regression analysis on pre-operative and post-operative ejection fractions from 20 patients aged 17–82 years who underwent surgery, and estimated the prediction model: Post-operative ejection fraction = -0.0658 + 0.8403 × (Pre-operative ejection fraction); p = 0.0008; 95% confidence interval for the slope coefficient (0.3998, 1.2808). The exploratory data analysis shows that both the pre-operative and post-operative ejection fraction measurements depart slightly from symmetry and are skewed to the left. The measurements also tend to be widely spread and have shorter tails than the normal distribution. Therefore, predictions made from the correlation-based model for pre-operative ejection fraction measurements in the lower range may not be accurate. Furthermore, q-q plots indicate that the marginal distributions of pre-operative and post-operative ejection fractions are best approximated by gamma distributions. The copula-based prediction model is estimated as: Post-operative ejection fraction = -0.0933 + 0.8907 × (Pre-operative ejection fraction); p = 0.00008; 95% confidence interval for the slope coefficient (0.4810, 1.3003). The two models differ considerably in their predicted post-operative ejection fractions in the lower range of pre-operative ejection fraction measurements, and the prediction errors of the copula model are smaller. To validate the copula methodology, we drew fifty independent bootstrap samples with replacement and estimated concordance statistics of 0.7722 (p = 0.0224) for the copula model and 0.7237 (p = 0.0604) for the correlation model. The predicted and observed measurements are concordant for both models. The estimates of the accuracy components are 0.9233 and 0.8654 for the copula and correlation models respectively. Conclusion: Copula-based prediction modeling is demonstrated to be an appropriate alternative to conventional correlation-based prediction modeling, since correlation-based prediction models are not appropriate for modeling the dependence in populations with asymmetrical tails. The proposed copula-based prediction model has been validated using independent bootstrap samples.
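
The abstract validates both models with Lin's concordance measure and reports an accuracy component for each. A minimal sketch of how such a statistic can be computed follows. The function name `lin_ccc` is an assumption for illustration, the 1/n moment estimators follow Lin's usual definition, and nothing here reproduces the study's data or its bootstrap procedure.

```python
import numpy as np

def lin_ccc(observed, predicted):
    """Return (concordance, Pearson correlation, accuracy component).

    Illustrative helper, not the authors' code. Uses 1/n (population)
    moments. The accuracy component is the bias-correction factor C_b,
    so that concordance = Pearson correlation * C_b.
    Assumes both inputs have non-zero variance.
    """
    x = np.asarray(observed, dtype=float)
    y = np.asarray(predicted, dtype=float)
    mean_x, mean_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()          # ddof=0 by default
    cov_xy = np.mean((x - mean_x) * (y - mean_y))

    ccc = 2.0 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)
    pearson = cov_xy / np.sqrt(var_x * var_y)
    accuracy = ccc / pearson
    return ccc, pearson, accuracy

if __name__ == "__main__":
    # Hypothetical observed and predicted ejection fractions (made up).
    observed = [0.35, 0.42, 0.50, 0.58, 0.63]
    predicted = [0.33, 0.45, 0.49, 0.55, 0.66]
    print(lin_ccc(observed, predicted))
```

Splitting the statistic into a precision part (Pearson's correlation) and an accuracy part shows whether disagreement comes from scatter or from a systematic shift between predicted and observed values.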

    RISK FACTORS ASSOCIATED WITH CULLING AGE IN DAIRY CATTLE: APPLICATIONS OF FRAILTY MODELS

    Culling decisions for dairy cattle are an important component of dairy herd management. In studies of risk factors for culling, farms (clusters) constitute the sampling units, so we believe that ages at culling may be correlated within farms. A score test showed that the null hypothesis of no extra-variation in survival data was not supported by age-at-culling data collected from 72 dairy farms in the province of Ontario, Canada. To correct for the intraherd correlation, three modelling approaches were used to fit the data: Population-Averaged (PA), Cluster-Specific (CS), and Random Effects Models (RAEM). The modelling approaches are described and compared using the dairy cow culling data.

    Estimation of Parent-Sib Correlations for Quantitative Traits Using the Linear Mixed Regression Model: Applications to Arterial Blood Pressures Data Collected From Nuclear Families

    A fundamental question in quantitative genetics is whether observed variation in the phenotypic values of a particular trait is due to environmental or to biological factors. The proportion of variation attributed to genetic factors is known as the heritability of the trait. The term is often used in reference to the resemblance between parents and their offspring. In this context, high heritability implies a strong resemblance between parents and offspring with regard to a specific trait, while low heritability implies a low level of resemblance. While many applications measure the resemblance of offspring to their parents using the mid-parental value of a quantitative trait of interest as an input parameter, others focus on estimating maternal and paternal heritability. In this paper we address the problem of estimating parental heritability using the nuclear family as the unit of analysis. We derive moment and maximum likelihood estimators of parental heritability, and test their equality using the likelihood ratio test and the delta method. We also use Fieller's interval on the ratio of parental heritabilities to address the question of bioequivalence. The methods are illustrated on published arterial blood pressure data collected from nuclear families.

    Refugee status in the Arab and Islamic tradition: A comparative study of Jiwar, Aman and the 1951 Geneva Convention relating to the status of refugees.

    The aim of this study is to clarify the position of the Islamic tradition with regard to refugees, based on the main Islamic Sunni sources, and to examine the interface between this tradition and the 1951 Geneva Convention relating to the status of refugees. This study is the first to carry out such an examination since the endorsement of the 1951 Convention. It is composed of four chapters with an introduction and a conclusion. The first chapter explains the concept of jiwar (protection), which was a governing custom in the Arabs' life in the jahiliyya, while the second chapter traces the concept of jiwar after the advent of Islam in Mecca. The purpose of the two chapters is to establish how the Prophet and his followers dealt with the jiwar custom when they were oppressed and sought jiwar from non-Muslims, and also when they were able to offer jiwar to fleeing non-Muslims in Medina. The third chapter deals mainly with aman (safe conduct) in the Islamic tradition. It also defines several relevant terms, such as dar al-harb, dar al-Islam, mustajir, muhajir, musta'min and dhimmi, in order to put the concept of aman in context. Due to its particular significance, the study undertakes an extensive examination of the different interpretations of the verse (9:6), which is considered the cornerstone in legalising, by analogy, the concept of refuge in the Qur'an. The fourth and final chapter comprises a comparison between the Islamic tradition relating to the laws of aman and the 1951 Geneva Convention relating to the status of refugees. The conclusion highlights the close similarities between the Islamic tradition and the Geneva Convention and therefore recommends that the Arab and Islamic governments endorse the 1951 Geneva Convention relating to the status of refugees and, if necessary, make reservations concerning certain Articles, taking account of the internal circumstances of each state.

    Interval estimation and optimal design for the within-subject coefficient of variation for continuous and binary variables

    BACKGROUND: In this paper we propose the use of the within-subject coefficient of variation as an index of a measurement's reliability. For continuous variables, and based on its maximum likelihood estimation, we derive a variance-stabilizing transformation and discuss confidence interval construction within the framework of a one-way random effects model. We investigate sample size requirements for the within-subject coefficient of variation for continuous and binary variables. METHODS: We investigate the validity of the approximate normal confidence interval by Monte Carlo simulations. In designing a reliability study, a crucial issue is the balance between the number of subjects to be recruited and the number of repeated measurements per subject. We discuss efficiency of estimation and cost considerations for the optimal allocation of the sample resources. The approach is illustrated by an example on Magnetic Resonance Imaging (MRI). We also discuss the issue of sample size estimation for dichotomous responses with two examples. RESULTS: For the continuous variable, we found that the variance-stabilizing transformation improves the asymptotic coverage probabilities of the confidence interval for the within-subject coefficient of variation. For the binary variable, the maximum likelihood estimation and the sample size estimation based on a pre-specified width of the confidence interval are novel contributions to the literature. CONCLUSION: Using the sample size formulas, we hope to help clinical epidemiologists and practicing statisticians to efficiently design reliability studies using the within-subject coefficient of variation, whether the variable of interest is continuous or binary.
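
For orientation, the within-subject coefficient of variation under a one-way random effects model is the ratio of the within-subject standard deviation to the overall mean. The sketch below assumes a balanced design (every subject measured the same number of times) and uses the within-subject mean square as a plug-in variance estimate. The function name `within_subject_cv` and the data layout are illustrative assumptions, and the paper's variance-stabilizing transformation and interval formulas are not reproduced.

```python
import numpy as np

def within_subject_cv(measurements):
    """Plug-in estimate of the within-subject coefficient of variation.

    `measurements` is a 2-D array-like with one row per subject and one
    column per repeated measurement (balanced design assumed).
    Illustrative helper, not the authors' estimator.
    """
    x = np.asarray(measurements, dtype=float)
    if x.ndim != 2 or x.shape[1] < 2:
        raise ValueError("need a subjects x repeats array with >= 2 repeats")
    n_subjects, n_repeats = x.shape

    grand_mean = x.mean()
    subject_means = x.mean(axis=1, keepdims=True)
    # Within-subject mean square from the one-way ANOVA decomposition.
    ss_within = ((x - subject_means) ** 2).sum()
    ms_within = ss_within / (n_subjects * (n_repeats - 1))
    return np.sqrt(ms_within) / grand_mean

if __name__ == "__main__":
    # Hypothetical repeated readings for four subjects (made up).
    readings = [[10.1, 9.8, 10.3], [12.0, 12.4, 11.9],
                [8.7, 9.1, 8.9], [11.2, 10.8, 11.0]]
    print(within_subject_cv(readings))
```

The trade-off between rows (subjects) and columns (repeats) in this layout is the allocation question the abstract raises.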

    Likelihood Ratio Testing for Admixture Models with Application to Genetic Linkage Analysis

    We consider likelihood ratio tests (LRT) and their modifications for homogeneity in admixture models. The admixture model is a special case of the two-component mixture model, where one component is indexed by an unknown parameter while the parameter value for the other component is known. It has been widely used in genetic linkage analysis under heterogeneity, in which the kernel distribution is binomial. For such models, it has long been recognized that testing for homogeneity is nonstandard and that the LRT statistic does not converge to a conventional χ² distribution. In this paper, we investigate the asymptotic behavior of the LRT for general admixture models and show that its limiting distribution is equivalent to the supremum of a squared Gaussian process. We also provide insights on the connection and comparison between the LRT and alternative approaches in the literature, mostly modifications of LRT and score tests, including the modified or penalized LRT (Fu et al., 2006). The LRT is an omnibus test that is powerful against general alternative hypotheses. In contrast, alternative approaches may be slightly more powerful against certain types of alternatives, but much less powerful against other types. Our results are illustrated by simulation studies and an application to a genetic linkage study of schizophrenia.
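
The setup described in the abstract can be written out as follows. The symbols $\alpha$, $\theta$, $\theta_0$ and $\ell_n$ are notation assumed here for illustration and need not match the paper's own.

```latex
% Admixture density: the first component has a known parameter \theta_0,
% the second is indexed by an unknown parameter \theta.
\[
  f(x;\alpha,\theta) = (1-\alpha)\, f(x;\theta_0) + \alpha\, f(x;\theta),
  \qquad 0 \le \alpha \le 1 .
\]

% Homogeneity: the data come from the known component alone.
\[
  H_0 :\ \alpha = 0 \quad \text{or} \quad \theta = \theta_0 .
\]

% Log-likelihood and likelihood ratio statistic for a sample x_1, \dots, x_n.
\[
  \ell_n(\alpha,\theta) = \sum_{i=1}^{n} \log f(x_i;\alpha,\theta),
  \qquad
  \mathrm{LRT}_n = 2\Bigl\{ \sup_{\alpha,\theta} \ell_n(\alpha,\theta)
                            - \ell_n(0,\theta_0) \Bigr\}.
\]
```

Under $H_0$ the pair $(\alpha,\theta)$ is not identifiable, since $\alpha = 0$ with any $\theta$ gives the same density as $\theta = \theta_0$ with any $\alpha$. This is one way to see why the test is nonstandard.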

    Modeling and analysis of disease and risk factors through learning Bayesian networks from observational data

    This paper focuses on identification of the relationships between a disease and its potential risk factors using Bayesian networks in an epidemiologic study, with the emphasis on integrating medical domain knowledge and statistical data analysis. An integrated approach is developed to identify the risk factors associated with patients' occupational histories and is demonstrated using real-world data. This approach includes several steps. First, raw data are preprocessed into a format that is acceptable to the learning algorithms of Bayesian networks. Some important considerations are discussed to address the uniqueness of the data and the challenges of the learning. Second, a Bayesian network is learned from the preprocessed data set by integrating medical domain knowledge and generic learning algorithms. Third, the relationships revealed by the Bayesian network are used for risk factor analysis, including identification of a group of people who share certain common characteristics and have a relatively high probability of developing the disease, and prediction of a person's risk of developing the disease given information on his/her occupational history. Copyright © 2007 John Wiley & Sons, Ltd. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/58076/1/893_ftp.pd
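
As a rough illustration of the third step, the sketch below estimates one conditional probability table for a fixed two-node structure (exposure → disease) by relative frequencies and reads off a risk. The structure, the variable names `solvent_exposure` and `disease`, the helper `fit_cpt` and all records are hypothetical. The paper's actual variables, learning algorithms and data are not given in the abstract.

```python
from collections import Counter

def fit_cpt(records, parent, child):
    """Estimate P(child | parent) by relative frequencies.

    Returns a dict keyed by (parent_value, child_value).
    Illustrative helper for a fixed parent -> child edge only; it does
    not learn network structure.
    """
    pair_counts = Counter((r[parent], r[child]) for r in records)
    parent_counts = Counter(r[parent] for r in records)
    return {
        (p, c): count / parent_counts[p]
        for (p, c), count in pair_counts.items()
    }

# Hypothetical preprocessed records, one dict per person (made up).
records = (
    [{"solvent_exposure": True, "disease": True}] * 3
    + [{"solvent_exposure": True, "disease": False}] * 1
    + [{"solvent_exposure": False, "disease": True}] * 1
    + [{"solvent_exposure": False, "disease": False}] * 3
)

cpt = fit_cpt(records, "solvent_exposure", "disease")
risk_if_exposed = cpt[(True, True)]      # P(disease | exposed)
risk_if_unexposed = cpt[(False, True)]   # P(disease | not exposed)
```

A full network would chain several such tables, with domain knowledge constraining which edges are allowed.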