31 research outputs found

    Effective Sample Size: Quick Estimation of the Effect of Related Samples in Genetic Case-Control Association Analyses

    Correlated samples have frequently been avoided in case-control genetic association studies, in part because the methods for handling them are either not easily implemented or not widely known. We advocate one method for case-control association analysis of correlated samples, the effective sample size method, as a simple and accessible approach that does not require specialized computer programs. The effective sample size method captures the variance inflation of allele frequency estimation exactly, and can be used to modify the chi-square test statistic, p-value, and 95% confidence interval of the odds ratio simply by replacing the apparent allele counts with the effective ones. For genotype frequency estimation, although a single effective sample size cannot completely characterize the variance inflation, an averaged one satisfactorily approximates the simulated result. The effective sample size method is applied to the rheumatoid arthritis sibling data collected by the North American Rheumatoid Arthritis Consortium (NARAC) to establish a significant association with the interferon-induced helicase 1 gene (IFIH1), previously identified as a type 1 diabetes susceptibility locus. Connections between the effective sample size method and other methods, such as generalized estimating equations, variance of eigenvalues for correlation matrices, and genomic control, are also discussed.
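
The idea of replacing apparent allele counts with effective ones can be illustrated with a minimal sketch. It assumes the common definition of effective sample size for an equally weighted mean, n_eff = n^2 / (sum of all entries of the correlation matrix), and a plain Pearson chi-square on a 2x2 allele table; the abstract does not spell out its formulas, so the function names, the sibling correlation of 0.5, and the counts below are all illustrative assumptions rather than the authors' implementation.

```python
import math

def effective_sample_size(corr):
    """Assumed definition: n_eff = n^2 / sum of all correlation-matrix entries."""
    n = len(corr)
    total = sum(sum(row) for row in corr)
    return n * n / total

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def p_value_1df(chi2):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

# Hypothetical data: cases are sib pairs (assumed correlation 0.5),
# controls are unrelated individuals.
sib_pair = [[1.0, 0.5],
            [0.5, 1.0]]
shrink = effective_sample_size(sib_pair) / len(sib_pair)  # effective / apparent

case_minor, case_major = 180, 220   # apparent allele counts in cases (made up)
ctrl_minor, ctrl_major = 140, 260   # allele counts in unrelated controls (made up)

naive = chi_square_2x2(case_minor, case_major, ctrl_minor, ctrl_major)
adjusted = chi_square_2x2(case_minor * shrink, case_major * shrink,
                          ctrl_minor, ctrl_major)

print("naive chi2:", naive, "p:", p_value_1df(naive))
print("adjusted chi2:", adjusted, "p:", p_value_1df(adjusted))
```

Only the case counts are shrunk here because only the cases are related in this made-up example; the allele frequencies are unchanged, so the adjustment affects the variance and therefore the test statistic, not the point estimate.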

    Understanding MMPI-2 response structure between schizophrenia and healthy individuals

    Background: Using Minnesota Multiphasic Personality Inventory-2 (MMPI-2) clinical scales to evaluate clinical symptoms in schizophrenia is a well-studied topic. Nonetheless, research has focused less on how these clinical scales interact with each other. Aims: To investigate the network structure and interaction of the MMPI-2 clinical scales in healthy individuals and patients with schizophrenia using Bayesian networks. Method: Data were collected from Wuhan Psychiatric Hospital from March 2008 to May 2018. A total of 714 patients with schizophrenia and 714 healthy subjects were identified through propensity score matching according to the criteria of the International Classification of Diseases (ICD-11). Separate Bayesian networks of the MMPI-2 clinical scales were built for healthy subjects and for patients with schizophrenia. Results: The Bayesian network showed that a lower 7 scale was a consequence of the correlation between a lower 2 scale and a higher 8 scale. A lower 7 scale alone yields neither a lower 2 scale nor a higher 8 scale. The proposed method showed 72% accuracy with an area under the ROC curve (AUC) of 78%, similar to previous studies. Limitations: The proposed method simplified the continuous Bayesian network to predict binary outcomes; the inclusion of other categorical data was not explored. In addition, the participants may represent only a local population, as they come from a single hospital. Conclusion: This study identified correlations among the MMPI-2 clinical scales and built separate Bayesian networks to investigate the differences between patients with schizophrenia and healthy people. These differences may contribute to a better understanding of the clinical symptoms of schizophrenia and provide medical professionals with new perspectives for diagnosis.

    New stopping criteria for segmenting DNA sequences

    We propose a solution to the problem of choosing a stopping criterion when segmenting inhomogeneous DNA sequences with complex statistical patterns. This new stopping criterion is based on the Bayesian Information Criterion (BIC) in the model selection framework. When this stopping criterion is applied to a left telomere sequence of the yeast Saccharomyces cerevisiae and to the complete genome sequence of the bacterium Escherichia coli, borders of biologically meaningful units are identified (e.g. subtelomeric units, replication origin, and replication terminus), and a more reasonable number of domains is obtained. We also introduce a measure called segmentation strength, which can be used to control the delineation of large domains. The relationship between the average domain size and the threshold of segmentation strength is determined for several genome sequences.
    Comment: 4 pages, 4 figures, Physical Review Letters, to appear
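
A BIC-based stopping rule for recursive segmentation can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's code: each segment is modelled as i.i.d. bases, a split is assumed to add K parameters for a K-letter alphabet (K - 1 extra base frequencies plus one cut position), and "segmentation strength" is taken here to be the relative excess of the likelihood gain over the BIC penalty. The function names and the brute-force search are hypothetical.

```python
import math
from collections import Counter

def log_likelihood(seq):
    """Maximised log-likelihood of an i.i.d. base-composition model."""
    n = len(seq)
    return sum(c * math.log(c / n) for c in Counter(seq).values())

def best_split(seq, alphabet_size=4):
    """Return (cut position, strength) for the best two-way split, or (None, None)."""
    n = len(seq)
    if n < 2:
        return None, None
    whole = log_likelihood(seq)
    # Assumed BIC penalty: (extra parameters) * ln(N), with K extra parameters.
    penalty = alphabet_size * math.log(n)
    best_pos, best_gain = None, -math.inf
    for i in range(1, n):  # brute force, O(n^2) overall; fine for a sketch
        gain = 2.0 * (log_likelihood(seq[:i]) + log_likelihood(seq[i:]) - whole)
        if gain > best_gain:
            best_pos, best_gain = i, gain
    strength = (best_gain - penalty) / penalty
    return best_pos, strength

def segment(seq, threshold=0.0, offset=0):
    """Recursively split while strength exceeds threshold (0 = plain BIC rule)."""
    pos, strength = best_split(seq)
    if pos is None or strength <= threshold:
        return []
    left = segment(seq[:pos], threshold, offset)
    right = segment(seq[pos:], threshold, offset + pos)
    return left + [offset + pos] + right

print(segment("A" * 50 + "G" * 50))
```

Raising `threshold` above zero demands a stronger signal before a cut is accepted, which is one way to restrict the output to larger domains.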

    Two Glycosylation Sites in H5N1 Influenza Virus Hemagglutinin That Affect Binding Preference by Computer-Based Analysis

    Increasing numbers of H5N1 influenza viruses (IVs) are responsible for human deaths, especially in North Africa and Southeast Asia. The binding of hemagglutinin (HA) on the viral surface to host sialic acid (SA) receptors is a requisite step in the infection process. Phylogenetic analysis reveals that H5N1 viruses can be divided into 10 clades based on their HA sequences, with most human IVs concentrated in clade 1 and clades 2.1 to 2.3. Protein sequence alignment across clades indicates that the high conservation in the receptor-binding domains (RBDs) is essential for binding to the SA receptor. Two glycosylation sites, 158N and 169N, also participate in receptor recognition. In the present work, we constructed a series of H5N1 HA models, including diversely glycosylated HAs, to simulate the binding process with various SA receptors in silico. As the SA-α-2,3-Gal and SA-α-2,6-Gal receptors adopt two distinctive topologies, straight and fishhook-like, respectively, the presence of N-glycans at 158N would decrease the affinity of HA for all of the receptors, particularly SA-α-2,6-Gal analogs. The steric clashes of the large glycans at the other glycosylation site, 169N, located on an adjacent HA monomer, would be more effective in preventing the binding of SA-α-2,3-Gal analogs.