810 research outputs found

    A Strategy analysis for genetic association studies with known inbreeding

    Background: Association studies aim to identify the genetic variants related to a specific disease through statistical multiple hypothesis testing or segregation analysis in pedigrees. Such studies have been very successful for Mendelian monogenic disorders but less successful in identifying genetic variants related to complex diseases, where onset depends on interactions between different genes and the environment. Current technology allows more than a million markers to be genotyped, and this number has increased rapidly in recent years with imputation based on template sets and whole genome sequencing. This type of data introduces a great amount of noise into the statistical analysis and usually requires a large number of samples. Current methods seldom take into account gene-gene and gene-environment interactions, which are fundamental, especially in complex diseases. In this paper we propose a non-parametric additive model, which accounts for interactions of unknown order, to detect the genetic variants related to diseases. Although this is not new to the current literature, we show that in an isolated population, where the most related subjects also share most of their genetic code, the use of additive models may be improved if the available genealogical tree is taken into account. Specifically, we form a sample of cases and controls with the highest inbreeding by means of the Hungarian method, and estimate the set of genes/environmental variables associated with the disease by means of Random Forest. Results: We have evidence, from statistical theory, simulations and two applications, that the procedure we built eliminates stratification between cases and controls and has enough precision to identify genetic variants responsible for a disease. This procedure has been successfully applied to beta-thalassemia, a well-known Mendelian disease, and to common asthma, where we identified candidate genes underlying susceptibility to asthma. Some of these candidate genes have also been reported as related to common asthma in the current literature. Conclusions: The data analysis approach, based on selecting the most related cases and controls along with the Random Forest model, is a powerful tool for detecting genetic variants associated with a disease in isolated populations. Moreover, this method also provides a prediction model that estimates unknown disease status accurately and that can be used more generally to build test kits for a wide class of Mendelian diseases.
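
The case-control selection step described in this abstract can be read as an assignment problem: pair each case with a distinct control so that total relatedness is as large as possible. The sketch below illustrates only that objective. It assumes a small, hypothetical kinship matrix and uses exhaustive search rather than the Hungarian method itself, which reaches the same optimum in polynomial time; the names `kinship` and `best_matching` are illustrative and do not come from the paper.

```python
from itertools import permutations

# Hypothetical kinship coefficients: rows are cases, columns are controls.
# A higher value means the pair is more closely related.
kinship = [
    [0.25, 0.05, 0.10],
    [0.06, 0.20, 0.02],
    [0.12, 0.03, 0.15],
]


def best_matching(scores):
    """Return (pairs, total) for the one-to-one matching with maximal total score.

    Exhaustive search over all permutations, so only feasible for tiny inputs.
    The Hungarian method solves the same problem efficiently.
    """
    n = len(scores)
    best_perm, best_total = None, float("-inf")
    for perm in permutations(range(n)):
        total = sum(scores[i][perm[i]] for i in range(n))
        if total > best_total:
            best_perm, best_total = perm, total
    return [(i, best_perm[i]) for i in range(n)], best_total


pairs, total = best_matching(kinship)
print(pairs, round(total, 2))
```

For realistic sample sizes the permutation loop would be replaced by a proper Hungarian-method solver; the objective being maximised stays the same.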

    Lung cancer mortality in a cohort of workers in a petrochemical plant: occupational or residential risk?

    The Gela area is a polluted site in Italy that qualifies for remediation because of widespread contamination from a petrochemical complex. This study investigates mortality and morbidity in the cohort of employees of the Gela petrochemical plant with the aim of disentangling the health effects of work and residence. Work experience was classified in terms of job title, while an ad hoc mobility model was applied to define qualitative categories of residence in Gela: probable residents and probable commuters. The mortality rate ratio for lung cancer was 1.60 (90% CI 1.01-2.53) in workers who were probable residents compared to probable commuters. For the same comparison, the Hospital Discharge Prevalence Ratio for COPD was 1.39 (0.94-2.07). The crude categories of work and residence limit the interpretation of the causal nature of the study results. Despite several limitations, the results for respiratory pathologies are compatible with an etiological role of the documented contamination. The purpose of the present study is to examine the role of environmental (non-occupational) exposures in lung cancer risk among workers at a large petrochemical plant built in 1960 on the Sicilian coast in the immediate vicinity of the town of Gela, Italy. The cohort included workers employed in the Gela petrochemical plant in 1960-1993. We looked at mortality rates for the period 1960-2002. An internal comparison was performed between two categories of workers with different likelihood of residence in Gela during the period of employment. The rate ratio of mortality from lung cancer comparing "probable residents" with "possible non residents," adjusted for age, calendar period, and job classification (only blue collar, only white collar and both), was 1.66 (90% Confidence Interval 1.07-2.58). Although the information collected is quite sparse and no inferences can be made about risk sources, the results show a possible excess residential/environmental risk of lung cancer mortality among those workers more likely to have been residents in Gela.
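
To make the reported rate ratios and their 90% confidence intervals easier to read, here is a minimal sketch of a crude (unadjusted) rate ratio with a Wald interval on the log scale. The event counts and person-years are hypothetical, and the estimates quoted in the abstract were adjusted for age, calendar period and job classification, which this sketch does not attempt to reproduce.

```python
import math


def rate_ratio_ci(events_a, person_years_a, events_b, person_years_b, z=1.645):
    """Crude rate ratio (group A vs group B) with a Wald interval on the log scale.

    z=1.645 gives an approximate two-sided 90% interval.
    """
    rr = (events_a / person_years_a) / (events_b / person_years_b)
    # Standard error of log(RR) under a Poisson assumption for the event counts.
    se_log = math.sqrt(1 / events_a + 1 / events_b)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Hypothetical counts, not taken from the study.
rr, lower, upper = rate_ratio_ci(30, 10_000, 20, 10_000)
print(f"RR = {rr:.2f}, 90% CI {lower:.2f}-{upper:.2f}")
```

An interval whose lower bound sits just above 1, as in the abstract, indicates an excess that is only marginally distinguishable from no effect at the 90% level.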

    Age- and sex-related variations in platelet count in Italy: a proposal of reference ranges based on 40987 subjects' data

    BACKGROUND AND OBJECTIVES: Although several studies have demonstrated that platelet count is higher in women, decreases with age, and is influenced by genetic background, most clinical laboratories still use the reference interval 150-400 ×10^9 platelets/L for all subjects. The present study aimed to identify age- and sex-specific reference intervals for platelet count. METHODS: We analysed electronic records of subjects enrolled in three population-based studies that investigated inhabitants of seven Italian areas, including six geographic isolates. After exclusion of patients with malignancies, liver diseases, or inherited thrombocytopenias, which could affect platelet count, reference intervals were estimated from 40,987 subjects with the non-parametric method, computing the 2.5th and 97.5th percentiles. RESULTS: Platelet count was similar in men and women until the age of 14, but subsequently women had steadily more platelets than men. The number of platelets decreases quickly in childhood, stabilizes in adulthood, and decreases further in old age. As a result, platelet count in old age was reduced by 35% in men and by 25% in women compared with early infancy. Based on these findings, we estimated reference intervals for platelet count (×10^9/L) in children (176-452), adult men (141-362), adult women (156-405), old men (122-350) and old women (140-379). Moreover, we calculated an extended reference interval that takes into account the differences in platelet count observed in different geographic areas. CONCLUSIONS: The age-, sex-, and origin-related variability of platelet count is very wide, and the patient-adapted reference intervals we propose change the thresholds for diagnosing both thrombocytopenia and thrombocytosis in Italy.
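
The non-parametric reference interval mentioned in the METHODS can be sketched as follows. This is an illustration under stated assumptions: the data are synthetic, the function names are hypothetical, and linear interpolation between order statistics is one of several percentile conventions (the abstract does not say which one was used).

```python
def percentile(sorted_values, p):
    """Percentile p (0-100) by linear interpolation between order statistics."""
    if not sorted_values:
        raise ValueError("no data")
    rank = (p / 100) * (len(sorted_values) - 1)
    lo = int(rank)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = rank - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])


def reference_interval(values):
    """Central 95% interval: the 2.5th and 97.5th percentiles of the sample."""
    ordered = sorted(values)
    return percentile(ordered, 2.5), percentile(ordered, 97.5)


# Synthetic platelet counts (x10^9/L), evenly spaced from 100 to 500.
counts = list(range(100, 501))
low, high = reference_interval(counts)
print(round(low), round(high))
```

Age- and sex-specific intervals would follow by calling `reference_interval` separately on each subgroup.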

    MATCHCLIP: locate precise breakpoints for copy number variation using CIGAR string by matching soft clipped reads

    Copy number variations (CNVs) are associated with many complex diseases. Next generation sequencing data enable one to identify precise CNV breakpoints to better understand the underlying molecular mechanisms and to design more efficient assays. Using the CIGAR strings of the reads, we develop a method that can identify the exact CNV breakpoints, and in cases when the breakpoints are in a repeated region, the method reports a range where the breakpoints can slide. Our method identifies the breakpoints of a CNV using both the positions and CIGAR strings of the reads that cover breakpoints of a CNV. A read with a long soft clipped part (denoted as S in CIGAR) at its 3′ (right) end can be used to identify the 5′ (left) side of the breakpoints, and a read with a long S part at the 5′ end can be used to identify the breakpoint at the 3′ side. To ensure both types of reads cover the same CNV, we require the overlapped common string to include both of the soft clipped parts. When a CNV starts and ends in the same repeated regions, its breakpoints are not unique, in which case our method reports the leftmost positions for the breakpoints and a range within which the breakpoints can be incremented without changing the variant sequence. We have implemented the methods in a C++ package intended for the current Illumina MiSeq and HiSeq platforms for both whole genome and exon-sequencing. Our simulation studies have shown that our method compares favorably with other similar methods in terms of true discovery rate, false positive rate and breakpoint accuracy. Our results from a real application have shown that the detected CNVs are consistent with zygosity and read depth information. The software package is available at http://statgene.med.upenn.edu/softprog.html
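
As a rough illustration of how a soft-clipped read points at a breakpoint, the sketch below parses a CIGAR string and reports a candidate position. It is not the MATCHCLIP implementation (which is a C++ package and also matches the overlapping sequence of the two read types); the function names, the `min_clip` threshold and the example reads are assumptions made for this example. Positions follow the SAM convention of a 1-based leftmost aligned position.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # operations that advance along the reference


def parse_cigar(cigar):
    """Split a CIGAR string such as '60M40S' into [(60, 'M'), (40, 'S')]."""
    if not re.fullmatch(r"(\d+[MIDNSHP=X])+", cigar):
        raise ValueError(f"malformed CIGAR: {cigar!r}")
    return [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]


def soft_clip_breakpoint(pos, cigar, min_clip=10):
    """Return ('left', ref_pos), ('right', ref_pos), or None.

    A long soft clip at the right end of the read marks the left side of a
    breakpoint (last aligned base); a long soft clip at the left end marks
    the right side (first aligned base).
    """
    ops = [(n, op) for n, op in parse_cigar(cigar) if op != "H"]
    if not ops:
        return None
    if ops[-1][1] == "S" and ops[-1][0] >= min_clip:
        aligned = sum(n for n, op in ops if op in REF_CONSUMING)
        return ("left", pos + aligned - 1)
    if ops[0][1] == "S" and ops[0][0] >= min_clip:
        return ("right", pos)
    return None


print(soft_clip_breakpoint(1000, "60M40S"))
print(soft_clip_breakpoint(2000, "35S65M"))
```

Pairing a "left" read with a "right" read, and checking that their overlapping sequence covers both clipped parts, is the step that ties the two candidates to the same CNV.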

    Participation bias in the UK Biobank distorts genetic associations and downstream analyses

    While volunteer-based studies such as the UK Biobank have become the cornerstone of genetic epidemiology, the participating individuals are rarely representative of their target population. To evaluate the impact of selective participation, here we derived UK Biobank participation probabilities on the basis of 14 variables harmonized across the UK Biobank and a representative sample. We then conducted weighted genome-wide association analyses on 19 traits. Comparing the output from weighted genome-wide association analyses (n_effective = 94,643 to 102,215) with that from standard genome-wide association analyses (n = 263,464 to 283,749), we found that increasing representativeness led to changes in SNP effect sizes and identified novel SNP associations for 12 traits. While heritability estimates were less impacted by weighting (maximum change in h², 5%), we found substantial discrepancies for genetic correlations (maximum change in r_g, 0.31) and Mendelian randomization estimates (maximum change in β_STD, 0.15) for socio-behavioural traits. We urge the field to increase representativeness in biobank samples, especially when studying genetic correlates of behaviour, lifestyles and social outcomes.
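
The gap between the effective and nominal sample sizes quoted above reflects the cost of weighting. The abstract does not state how the effective sample size was computed; the sketch below assumes Kish's approximation, a common choice, with weights taken as inverse participation probabilities. The probabilities are hypothetical.

```python
def kish_effective_n(weights):
    """Kish's effective sample size: (sum of weights)^2 / sum of squared weights."""
    total = sum(weights)
    squares = sum(w * w for w in weights)
    if squares == 0:
        raise ValueError("weights must not all be zero")
    return total * total / squares


# Hypothetical participation probabilities for four individuals.
participation_prob = [0.5, 0.5, 0.25, 0.1]
weights = [1 / p for p in participation_prob]  # inverse probability weights

n_eff = kish_effective_n(weights)
print(f"nominal n = {len(weights)}, effective n = {n_eff:.2f}")
```

With equal weights the effective size equals the nominal size; the more unequal the weights, the further it falls.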

    Microsatellites and SNPs linkage analysis in a Sardinian genetic isolate confirms several essential hypertension loci previously identified in different populations

    Background. A multiplicity of study designs, such as candidate gene analysis, genome wide search (GWS) and, recently, whole genome association studies, have been employed to identify the genetic components of essential hypertension (EH). Several genome-wide linkage studies of EH and blood pressure-related phenotypes demonstrate that there is no single locus with a major effect, while several genomic regions likely to contain EH-susceptibility loci were validated by multiple studies. Methods. We carried out the clinical assessment of the entire adult population in a Sardinian village (Talana) and analyzed 16 selected families with 62 hypertensive subjects out of 267 individuals. We carried out a double GWS using a set of 902 uniformly spaced microsatellites and a high-density SNP map on the same group of families. Results. Three loci were identified by both microsatellite and SNP scans, and the linkage results obtained showed a remarkable degree of similarity. These loci were identified on chromosome 2q24, 11q23.1–25 and 13q14.11–21.33. These findings are further supported by extensive reports in the literature associating these loci with EH or related phenotypes. Bioinformatic investigation of these loci shows several potential EH candidate genes, several of which are already associated with blood pressure regulation pathways. Conclusion. Our search for major EH genetic susceptibility factors shows that EH in the genetic isolate of Talana is due to the contribution of several genes contained in loci identified and replicated by earlier findings in different human populations.

    Mendelian randomisation identifies priority groups for prophylactic EBV vaccination

    BACKGROUND: Epstein Barr virus (EBV) infects ~95% of the population worldwide and is known to cause adverse health outcomes such as Hodgkin's and non-Hodgkin's lymphomas and multiple sclerosis. There is substantial interest and investment in developing infection-preventing vaccines for EBV. To deploy such vaccines effectively, it is vital that we understand the risk factors for infection. Why particular individuals do not become infected is currently unknown. The current literature describes complex, often conflicting webs of intersecting factors (sociodemographic, clinical, genetic, environmental), rendering causality difficult to decipher. We aimed to use Mendelian randomization (MR) to overcome the issues posed by confounding and reverse causality to determine the causal risk factors for the acquisition of EBV. METHODS: We mapped the complex evidence from the prior literature on factors associated with EBV serostatus (as a proxy for infection) into a causal diagram to determine putative risk factors for our study. Using data from the UK Biobank on 8422 individuals genomically deemed to be of white British ancestry, aged between 40 and 69 at recruitment between 2006 and 2010, we performed a genome wide association study (GWAS) of EBV serostatus, followed by a Two Sample MR to determine which putative risk factors were causal. RESULTS: Our GWAS identified two novel loci associated with EBV serostatus. In MR analyses, we confirmed shorter time in education, an increase in number of sexual partners, and a lower age of smoking commencement to be causal risk factors for EBV serostatus. CONCLUSIONS: Given the current interest and likelihood of a future EBV vaccine, these factors can inform vaccine development and deployment strategies by completing the puzzle of causality. Knowing these risk factors allows identification of those most likely to acquire EBV, giving insight into what age to vaccinate and who to prioritise when a vaccine is introduced.

    Indoor exposure to environmental tobacco smoke and dampness: respiratory symptoms in Sardinian children- DRIAS study

    Indoor exposures at home, environmental tobacco smoke (ETS) and mould/dampness adversely affect respiratory health of children. Disturbi Respiratori nell’Infanzia e Ambiente in Sardegna (DRIAS) (Respiratory Symptoms in children and the Environment in Sardegna, Italy) aims at relating the prevalence of respiratory and allergic symptoms to indoor exposures in Sardinian children. DRIAS, a cross-sectional investigation of respiratory symptoms/diseases, used a modified version of ISAAC questionnaire, included 4122 children attending 29 primary schools in the school year 2004–2005. If both parents smoke the prevalence for current wheeze and current asthma is almost doubled in comparison with never smokers, for persistent cough and phlegm a role is suggested when only mother smokes. Among mothers smoking in pregnancy, the prevalence of current wheeze and current asthma is increased. Exposure to ETS and family atopy have a joint effect resulting in an almost tripling of prevalence for current wheeze and more than four times for current asthma. Exposure to “dampness” (mould or dampness) both during the first year of life and currently is associated with increased prevalence of current wheeze, persistent cough or phlegm and current rhino-conjunctivitis; if exposure is only during the first year of life a doubling or more of prevalence is observed for current wheeze, current asthma, and persistent cough or phlegm. DRIAS results add evidence to the causal role of childhood exposure to ETS in the development of respiratory symptoms (cough, phlegm, and wheezing) and asthma. The joint effect of ETS and family atopy is corroborated. The results strengthen the evidence for a causal association between “dampness” and respiratory health, pointing to its possible independent role in causing asthma, a long-lasting exposure entails a doubled prevalence for both asthmatic and bronchitis symptoms

    [Deprivation indices in small-area studies of environment and health in Italy].

    The use of deprivation indices in small-area studies of environment and health is described, with particular reference to the Italian context. Deprivation indices can represent a proxy for individual deprivation and/or contextual deprivation. In Italy, deprivation indices have been constructed using Census variables. They are applied at census tract level in studies with a local basis; in nationally based studies, they can be used at municipality level. In the SENTIERI Project (Mortality study of residents in Italian polluted sites) an ad hoc deprivation index at municipal level was used (DI SENTIERI). Its strengths and weaknesses are discussed. In addition, suggestions about the use of socioeconomic indices in small-area studies of environment and health are given.