Finishing the euchromatic sequence of the human genome
The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.
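As a rough illustration of what the quoted accuracy implies, the following sketch multiplies out the two figures given in the abstract. It assumes, purely for illustration, that the error rate is uniform across the sequence; the function name is hypothetical.

```python
# Back-of-the-envelope arithmetic using only figures quoted in the abstract.
GENOME_LENGTH_BASES = 2_850_000_000  # 2.85 billion nucleotides (Build 35)
BASES_PER_ERROR = 100_000            # ~1 error event per 100,000 bases


def expected_error_events(length_bases: int, bases_per_error: int) -> int:
    """Expected number of error events, ASSUMING a uniform error rate."""
    return length_bases // bases_per_error


if __name__ == "__main__":
    events = expected_error_events(GENOME_LENGTH_BASES, BASES_PER_ERROR)
    print(f"Expected error events across the finished sequence: {events:,}")
```

Under that uniformity assumption, the quoted rate corresponds to a few tens of thousands of residual error events genome-wide.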
Datasets related to a study aimed to identify genetic markers of CDA by subphenotypes associated with cardiotoxicity
Who produced the data? The data has been created by the authors listed above.
Is the title specific enough? "Datasets related to a study aimed to identify genetic markers of CDA by subphenotypes associated with cardiotoxicity."
Why has the data been created? These datasets are supplementary material with which the principal and supplementary figures and tables of our indicated work were generated.
What limitations do the data have (for example, sensitive data has been deleted)? No confidential patient information is present. We have not had access to that information, in accordance with current legal regulations.
How should the data be interpreted? These data sets should not be separated from the main article in which they were utilized. Thus, to better understand their context, researchers should see them in the global scenario of our work.
Are there gaps in the data, or do they give a complete picture of the topic studied? As indicated above, data should be considered and interpreted in the global context of our study.
What processes have generated the data? The processes that generated the data are indicated in the summary of the data above and individually for each of them. Thus, each dataset is accompanied by a legend within the document.
What does the data measure in the columns of the files? As indicated, each dataset individually shows the information contained in the legend of each dataset.
What software is required to be able to read the data? The datasets are in Excel format.
How should the data be quoted? Researchers should cite the data in the context of the work they belong to once it is published and free of the embargo.
Can the data be reused? What use licenses are assigned to you? In principle, yes.
If additional clinical information is required, these data were previously published by some of us, and the references are included in our manuscript. These data are available from the principal investigators of the references listed in our work upon reasonable request.
Are there more versions of the data? Where? I do not think so beyond our files and copies.
Have the technical terms and acronyms referenced by the data been defined? A legend with the appropriate descriptions accompanies each dataset.
Have the geographic and chronological parameters of the data been qualified? The authors of the work have generated the data. Elsewhere, we indicate the authors of the work, their contributions, and affiliations.
Are keywords sufficiently data-specific? Are they based on any thesaurus? Keywords are based on our study. We include cardiotoxicity due to anthracyclines, missing heritability, subphenotype, pathophenotype, complex trait.
What is the name of the research project in which the data are framed? The main research project in which the data is prepared is:
Title: "Chemotherapy cardiotoxicity in the elderly: a translational and personnel approach."
Ref.: PIE14/00066
Who has financed data production and management? Each of the authors of the study has their own funding. The grants are included in the acknowledgments section of our manuscript.
Here we present a series of supplemental datasets that complement our study entitled "A Systems Genetics approach to identify genetic markers of cardiotoxicity due to anthracyclines in cancer patients." The datasets presented here were used to generate the main and supplementary figures and tables of the indicated study.
The study consists of the identification of genetic markers of cardiotoxicity due to anthracyclines (CDA). CDA is a disease of complex genesis, or complex trait, and therefore has a component of missing heritability, which makes it difficult to identify genetic markers associated with CDA risk. Here, we propose that molecular subphenotypes associated with CDA may offer a strategy for identifying some of this missing heritability and the risk markers associated with it. A similar strategy could be applied to identify markers of other diseases of complex genesis.
This study was done using a genetically heterogeneous cohort of mice that developed breast cancer and were treated with doxorubicin or a combined treatment of doxorubicin and docetaxel. The mouse cohort was generated by backcrossing, so each mouse is genetically unique. Post-chemotherapy heart damage was assessed by quantifying the area of cardiac fibrosis and the thickness of myocardial fibers. The genetic regions associated with CDA were assessed by massive genotyping and genetic linkage analysis. Several molecular subphenotypes were quantified in the myocardium, and their association with CDA was evaluated.
Subsequently, we identified which of them were most strongly associated statistically with CDA in multivariate models, and which quantitative trait loci (QTLs) associated with molecular subphenotypes best explained CDA. This strategy served to identify, in the mouse cohort, genes whose allelic forms could be candidate markers of CDA risk. Allelic variants of these genes were evaluated in four cohorts of cancer patients treated with anthracyclines and whose CDA was evaluated by echocardiography or cardiac magnetic resonance imaging (CMR).
JPL laboratory was partially supported by the European Regional Development Fund (ERDF) and the Ministry of Science, Innovation, and Universities (SAF2014-56989-R, SAF2017-88854R), the Carlos III Health Institute (PIE14/00066), "Proyectos Integrados IBSAL 2015" (IBY15/00003), the Regional Government of Castile and Leon (CSI234P18), and "We can be heroes" Foundation. AGN laboratory and human patients' study are supported by funds from the ISCIII project grant (PI18/01242). The Human Genotyping unit is a member of CeGen, PRB3, and is supported by grant PT17/0019, of the PE I+D+i 2013-2016, funded by ISCIII and ERDF. SCLL was the recipient of a Ramón y Cajal research contract from the Spanish Ministry of Economy and Competitiveness, and the work was supported by MINECO/FEDER research grants (RTI2018-094130-B-100). The Proteomics Unit belongs to ProteoRed, PRB3-ISCIII, supported by grant PT17/0019/0023, of the PE I+D+i 2017-2020, funded by ISCIII and FEDER. RCC is funded by fellowships from the Spanish Regional Government of Castile and León. NGS is a recipient of an FPU fellowship (MINECO/FEDER). hiPSC-CM studies were funded in part by the "la Caixa" Banking Foundation under the project code HR18-00304 and Severo Ochoa CNIC Intramural Project (Expediente 12-2016 IGP) to JJ.
Supplemental Dataset 1: CDA pathophenotypes after doxorubicin treatment. We treated 71 mice carrying breast cancer with doxorubicin. Each mouse was generated by backcrossing; thus, each one is genetically unique. Cardiotoxicity due to anthracyclines (CDA) was evaluated by automatically quantifying the heart fibrosis area and the average area of myocardial fibers as pathophenotypes of cardiotoxicity using the Ariol slide scanner. The histopathological damage was evaluated in the subendocardium and subepicardium from five randomly chosen regions of each sample (averages in μm² are shown).--
Supplemental Dataset 2: CDA pathophenotypes after the combined therapy. We treated 61 mice carrying breast cancer with the combined therapy with doxorubicin and docetaxel. Each mouse was generated by backcrossing; thus, each one is genetically unique. Cardiotoxicity due to anthracyclines (CDA) was evaluated by automatically quantifying the heart fibrosis area and the average area of myocardial fibers as pathophenotypes of cardiotoxicity using the Ariol slide scanner. The histopathological damage was evaluated in the subendocardium and subepicardium from five randomly chosen regions of each sample (averages in μm² are shown).--
Supplemental Dataset 3: CDA subphenotypes after doxorubicin therapy. Myocardium molecular subphenotypes after doxorubicin therapy. Proteins were quantified by a multiplex bead array (Luminex). TGFβ units are shown in pg/mL. The rest of the protein levels are shown in molecular fluorescence intensity (MFI) Units. The telomeric length was quantified by QPCR (RQ units). miRNAs were quantified by QPCR (RQ units). QPCR analyses were assessed by the ΔΔCT method; we show the averages of triplicates.--
Supplemental Dataset 4: CDA subphenotypes after the combined therapy. Myocardium molecular subphenotypes after the combined therapy with doxorubicin and docetaxel. Proteins were quantified by a multiplex bead array (Luminex). TGFβ units are shown in pg/mL. The rest of the protein levels are shown in molecular fluorescence intensity (MFI) Units. The telomeric length was quantified by QPCR (RQ units). miRNAs were quantified by QPCR (RQ units). QPCR analyses were assessed by the ΔΔCT method; we show the averages of triplicates.--
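The RQ units mentioned for Datasets 3 and 4 come from the ΔΔCT method. A minimal sketch of that calculation follows. It assumes 100% amplification efficiency and that technical triplicates are averaged at the Ct level before the comparison; the dataset legends do not state either point, and the function and argument names are hypothetical.

```python
from statistics import mean


def relative_quantity(ct_target_sample, ct_reference_sample,
                      ct_target_calibrator, ct_reference_calibrator):
    """Relative quantity (RQ) by the 2^(-ddCt) method.

    Each argument is a list of replicate Ct values (e.g. triplicates).
    ASSUMPTIONS: perfect doubling per cycle, and replicates averaged
    on the Ct scale before the deltas are taken.
    """
    # Normalize the target to the reference gene within each condition.
    delta_ct_sample = mean(ct_target_sample) - mean(ct_reference_sample)
    delta_ct_calibrator = mean(ct_target_calibrator) - mean(ct_reference_calibrator)
    # Compare the sample with the calibrator condition.
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


if __name__ == "__main__":
    # Illustrative Ct values only; they are not taken from the datasets.
    rq = relative_quantity(
        ct_target_sample=[24.0, 24.0, 24.0],
        ct_reference_sample=[18.0, 18.0, 18.0],
        ct_target_calibrator=[25.0, 25.0, 25.0],
        ct_reference_calibrator=[18.0, 18.0, 18.0],
    )
    print(f"RQ = {rq}")
```

A target that crosses threshold one cycle earlier than in the calibrator, after normalization, yields an RQ of 2, that is, a twofold higher relative level.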
Supplemental Dataset 5: Correlations identified between molecular subphenotype levels in the myocardium and pathophenotypes of cardiotoxicity due to anthracyclines (CDA) after doxorubicin therapy in all mice.--
Supplemental Dataset 6: Correlations identified between molecular subphenotype levels in the myocardium and pathophenotypes of cardiotoxicity due to anthracyclines (CDA) after doxorubicin therapy in young mice. Spearman correlation.--
Supplemental Dataset 7: Correlations identified between molecular subphenotype levels in the myocardium and pathophenotypes of cardiotoxicity due to anthracyclines (CDA) after doxorubicin therapy in old mice. Spearman correlation.--
Supplemental Dataset 8: Correlations identified between molecular subphenotype levels in the myocardium and pathophenotypes of cardiotoxicity due to anthracyclines (CDA) after the combined therapy in all mice. Spearman correlation.--
Supplemental Dataset 9: Correlations identified between molecular subphenotype levels in the myocardium and pathophenotypes of cardiotoxicity due to anthracyclines (CDA) after the combined therapy in young mice. Spearman correlation.--
Supplemental Dataset 10: Correlations identified between molecular subphenotype levels in the myocardium and pathophenotypes of cardiotoxicity due to anthracyclines (CDA) after the combined therapy in old mice. Spearman correlation.--
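The correlation datasets above report Spearman correlations between a molecular subphenotype and a pathophenotype. As a hedged illustration of what that statistic computes, here is a dependency-free sketch: rank both variables (ties share the mean rank), then take the Pearson correlation of the ranks. The variable names and values are hypothetical and are not taken from the datasets; the authors' actual software is not stated here.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sxx = sum((a - mean_x) ** 2 for a in rx)
    syy = sum((b - mean_y) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical per-mouse values, for illustration only.
    protein_mfi = [120.0, 340.0, 95.0, 410.0, 230.0]
    fibrosis_area_um2 = [1500.0, 2900.0, 1700.0, 3300.0, 2100.0]
    print(f"rho = {spearman_rho(protein_mfi, fibrosis_area_um2):.3f}")
```

Because only ranks enter the calculation, the statistic is insensitive to the differing units (pg/mL, MFI, RQ, μm²) used across the datasets.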
Supplemental Dataset 11: Linkage analysis of molecular subphenotype levels quantified in the myocardium. Lod scores after doxorubicin therapy in all mice. The Illumina Mouse Medium Density Linkage Panel Assay was used to genotype 130 F1BX mice at 1449 single nucleotide polymorphisms (SNPs). Genotypes were classified as FVB/FVB (F/F) or FVB/C57BL/6 (F/B). Ultimately, 806 SNPs are informative from the FVB and C57BL/6 mice; the average genomic distance between these SNPs was 9.9 Mb. The genotype proportion among the F1BX mice showed a normal distribution. Linkage analysis was carried out using interval mapping with the expectation-maximization (EM) algorithm and R/QTL software. The criteria for significant and suggestive linkages for single markers were chosen based on Lander and Kruglyak (see methods section of our manuscript).--
Supplemental Dataset 12: Linkage analysis of molecular subphenotype levels quantified in the myocardium. Lod scores after doxorubicin therapy in young mice. The Illumina Mouse Medium Density Linkage Panel Assay was used to genotype 130 F1BX mice at 1449 single nucleotide polymorphisms (SNPs). Genotypes were classified as FVB/FVB (F/F) or FVB/C57BL/6 (F/B). Ultimately, 806 SNPs are informative from the FVB and C57BL/6 mice; the average genomic distance between these SNPs was 9.9 Mb. The genotype proportion among the F1BX mice showed a normal distribution. Linkage analysis was carried out using interval mapping with the expectation-maximization (EM) algorithm and R/QTL software. The criteria for significant and suggestive linkages for single markers were chosen based on Lander and Kruglyak (see methods section of our manuscript).--
Supplemental Dataset 13: Linkage analysis of molecular subphenotype levels quantified in the myocardium. Lod scores after doxorubicin therapy in old mice. The Illumina Mouse Medium Density Linkage Panel Assay was used to genotype 130 F1BX mice at 1449 single nucleotide polymorphisms (SNPs). Genotypes were classified as FVB/FVB (F/F) or FVB/C57BL/6 (F/B). Ultimately, 806 SNPs are informative from the FVB and C57BL/6 mice; the average genomic distance between these SNPs was 9.9 Mb. The genotype proportion among the F1BX mice showed a normal distribution. Linkage analysis was carried out using interval mapping with the expectation-maximization (EM) algorithm and R/QTL software. The criteria for significant and suggestive linkages for single markers were chosen based on Lander and Kruglyak (see methods section of our manuscript).--
Supplemental Dataset 14: Linkage analysis of molecular subphenotype levels quantified in the myocardium. Lod scores after the combined therapy in all mice. The Illumina Mouse Medium Density Linkage Panel Assay was used to genotype 130 F1BX mice at 1449 single nucleotide polymorphisms (SNPs). Genotypes were classified as FVB/FVB (F/F) or FVB/C57BL/6 (F/B). Ultimately, 806 SNPs are informative from the FVB and C57BL/6 mice; the average genomic distance between these SNPs was 9.9 Mb. The genotype proportion among the F1BX mice showed a normal distribution. Linkage analysis was carried out using interval mapping with the expectation-maximization (EM) algorithm and R/QTL software. The criteria for significant and suggestive linkages for single markers were chosen based on Lander and Kruglyak (see methods section of our manuscript).--
Supplemental Dataset 15: Linkage analysis of molecular subphenotype levels quantified in the myocardium. Lod scores after the combined therapy in young mice. The Illumina Mouse Medium Density Linkage Panel Assay was used to genotype 130 F1BX mice at 1449 single nucleotide polymorphisms (SNPs). Genotypes were classified as FVB/FVB (F/F) or FVB/C57BL/6 (F/B). Ultimately, 806 SNPs are informative from the FVB and C57BL/6 mice; the average genomic distance between these SNPs was 9.9 Mb. The genotype proportion among the F1BX mice showed a normal distribution. Linkage analysis was carried out using interval mapping with the expectation-maximization (EM) algorithm and R/QTL software. The criteria for significant and suggestive linkages for single markers were chosen based on Lander and Kruglyak (see methods section of our manuscript).--
Supplemental Dataset 16: Linkage analysis of molecular subphenotype levels quantified in the myocardium. Lod scores after the combined therapy in old mice. The Illumina Mouse Medium Density Linkage Panel Assay was used to genotype 130 F1BX mice at 1449 single nucleotide polymorphisms (SNPs). Genotypes were classified as FVB/FVB (F/F) or FVB/C57BL/6 (F/B). Ultimately, 806 SNPs are informative from the FVB and C57BL/6 mice; the average genomic distance between these SNPs was 9.9 Mb. The genotype proportion among the F1BX mice showed a normal distribution. Linkage analysis was carried out using interval mapping with the expectation-maximization (EM) algorithm and R/QTL software. The criteria for significant and suggestive linkages for single markers were chosen based on Lander and Kruglyak (see methods section of our manuscript).--
Supplemental Dataset 17: Massive genotyping of mouse cohort treated with doxorubicin. The genome-wide scan was carried out at the Spanish National Centre of Genotyping (CeGEN) at the Spanish National Cancer Research Centre (CNIO, Madrid, Spain). The Illumina Mouse Medium Density Linkage Panel Assay was used to genotype 130 F1BX mice at 1449 single nucleotide polymorphisms (SNPs). Genotypes were classified as FVB/FVB (F/F) or FVB/C57BL/6 (F/B). Ultimately, 806 SNPs are informative from the FVB and C57BL/6 mice; the average genomic distance between these SNPs was 9.9 Mb. The genotype proportion among the F1BX mice showed a normal distribution.--
Supplemental Dataset 18: Massive genotyping of mouse cohort treated with the combined therapy. The genome-wide scan was carried out at the Spanish National Centre of Genotyping (CeGEN) at the Spanish National Cancer Research Centre (CNIO, Madrid, Spain). The Illumina Mouse Medium Density Linkage Panel Assay was used to genotype 130 F1BX mice at 1449 single nucleotide polymorphisms (SNPs). Genotypes were classified as FVB/FVB (F/F) or FVB/C57BL/6 (F/B). Ultimately, 806 SNPs are informative from the FVB and C57BL/6 mice; the average genomic distance between these SNPs was 9.9 Mb. The genotype proportion among the F1BX mice showed a normal distribution.--
Supplemental Dataset 19: Linkage analysis of CDA pathophenotypes quantified in the myocardium. Lod scores after doxorubicin therapy in all mice. The Illumina Mouse Medium Density Linkage Panel Assay was used to genotype 130 F1BX mice at 1449 single nucleotide polymorphisms (SNPs). Genotypes were classified as FVB/FVB (F/F) or FVB/C57BL/6 (F/B). Ultimately, 806 SNPs are informative from the FVB and C57BL/6 mice; the average genomic distance between these SNPs was 9.9 Mb. The genotype proportion among the F1BX mice showed a normal distribution. Linkage analysis was carried out using interval mapping with the expectation-maximization (EM) algorithm and R/QTL software. The criteria for significant and suggestive linkages for single markers were chosen based on Lander and Kruglyak (see methods section of our manuscript).--
Supplemental Dataset 20: Linkage analysis of CDA pathophenotypes quantified in the myocardium. Lod scores after doxorubicin therapy in young mice. The Illumina Mouse Medium Density Linkage Panel Assay was used to genotype 130 F1BX mice at 1449 single nucleotide polymorphisms (SNPs). Genotypes were classified as FVB/FVB (F/F) or FVB/C57BL/6 (F/B). Ultimately, 806 SNPs are informative from the FVB and C57BL/6 mice; the average genomic distance between these SNPs was 9.9 Mb. The genotype proportion among the F1BX mice showed a normal distribution. Linkage analysis was carried out using interval mapping with the expectation-maximization (EM) algorithm and R/QTL software. The criteria for significant and suggestive linkages for single markers were chosen based on Lander and Kruglyak (see methods section of our manuscript).--
Supplemental Dataset 21: Linkage analysis of CDA pathophenotypes quantified in the myocardium. Lod scores after doxorubicin therapy in old mice. The Illumina Mouse Medium Density Linkage Panel Assay was used to genotype 130 F1BX mice at 1449 single nucleotide polymorphisms (SNPs). Genotypes were classified as FVB/FVB (F/F) or FVB/C57BL/6 (F/B). Ultimately, 806 SNPs are informative from the FVB and C57BL/6 mice; the average genomic distance between these SNPs was 9.9 Mb. The genotype proportion among the F1BX mice showed a normal distribution. Linkage analysis was carried out using interval mapping with the expectation-maximization (EM) algorithm and R/QTL software. The criteria for significant and suggestive linkages for single markers were chosen based on Lander and Kruglyak (see methods section of our manuscript).--
Supplemental Dataset 22: Linkage analysis of CDA pathophenotypes quantified in the myocardium. Lod scores after the combined therapy in all mice. The Illumina Mouse Medium Density Linkage Panel Assay was used to genotype 130 F1BX mice at 1449 single nucleotide polymorphisms (SNPs). Genotypes were classified as FVB/FVB (F/F) or FVB/C57BL/6 (F/B). Ultimately, 806 SNPs are informative from the FVB and C57BL/6 mice; the average genomic distance between these SNPs was 9.9 Mb. The genotype proportion among the F1BX mice showed a normal distribution. Linkage analysis was carried out using interval mapping with the expectation-maximization (EM) algorithm and R/QTL software. The criteria for significant and suggestive linkages for single markers were chosen based on Lander and Kruglyak (see methods section of our manuscript).--
Supplemental Dataset 23: Linkage analysis of CDA pathophenotypes quantified in the myocardium. Lod scores after the combined therapy in young mice. The Illumina Mouse Medium Density Linkage Panel Assay was used to genotype 130 F1BX mice at 1449 single nucleotide polymorphisms (SNPs). Genotypes were classified as FVB/FVB (F/F) or FVB/C57BL/6 (F/B). Ultimately, 806 SNPs are informative from the FVB and C57BL/6 mice; the average genomic distance between these SNPs was 9.9 Mb. The genotype proportion among the F1BX mice showed a normal distribution. Linkage analysis was carried out using interval mapping with the expectation-maximization (EM) algorithm and R/QTL software. The criteria for significant and suggestive linkages for single markers were chosen based on Lander and Kruglyak (see methods section of our manuscript).--
Supplemental Dataset 24: Linkage analysis of CDA pathophenotypes quantified in the myocardium. Lod scores after the combined therapy in old mice. The Illumina Mouse Medium Density Linkage Panel Assay was used to genotype 130 F1BX mice at 1449 single nucleotide polymorphisms (SNPs). Genotypes were classified as FVB/FVB (F/F) or FVB/C57BL/6 (F/B). Ultimately, 806 SNPs are informative from the FVB and C57BL/6 mice; the average genomic distance between these SNPs was 9.9 Mb. The genotype proportion among the F1BX mice showed a normal distribution. Linkage analysis was carried out using interval mapping with the expectation-maximization (EM) algorithm and R/QTL software. The criteria for significant and suggestive linkages for single markers were chosen based on Lander and Kruglyak (see methods section of our manuscript).--
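To give a feel for the LOD scores tabulated in the linkage datasets above, the sketch below computes a single-marker LOD score for a backcross, where each mouse is either F/F or F/B at a marker. This is plain marker regression, a simplification: it is not the interval mapping with the EM algorithm that the descriptions attribute to R/QTL, and it ignores missing genotypes and positions between markers. Function names and example values are hypothetical.

```python
from math import log10


def marker_lod(phenotypes, genotypes):
    """Single-marker LOD score by marker regression (NOT EM interval mapping).

    LOD = (n / 2) * log10(RSS_null / RSS_alt), where RSS_null is the residual
    sum of squares around the overall mean and RSS_alt is the residual sum of
    squares around the per-genotype means.
    """
    n = len(phenotypes)
    if n == 0 or n != len(genotypes):
        raise ValueError("phenotypes and genotypes must be non-empty and equal in length")

    grand_mean = sum(phenotypes) / n
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)

    # Group phenotype values by genotype class (e.g. "F/F" vs "F/B").
    groups = {}
    for y, g in zip(phenotypes, genotypes):
        groups.setdefault(g, []).append(y)

    rss_alt = 0.0
    for values in groups.values():
        group_mean = sum(values) / len(values)
        rss_alt += sum((y - group_mean) ** 2 for y in values)

    if rss_alt == 0:
        raise ValueError("genotype explains the phenotype perfectly; LOD is unbounded")
    return (n / 2) * log10(rss_null / rss_alt)


if __name__ == "__main__":
    # Hypothetical phenotype values and genotype calls, for illustration only.
    fibrosis = [1.0, 2.0, 3.0, 4.0]
    calls = ["F/F", "F/F", "F/B", "F/B"]
    print(f"LOD = {marker_lod(fibrosis, calls):.3f}")
```

A marker whose genotype classes do not differ in mean phenotype gives a LOD near zero, while larger values indicate stronger evidence of linkage; the significance thresholds themselves follow the criteria cited in the dataset descriptions.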
Supplemental Dataset 25: Human breast cancer cohort-1 genotyping. The association of genetic variants with CDA was evaluated in four patient cohorts p
NEOTROPICAL CARNIVORES: a data set on carnivore distribution in the Neotropics
Mammalian carnivores are considered a key group in maintaining ecological health and can indicate potential ecological integrity in landscapes where they occur. Carnivores also hold high conservation value and their habitat requirements can guide management and conservation plans. The order Carnivora has 84 species from 8 families in the Neotropical region: Canidae; Felidae; Mephitidae; Mustelidae; Otariidae; Phocidae; Procyonidae; and Ursidae. Herein, we include published and unpublished data on native terrestrial Neotropical carnivores (Canidae; Felidae; Mephitidae; Mustelidae; Procyonidae; and Ursidae). NEOTROPICAL CARNIVORES is a publicly available data set that includes 99,605 data entries from 35,511 unique georeferenced coordinates. Detection/non-detection and quantitative data were obtained from 1818 to 2018 by researchers, governmental agencies, non-governmental organizations, and private consultants. Data were collected using several methods including camera trapping, museum collections, roadkill, line transect, and opportunistic records. Literature (peer-reviewed and grey) in Portuguese, Spanish, and English was incorporated in this compilation. Most of the data set consists of detection data entries (n = 79,343; 79.7%) but also includes non-detection data (n = 20,262; 20.3%). Of those, 43.3% also include count data (n = 43,151). The information available in NEOTROPICAL CARNIVORES will contribute to macroecological, ecological, and conservation questions in multiple spatio-temporal perspectives. As carnivores play key roles in trophic interactions, a better understanding of their distribution and habitat requirements is essential to establish conservation management plans and safeguard the future ecological health of Neotropical ecosystems. Our data paper, combined with other large-scale data sets, has great potential to clarify species distribution and related ecological processes within the Neotropics.
There are no copyright restrictions and no restrictions on using data from this data paper, as long as the data paper is cited as the source of the information used. We also request that users inform us of how they intend to use the data.
NEOTROPICAL XENARTHRANS: a data set of occurrence of xenarthran species in the Neotropics
Xenarthrans—anteaters, sloths, and armadillos—have essential functions for ecosystem maintenance, such as insect control and nutrient cycling, playing key roles as ecosystem engineers. Because of habitat loss and fragmentation, hunting pressure, and conflicts with domestic dogs, these species have been threatened locally, regionally, or even across their full distribution ranges. The Neotropics harbor 21 species of armadillos, 10 anteaters, and 6 sloths. Our data set includes the families Chlamyphoridae (13), Dasypodidae (7), Myrmecophagidae (3), Bradypodidae (4), and Megalonychidae (2). We have no occurrence data on Dasypus pilosus (Dasypodidae). Regarding Cyclopedidae, until recently, only one species was recognized, but new genetic studies have revealed that the group is represented by seven species. In this data paper, we compiled a total of 42,528 records of 31 species, represented by occurrence and quantitative data, totaling 24,847 unique georeferenced records. The geographic range is from the southern United States, Mexico, and Caribbean countries at the northern portion of the Neotropics, to the austral distribution in Argentina, Paraguay, Chile, and Uruguay. Regarding anteaters, Myrmecophaga tridactyla has the most records (n = 5,941), and Cyclopes sp. have the fewest (n = 240). The armadillo species with the most data is Dasypus novemcinctus (n = 11,588), and the fewest data are recorded for Calyptophractus retusus (n = 33). With regard to sloth species, Bradypus variegatus has the most records (n = 962), and Bradypus pygmaeus has the fewest (n = 12). Our main objective with Neotropical Xenarthrans is to make occurrence and quantitative data available to facilitate more ecological research, particularly if we integrate the xenarthran data with other data sets of Neotropical Series that will become available very soon (i.e., Neotropical Carnivores, Neotropical Invasive Mammals, and Neotropical Hunters and Dogs). 
Therefore, studies on trophic cascades, hunting pressure, habitat loss, fragmentation effects, species invasion, and climate change effects will be possible with the Neotropical Xenarthrans data set. Please cite this data paper when using its data in publications. We also request that researchers and teachers inform us of how they are using these data.