3,288 research outputs found

    DeFCoM: analysis and modeling of transcription factor binding sites using a motif-centric genomic footprinter

    Identifying the locations of transcription factor binding sites is critical for understanding how gene transcription is regulated across different cell types and conditions. Chromatin accessibility experiments such as DNaseI sequencing (DNase-seq) and Assay for Transposase Accessible Chromatin sequencing (ATAC-seq) produce genome-wide data that include distinct "footprint" patterns at binding sites. Nearly all existing computational methods to detect footprints from these data assume that footprint signals are highly homogeneous across footprint sites. Additionally, a comprehensive and systematic comparison of footprinting methods for identifying which motif sites of a given factor are bound has not been performed. Using DNase-seq data from the ENCODE project, we show that a large degree of previously uncharacterized site-to-site variability exists in footprint signal across motif sites for a transcription factor. To model this heterogeneity in the data, we introduce a novel, supervised learning footprinter called DeFCoM (Detecting Footprints Containing Motifs). We compare DeFCoM to nine existing methods using evaluation sets from four human cell lines and eighteen transcription factors and show that DeFCoM outperforms current methods in determining bound and unbound motif sites. We also analyze the impact of several biological and technical factors on the quality of footprint predictions to highlight important considerations when conducting footprint analyses and assessing the performance of footprint prediction methods. Lastly, we show that DeFCoM can detect footprints using ATAC-seq data with accuracy similar to that achieved with DNase-seq data. Python code is available at https://bitbucket.org/bryancquach/defcom. Contact: [email protected] or [email protected]. Supplementary information is available at Bioinformatics online.

    Rheumatoid Arthritis Naive T Cells Share Hypermethylation Sites With Synoviocytes.

    Objective: To determine whether differentially methylated CpGs in synovium-derived fibroblast-like synoviocytes (FLS) of patients with rheumatoid arthritis (RA) were also differentially methylated in RA peripheral blood (PB) samples. Methods: For this study, 371 genome-wide DNA methylation profiles were measured using Illumina HumanMethylation450 BeadChips in PB samples from 63 patients with RA and 31 unaffected control subjects, specifically in the cell subsets of CD14+ monocytes, CD19+ B cells, CD4+ memory T cells, and CD4+ naive T cells. Results: Of 5,532 hypermethylated FLS candidate CpGs, 1,056 were hypermethylated in CD4+ naive T cells from RA PB compared to control PB. In analyses of a second set of CpG candidates based on single-nucleotide polymorphisms from a genome-wide association study of RA, 1 significantly hypermethylated CpG in CD4+ memory T cells and 18 significant CpGs (6 hypomethylated, 12 hypermethylated) in CD4+ naive T cells were found. A prediction score based on the hypermethylated FLS candidates had an area under the curve of 0.73 for association with RA case status, which compared favorably to the association of RA with the HLA-DRB1 shared epitope risk allele and with a validated RA genetic risk score. Conclusion: FLS-representative DNA methylation signatures derived from the PB may prove to be valuable biomarkers for the risk of RA or for disease status.

    An evaluation of the status of living collections for plant, environmental, and microbial research

    Citation: McCluskey, K., Parsons, J. P., Quach, K., & Duke, C. S. (2017). An evaluation of the status of living collections for plant, environmental, and microbial research. Journal of Biosciences, 42(2), 321-331. https://doi.org/10.1007/s12038-017-9685-6 While living collections are critical for biological research, support for these foundational infrastructure elements is inconsistent, which makes quality control, regulatory compliance, and reproducibility difficult. In recent years, the Ecological Society of America has hosted several National Science Foundation–sponsored workshops to explore and enhance the sustainability of biological research infrastructure. At the same time, the United States Culture Collection Network has brought together managers of living collections to foster collaboration and information exchange within a specific living collections community. To assess the sustainability of collections, a survey was distributed to collection scientists whose responses provide a benchmark for evaluating the resiliency of these collections. Among the key observations were that plant collections have larger staffing requirements and that living microbe collections were the most vulnerable to retirements or other disruptions. Many higher plant and vertebrate collections have institutional support and several have endowments. Other collections depend on competitive grant support in an era of intense competition for these resources. Opportunities for synergy among living collections depend upon complementing the natural strong engagement with the research communities that depend on these collections with enhanced information sharing, communication, and collective action to keep them sustainable for the future. External efforts by funding agencies and publishers could reinforce the advantages of having professional management of research resources across every discipline. © 2017 Indian Academy of Science

    Using a Clinic-based Screening Tool for Primary Care Providers to Identify Commercially Sexually Exploited Children

    Introduction: Commercial Sexual Exploitation of Children (CSEC), which encompasses acts of domestic minor sex trafficking, is a hidden problem in the U.S. that affects an estimated 300,000 children. Significant health impacts to victims include violence, substance abuse, mental illness, sexually transmitted diseases, and unintended pregnancy. However, due to the covert nature of sexual exploitation, the lack of understanding among service providers and law enforcement, and complex psychological factors experienced by victims, identifying CSEC is a tremendous challenge. Primary care providers can play a critical role in identifying CSEC victims within clinical settings to help address this silent epidemic. Objective: The goal of this project was to assess the prevalence of CSEC using a clinic-based screening tool within a community health center serving indigent populations, with a large proportion of the patients being of Asian and Pacific Islander descent. Methods: Medical charts of young female patients (n=621) between 13 and 23 years of age who sought clinical services in Asian Health Services’ Teen Clinic from 2008 through 2011 were reviewed, covering the period in which primary care providers implemented a clinic-based CSEC screening tool. The CSEC screening tool consists of two questions about sexual exploitation. Results: Of the 621 patients in the study, 57.5% were Asian and Pacific Islander. Clinical providers applied the CSEC screening tool to 28.5% (n=177) of female patients in the study. Of the 177 patients who were screened, 7.3% (n=13) responded positively to questions about commercial sexual exploitation. Discussion: Using a clinic-based screening tool with patients who have identified risk factors helps primary care providers identify CSEC victims and link them to available resources. Under-reporting among victims and under-screening among providers remain major considerations in estimating CSEC prevalence. To address under-screening, it is important to raise awareness among primary care providers of the CSEC epidemic and their potential role in intervention, including screening for a history of sexual exploitation among youth patients.
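
    The percentages reported in the Results can be checked from the stated counts. The short Python sketch below is only an arithmetic check; the helper name `percent` is hypothetical, and the final figure (positives as a share of all patients) is derived here, not reported in the abstract. It is included only to illustrate why under-screening complicates any prevalence estimate.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

# Counts taken from the abstract.
total_patients = 621
screened = 177
positive = 13

screening_rate = percent(screened, total_patients)    # 28.5, matches the abstract
positive_rate = percent(positive, screened)           # 7.3, matches the abstract

# Derived here (not reported in the abstract): positives over ALL patients.
# Most patients were never screened, so this is a lower bound on the
# detected share, not a prevalence estimate.
positive_of_all = percent(positive, total_patients)   # 2.1

print(screening_rate, positive_rate, positive_of_all)
```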

    Mechanical properties in crumple-formed paper derived materials subjected to compression

    The crumpling of precursor materials to form dense three-dimensional geometries offers an attractive route towards the utilisation of low-value waste materials. Crumple-forming results in a mesostructured system in which mechanical properties of the material are governed by complex cross-scale deformation mechanisms. Here we investigate the physical and mechanical properties of dense compacted structures fabricated by the confined uniaxial compression of a cellulose tissue to yield crumpled mesostructuring. A total of 25 specimens of various densities were tested under compression. Crumple-formed specimens exhibited densities in the range 0.8–1.3 g cm−3 and showed high strength-to-weight characteristics, achieving ultimate compressive strength values of up to 200 MPa under both quasi-static and high strain rate loading conditions and deformation energy that compares well to engineering materials of similar density. The materials fabricated in this work and their mechanical attributes demonstrate the potential of crumple-forming approaches in the fabrication of novel energy-absorbing materials from low-cost precursors such as recycled paper. Stiffness and toughness of the materials exhibit density dependence, suggesting that this forming technique further allows controllable impact energy dissipation rates in dynamic applications.

    Conformal Language Modeling

    We propose a novel approach to conformal prediction for generative language models (LMs). Standard conformal prediction produces prediction sets -- in place of single predictions -- that have rigorous, statistical performance guarantees. LM responses are typically sampled from the model's predicted distribution over the large, combinatorial output space of natural language. Translating this process to conformal prediction, we calibrate a stopping rule for sampling different outputs from the LM that get added to a growing set of candidates until we are confident that the output set is sufficient. Since some samples may be low-quality, we also simultaneously calibrate and apply a rejection rule for removing candidates from the output set to reduce noise. Similar to conformal prediction, we prove that the sampled set returned by our procedure contains at least one acceptable answer with high probability, while still being empirically precise (i.e., small) on average. Furthermore, within this set of candidate responses, we show that we can also accurately identify subsets of individual components -- such as phrases or sentences -- that are each independently correct (e.g., that are not "hallucinations"), again with statistical guarantees. We demonstrate the promise of our approach on multiple tasks in open-domain question answering, text summarization, and radiology report generation using different LM variants
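
    The sample-until-confident loop with a rejection rule described in this abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function and parameter names (`sample_fn`, `quality_fn`, `set_confidence_fn`, `reject_threshold`, `stop_threshold`, `max_samples`) are hypothetical, the two thresholds are assumed to have been calibrated beforehand on held-out data, and the toy scoring functions below exist only so the example runs.

```python
from typing import Callable, List

def sample_candidate_set(
    sample_fn: Callable[[], str],
    quality_fn: Callable[[str], float],
    set_confidence_fn: Callable[[List[str]], float],
    reject_threshold: float,
    stop_threshold: float,
    max_samples: int = 20,
) -> List[str]:
    """Grow a candidate set until a (pre-calibrated) stopping rule fires."""
    candidates: List[str] = []
    for _ in range(max_samples):
        y = sample_fn()
        # Rejection rule: drop samples whose quality score is too low.
        if quality_fn(y) < reject_threshold:
            continue
        candidates.append(y)
        # Stopping rule: stop once the set-level confidence is high enough.
        if set_confidence_fn(candidates) >= stop_threshold:
            break
    return candidates

# Toy stand-ins (assumptions): a fixed stream of "LM samples" with scores.
toy_scores = {"a": 0.9, "bad": 0.1, "b": 0.8, "c": 0.7}
stream = iter(["a", "bad", "b", "c"])

def toy_set_confidence(cands: List[str]) -> float:
    # Treat scores as independent chances of being acceptable.
    miss = 1.0
    for c in cands:
        miss *= 1.0 - toy_scores[c]
    return 1.0 - miss

result = sample_candidate_set(
    sample_fn=lambda: next(stream),
    quality_fn=lambda y: toy_scores[y],
    set_confidence_fn=toy_set_confidence,
    reject_threshold=0.5,
    stop_threshold=0.95,
)
print(result)  # ['a', 'b']
```

    The statistical guarantee claimed in the abstract would come from how the thresholds are calibrated, which this sketch does not attempt to reproduce.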

    Genome-wide analysis of transposon and retroviral insertions reveals preferential integrations in regions of DNA flexibility

    DNA transposons and retroviruses are important transgenic tools for genome engineering. An important consideration affecting the choice of transgenic vector is its insertion site preference. Previous large-scale analyses of Ds transposon integration sites in plants were done on the basis of reporter gene expression or germline transmission, making it difficult to discern vertebrate integration preferences. Here, we compare over 1300 Ds transposon integration sites in zebrafish with Tol2 transposon and retroviral integration sites. Genome-wide analysis shows that Ds integration sites in the presence or absence of marker selection are remarkably similar and distributed throughout the genome. No strict motif was found, but a preference for structural features in the target DNA associated with DNA flexibility (Twist, Tilt, Rise, Roll, Shift and Slide) was observed. Remarkably, this feature is also found in transposon and retroviral integrations in maize and mouse cells. Our findings show that structural features influence integration of heterologous DNA in genomes, and have implications for targeted genome engineering.