Interpretable correlation descriptors for quantitative structure-activity relationships
Background: The topological maximum cross correlation (TMACC) descriptors are alignment-independent 2D descriptors for the derivation of QSARs. TMACC descriptors are generated using atomic properties determined by molecular topology. Previous validation (J Chem Inf Model 2007, 47: 626-634) of the TMACC descriptor suggests it is competitive with the current state of the art. Results: Here, we illustrate the interpretability of the TMACC descriptors through the analysis of the QSARs of inhibitors of angiotensin converting enzyme (ACE) and dihydrofolate reductase (DHFR). In the case of the ACE inhibitors, the TMACC interpretation shows features specific to C-domain inhibition, which have not been explicitly identified in previous QSAR studies. Conclusions: The TMACC interpretation can provide new insight into the structure-activity relationships studied. Freely available, open source software for generating the TMACC descriptors can be downloaded from http://comp.chem.nottingham.ac.uk.
A genome-wide study of Hardy–Weinberg equilibrium with next generation sequence data
Statistical tests for Hardy–Weinberg equilibrium have been an important tool for detecting genotyping errors in the past, and remain important in the quality control of next generation sequence data. In this paper, we analyze complete chromosomes of the 1000 genomes project by using exact test procedures for autosomal and X-chromosomal variants. We find that the rate of disequilibrium largely exceeds what might be expected by chance alone for all chromosomes. Observed disequilibrium is, in about 60% of the cases, due to heterozygote excess. We suggest that most excess disequilibrium can be explained by sequencing problems, and hypothesize mechanisms that can explain exceptional heterozygosities. We report higher rates of disequilibrium for the MHC region on chromosome 6, regions flanking centromeres and p-arms of acrocentric chromosomes. We also detected long-range haplotypes and areas with incidental high disequilibrium. We report disequilibrium to be related to read depth, with variants having extreme read depths being more likely to be out of equilibrium. Disequilibrium rates were found to be 11 times higher in segmental duplications and simple tandem repeat regions. The variants with significant disequilibrium are seen to be concentrated in these areas. For next generation sequence data, Hardy–Weinberg disequilibrium seems to be a major indicator for copy number variation. Peer reviewed. Postprint (published version).
MegaMorph - multiwavelength measurement of galaxy structure: physically meaningful bulge-disc decomposition of galaxies near and far
Bulge–disc decomposition is a valuable tool for understanding galaxies. However, achieving robust measurements of component properties is difficult, even with high-quality imaging, and it becomes even more so with the imaging typical of large surveys. In this paper, we consider the advantages of a new, multiband approach to galaxy fitting. We perform automated bulge–disc decompositions for 163 nearby galaxies, by simultaneously fitting multiple images taken in five photometric filters. We show that we are able to recover structural measurements that agree well with various other works, and confirm a number of key results. We additionally use our results to illustrate the link between total Sersic index and bulge–disc structure, and demonstrate that the visual classification of lenticular galaxies is strongly dependent on the inclination of their disc component. By simulating the same set of galaxies as they would appear if observed at a range of redshifts, we are able to study the behaviour of bulge–disc decompositions as data quality diminishes. We examine how our multiband fits perform, and compare to the results of more conventional, single-band methods. Multiband fitting improves the measurement of all parameters, but particularly the bulge-to-total flux ratio and component colours. We therefore encourage the use of this approach with future surveys.