86 research outputs found
Recommended from our members
Computational genomics and genetics of developmental disorders
Computational genomics is at the intersection of computational applied physics, math, statistics, computer science and biology. With the advances in sequencing technology, large amounts of comprehensive genomic data are generated every year. However, genomic data are messy, complex and unstructured, which makes them extremely challenging to explore, analyze and understand with traditional methods. The need to develop new quantitative methods to analyze large-scale genomics datasets is urgent. By collecting, processing and organizing clean genomics datasets and using these datasets to extract insights and relevant information, we are able to develop novel methods and strategies to address specific genetics questions using the tools of applied mathematics, statistics, and human genetics.
This thesis describes genetic and bioinformatics studies focused on utilizing and developing state-of-the-art computational methods and strategies in order to identify and interpret de novo mutations that are likely to cause developmental disorders. We performed whole exome sequencing as well as whole genome sequencing on congenital diaphragmatic hernia parent-child trios and identified a new candidate risk gene, MYRF. Additionally, we found that male and female patients carry a different burden of likely-gene-disrupting mutations, and that genes with likely-gene-disrupting mutations in isolated and complex patients show different expression levels in early development of diaphragm tissues.
To increase the power to detect risk genes and risk variants, we developed a deep neural network classifier called MVP to accurately predict the pathogenicity of missense variants. MVP uses a ResNet-based architecture and, on two independent data sets, outperformed other methods in prioritizing pathogenic variants. Additionally, we studied the genetic connection between developmental disorders and cancer. We found that in developmental disorder patients, predicted deleterious de novo mutations are more enriched in cancer driver genes than in non-cancer-driver genes. A Hidden Markov Model was implemented to discover cancer somatic missense mutation hotspots, and we demonstrated that many cancer driver genes share a similar mode of action in developmental disorders and cancer. By improving the ability to interpret missense mutations and leveraging cancer genomics data, we can improve risk gene inference in developmental disorders.
CFHMM: Heterogeneous Tumor CNV Classification by Hidden Markov
We here develop and implement a Clonal Fraction Hidden Markov Model (CFHMM), to leverage positional information in classifying Tumor CNVs and their corresponding clonal fraction from log-ratio-normalized Tumor/Normal sequencing data. In simulated data, this approach shows accurate calling of CNVs for high-fraction mutations, and improvement in calling over a naïve clustering benchmark across the board, as well as useful purity estimation for dominant clones.
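To illustrate how positional information can help when calling CNVs from log-ratio data, the following is a minimal sketch of Viterbi decoding over a three-state Gaussian HMM. It is not the CFHMM implementation: the function names, the state set (one-copy loss, neutral, one-copy gain), the noise level `sd` and the self-transition probability `stay` are all assumptions made for illustration, and `expected_log_ratio` assumes a diploid normal genome.

```python
import math

def expected_log_ratio(copy_number, fraction):
    # Assumed model: a fraction of cells carries `copy_number` copies, the rest
    # carry 2. Undefined for copy_number == 0 with fraction == 1 (log of zero).
    return math.log2((2 * (1 - fraction) + copy_number * fraction) / 2)

def gaussian_logpdf(x, mean, sd):
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)

def viterbi(obs, means, sd=0.2, stay=0.95):
    """Most likely state path for log-ratios `obs` (hypothetical parameters)."""
    n = len(means)
    switch = (1 - stay) / (n - 1)
    log_trans = [[math.log(stay if i == j else switch) for j in range(n)]
                 for i in range(n)]
    # Uniform prior over states for the first observation.
    score = [math.log(1 / n) + gaussian_logpdf(obs[0], m, sd) for m in means]
    back = []
    for x in obs[1:]:
        prev, pointers, score = score, [], []
        for j in range(n):
            best = max(range(n), key=lambda i: prev[i] + log_trans[i][j])
            pointers.append(best)
            score.append(prev[best] + log_trans[best][j]
                         + gaussian_logpdf(x, means[j], sd))
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n), key=lambda j: score[j])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

# States 0, 1, 2 correspond to one-copy loss, neutral, one-copy gain at fraction 1.0.
means = [expected_log_ratio(c, 1.0) for c in (1, 2, 3)]
segment = viterbi([0.0, 0.05, -0.02, 0.55, 0.6, 0.5, 0.58, 0.01, -0.03], means)
outlier = viterbi([0.0, 0.0, 0.55, 0.0, 0.0], means)
```

In this sketch a run of four elevated bins is called as a gain, while a single elevated bin surrounded by neutral bins stays neutral, because the transition penalty outweighs one noisy observation. A per-bin clustering approach has no such context.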
Transfer hydrogenation of aldehydes catalyzed by silyl hydrido iron complexes bearing a [PSiP] pincer ligand
The synthesis and characterization of a series of silyl hydrido iron complexes bearing a pincer-type [PSiP] ligand, (2-R2PC6H4)2SiH2 (R = Ph (1) and iPr (5)) or (2-Ph2PC6H4)2SiMeH (2), are reported. Preligand 1 reacted with Fe(PMe3)4 in toluene to afford complex ((2-Ph2PC6H4)SiH)Fe(H)(PMe3)2 (3), which was structurally characterized by X-ray diffraction. Complex ((2-iPr2PC6H4)SiH)Fe(H)(PMe3) (6) was obtained from the reaction of preligand 5 with Fe(PMe3)4 in toluene. Furthermore, complex ((2-iPr2PC6H4)Si(OMe))Fe(H)(PMe3) (7) was isolated from the reaction of complex 6 with 2 equiv. of MeOH in THF. The molecular structure of complex 7 was also determined by single-crystal X-ray analysis. Complexes 3, 4, 6 and 7 showed good to excellent catalytic activity for the transfer hydrogenation of aldehydes under mild conditions, using 2-propanol as both solvent and hydrogen donor. α,β-Unsaturated aldehydes could be selectively reduced to the corresponding α,β-unsaturated alcohols. The catalytic activity of the penta-coordinate complexes 6 and 7 is higher than that of the hexa-coordinate complexes 3 and 4.
DyTed: Disentangled Representation Learning for Discrete-time Dynamic Graph
Unsupervised representation learning for dynamic graphs has attracted a lot
of research attention in recent years. Compared with a static graph, a dynamic
graph is a comprehensive embodiment of both the intrinsic stable
characteristics of nodes and their time-related dynamic preferences. However,
existing methods generally mix these two types of information into a single
representation space, which may lead to poor explainability, reduced robustness, and
limited ability when applied to different downstream tasks. To solve the
above problems, in this paper, we propose a novel disenTangled representation
learning framework for discrete-time Dynamic graphs, namely DyTed. We specifically
design a temporal-clips contrastive learning task together with a structure
contrastive learning task to effectively identify the time-invariant and
time-varying representations, respectively. To further enhance the
disentanglement of these two types of representation, we propose a
disentanglement-aware discriminator under an adversarial learning framework
from the perspective of information theory. Extensive experiments on Tencent
and five commonly used public datasets demonstrate that DyTed, as a general
framework that can be applied to existing methods, achieves state-of-the-art
performance on various downstream tasks and is more robust against
noise.
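As a rough illustration of what a contrastive learning objective looks like, the sketch below computes a generic InfoNCE-style loss for one anchor embedding. This is an assumption-laden example, not DyTed's loss: the abstract does not specify the objective, and the function names, the cosine similarity and the `temperature` value are all hypothetical choices. One possible reading is that embeddings of the same node from two temporal clips form the positive pair and other nodes supply the negatives.

```python
import math

def cosine(u, v):
    # Cosine similarity between two non-zero vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchor, positive, negatives, temperature=0.5):
    """Generic InfoNCE loss: -log softmax score of the positive pair."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, neg) / temperature for neg in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    top = max(logits)
    log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
    return log_denominator - logits[0]

# Hypothetical 2-D embeddings: the loss is low when the positive is close to the anchor.
aligned = info_nce([1.0, 0.0], [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
misaligned = info_nce([1.0, 0.0], [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.0]])
```

Minimizing such a loss pulls the positive pair together and pushes the negatives apart. Which pairs count as positive is what would distinguish a temporal-clips task from a structure task.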
The effects of solvent extraction on nanoporosity of marine-continental coal and mudstone
Coal and organic-rich mudstone develop abundant nanopores, which control the storage of adsorbed and free gas, as well as fluid flow. Generation and retention of bitumen and hydrocarbons in oil-window reservoirs add more uncertainty to the nanoporosity. Solvent extraction is a traditional way to regain unobstructed pore networks but may cause additional effects due to interactions with rocks, such as solvent adsorbing on clay surfaces or absorbing in kerogens. Selected marine-continental coal and mudstone samples from the Eastern Ordos Basin were studied to investigate how pore structures are affected by these in-situ-sorptive compounds (namely residual bitumen and hydrocarbons) and altered by solvent extraction. Solvent extraction was performed to obtain bitumen-free subsamples. Organic petrology, bulk geochemical analyses and gas chromatography were used to characterize the samples and the extracts. Low-pressure argon and carbon dioxide adsorption was used to characterize the nanopore structures of the samples before and after extraction. The samples, both coal and mudstone, are within the oil window, with vitrinite reflectance ranging from 0.807 to 1.135%. The coals are strongly affected by marine organic input, except for sample C-4; the mudstones are sourced by either marine or terrestrial organic input, or a mixture of the two. For the coals affected by marine organic input, the occupation or blocking of pores <10 nm by residual bitumen and hydrocarbons weakens with thermal maturation. Bitumen derived from terrestrial organic matter mainly affects small pores, since coal asphaltene molecules are much smaller than petroleum asphaltene molecules. The mudstone M-2, with high extract production, showed an increase in nanopores after extraction, due to the exposure of filled or blocked pores. However, most transitional mudstones showed a decrease in pores because pore shrinkage, caused by solvents adsorbing on and swelling clay minerals (mainly kaolinite and illite/smectite mixed layers), counteracts the released pore space. Solvent extraction of the coals significantly increased the micropores <0.6 nm, since the heat of sorption of alkanes peaks in pores of 0.4–0.5 nm. By contrast, solvent extraction of the mudstones decreased the micropores ∼0.35 nm, which is perhaps caused by evaporative drying of solvent displacing residual water in clay.
Synthesis and Catalytic Activity of Iron Hydride Ligated with Bidentate N-Heterocyclic Silylenes for Hydroboration of Carbonyl Compounds
We report the synthesis of a novel bidentate N-heterocyclic silylene (NHSi) ligand, N-(LSi:)-N-methyl-2-pyridinamine (1) (L = PhC(NtBu)2), and the first bischelate disilylene iron hydride, [(Si,N)(Si,C)Fe(H)(PMe3)] (2), and monosilylene iron hydride, [(Si,C)Fe(H)(PMe3)3] (2′), through Csp2–H activation of the NHSi ligand. Compounds 1 and 2 were fully characterized by spectroscopic methods and single-crystal X-ray diffraction analysis. Density functional theory calculations indicated the multiple-bond character of the Fe–Si bonds and the π back-donation from Fe(II) to the Si(II) center. Moreover, the strong donor character of ligand 1 enables 2 to act as an efficient catalyst for the hydroboration reaction of carbonyl compounds at room temperature. Chemoselective hydroboration is attained under these conditions. This might be the first example of hydroboration of ketones and aldehydes catalyzed by a silylene hydrido iron complex. A catalytic mechanism was suggested and partially experimentally verified.
Organic matter provenance and depositional environment of marine-to-continental mudstones and coals in eastern Ordos Basin, China—Evidence from molecular geochemistry and petrology
Cyclothems, composed of interbedded mudstone, coal and sandstone layers, make up the Taiyuan and Shanxi Formations, deposited in the Late Carboniferous to Early Permian in North China under a marine-to-continental depositional environment. The cyclothems act as important fossil energy hosts, such as coalbeds, hydrocarbon source rocks and unconventional natural gas reservoirs. Organic geochemistry and petrology of mudstones and coals in the Taiyuan and Shanxi Formations in the eastern Ordos Basin were studied to reveal the organic matter sources and paleoenvironments. Total organic carbon (TOC) contents vary from 1.1 wt% (mudstone) to 72.6 wt% (coal). The samples are mainly within the oil window, with Tmax values ranging from 433 to 469 °C. Organic petrology and source biomarkers indicate that the mudstones were sourced from a mixed organic matter input, in which terrigenous organic matter predominates over aquatic organic matter. The coals are mostly sourced by terrigenous organic matter inputs. High concentrations of hopanes argue for a strong bacterial input. Some m/z 217 mass chromatograms have peaks at the hopanes' retention times as a result of high hopane to sterane ratios. These hopane-derived peaks do not interfere with the identification of the steranes because the hopanes and the steranes have different retention times. Maturity-dependent biomarkers demonstrate that the samples are thermally mature, which agrees with the Tmax values. Anomalously low C29 20S/(20S + 20R) and C29 ββ/(ββ + αα) sterane ratios are present in all the samples, and are interpreted as due to the terrigenous organic matter input or the coal-related depositional environment. In addition, biomarkers and iron sulfide morphology indicate that the organic matter of the mudstones was deposited in a proximal setting with shallow, brackish/fresh water bodies. Considering the preservation of organic matter, the redox conditions were dysoxic. Redox oscillations resulted in records of oxic conditions in some samples. Finally, the coals and the mudstones mainly generate gas and have poor oil generative potential.
GW26-e2938 Research on pharmacological mechanisms of Qishen Granule using Methodology 1H-NMR Metabolomics in mini pigs with cardiac functional insufficiency and qi-deficiency and blood stasis syndrome(QDBS) induced by Ameroid constricting ring
EM-mosaic detects mosaic point mutations that contribute to congenital heart disease.
Background: The contribution of somatic mosaicism, or genetic mutations arising after oocyte fertilization, to congenital heart disease (CHD) is not well understood. Further, the relationship between mosaicism in blood and cardiovascular tissue has not been determined. Methods: We developed a new computational method, EM-mosaic (Expectation-Maximization-based detection of mosaicism), to analyze mosaicism in exome sequences derived primarily from blood DNA of 2530 CHD proband-parent trios. To optimize this method, we measured mosaic detection power as a function of sequencing depth. In parallel, we analyzed our cohort using MosaicHunter, a Bayesian genotyping algorithm-based mosaic detection tool, and compared the two methods. The accuracy of these mosaic variant detection algorithms was assessed using an independent resequencing method. We then applied both methods to detect mosaicism in cardiac tissue-derived exome sequences of 66 participants for whom matched blood and heart tissue were available. Results: EM-mosaic detected 326 mosaic mutations in blood and/or cardiac tissue DNA. Of the 309 detected in blood DNA, 85/97 (88%) tested were independently confirmed, while 7/17 (41%) of the candidates detected in cardiac tissue were confirmed. MosaicHunter detected an additional 64 mosaics, of which 23/46 (50%) tested among 58 candidates from blood and 4/6 (67%) candidates from cardiac tissue were confirmed. Twenty-five mosaic variants altered CHD-risk genes, affecting 1% of our cohort. Of these 25, 22/22 candidates tested were confirmed. Variants predicted as damaging had higher variant allele fraction than benign variants, suggesting a role in CHD. The estimated true frequency of mosaic variants above 10% mosaicism was 0.14/person in blood and 0.21/person in cardiac tissue. Analysis of 66 individuals with matched cardiac tissue available revealed both tissue-specific and shared mosaicism, with shared mosaics generally having higher allele fraction. Conclusions: We estimate that ~1% of CHD probands have a mosaic variant detectable in blood that could contribute to cardiac malformations, particularly those damaging variants with relatively higher allele fraction. Although blood is a readily available DNA source, cardiac tissues analyzed contributed ~5% of somatic mosaic variants identified, indicating the value of tissue mosaicism analyses.
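To make the expectation-maximization idea concrete, here is a minimal sketch that separates candidate variants into a germline-heterozygous component (allele fraction fixed at 0.5) and a mosaic component with a lower, unknown allele fraction. It is not the EM-mosaic algorithm: the two-component binomial model, the function name `fit_two_component_mixture`, the starting values and the read counts are all assumptions chosen for illustration.

```python
import math

def clamp(x, lo=1e-6, hi=1 - 1e-6):
    # Keep probabilities away from 0 and 1 so that log() stays finite.
    return min(max(x, lo), hi)

def fit_two_component_mixture(alt, depth, iters=50, p_mosaic=0.2, weight=0.1):
    """EM for a binomial mixture: germline (p = 0.5) vs. mosaic (p learned)."""
    resp = []
    for _ in range(iters):
        # E-step: posterior probability that each variant is mosaic.
        # The binomial coefficient is shared by both components and cancels.
        resp = []
        for a, n in zip(alt, depth):
            log_m = (math.log(weight) + a * math.log(p_mosaic)
                     + (n - a) * math.log(1 - p_mosaic))
            log_g = math.log(1 - weight) + n * math.log(0.5)
            top = max(log_m, log_g)
            m, g = math.exp(log_m - top), math.exp(log_g - top)
            resp.append(m / (m + g))
        # M-step: update the mixing weight and the mosaic allele fraction.
        weight = clamp(sum(resp) / len(resp))
        reads = sum(r * n for r, n in zip(resp, depth))
        p_mosaic = clamp(sum(r * a for r, a in zip(resp, alt)) / max(reads, 1e-12))
    return weight, p_mosaic, resp

# Made-up counts at depth 100: eight germline-like sites, then three low-fraction sites.
alt = [48, 52, 50, 47, 55, 51, 49, 53, 12, 15, 9]
depth = [100] * len(alt)
weight, p_mosaic, resp = fit_two_component_mixture(alt, depth)
```

On these made-up counts the last three sites receive posterior mosaic probabilities close to 1 and the fitted mosaic allele fraction is about 0.12. Real exome data would also need to model sequencing error and overdispersion, which this sketch ignores.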