252 research outputs found

    InfoDiffusion: Information Entropy Aware Diffusion Process for Non-Autoregressive Text Generation

    Diffusion models have garnered considerable interest in the field of text generation. Several studies have explored text diffusion models with different structures and applied them to various tasks, including named entity recognition and summarization. However, there exists a notable disparity between the "easy-first" text generation process of current diffusion models and the "keyword-first" natural text generation process of humans, which has received limited attention. To bridge this gap, we propose InfoDiffusion, a non-autoregressive text diffusion model. Our approach introduces a "keyinfo-first" generation strategy and incorporates a noise schedule based on the amount of text information. In addition, InfoDiffusion combines self-conditioning with a newly proposed partially noising model structure. Experimental results show that InfoDiffusion outperforms the baseline model in terms of generation quality and diversity, as well as exhibiting higher sampling efficiency. Comment: EMNLP 2023 Findings

    Designing and Evaluating the MULTICOM Protein Local and Global Model Quality Prediction Methods in the CASP10 Experiment

    Background: Protein model quality assessment is an essential component of generating and using protein structural models. During the Tenth Critical Assessment of Techniques for Protein Structure Prediction (CASP10), we developed and tested four automated methods (MULTICOM-REFINE, MULTICOM-CLUSTER, MULTICOM-NOVEL, and MULTICOM-CONSTRUCT) that predicted both local and global quality of protein structural models. Results: MULTICOM-REFINE was a clustering approach that used the average pairwise structural similarity between models to measure the global quality and the average Euclidean distance between a model and several top-ranked models to measure the local quality. MULTICOM-CLUSTER and MULTICOM-NOVEL were two new support vector machine-based methods of predicting both the local and global quality of a single protein model. MULTICOM-CONSTRUCT was a new weighted pairwise model comparison (clustering) method that used the weighted average similarity between models in a pool to measure the global model quality. Our experiments showed that the pairwise model assessment methods worked better when a large portion of models in the pool were of good quality, whereas single-model quality assessment methods performed better on some hard targets when only a small portion of models in the pool were of reasonable quality. Conclusions: Since digging out a few good models from a large pool of low-quality models is a major challenge in protein structure prediction, single-model quality assessment methods appear to be poised to make important contributions to protein structure modeling. The other interesting finding was that single-model quality assessment scores could be used to weight the models in the consensus pairwise model comparison method to improve its accuracy.
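
The pairwise (consensus) scoring idea in the abstract above can be illustrated with a minimal sketch. This is not the MULTICOM code: the function names, the plain-list similarity matrix, and the exact weighting formula are assumptions made for illustration. It scores each model by its mean similarity to the other models in the pool, and optionally weights each partner model by a hypothetical single-model quality score.

```python
from typing import List, Optional


def consensus_global_quality(
    similarity: List[List[float]],
    weights: Optional[List[float]] = None,
) -> List[float]:
    """Score each model by its (weighted) mean similarity to the other models.

    similarity[i][j] is a structural similarity between models i and j
    (for example a GDT-TS-like value in [0, 1]); the matrix is assumed
    to be symmetric. weights[j] is an assumed per-model weight, such as
    a single-model quality score. With weights=None every model counts
    equally, which gives the plain average pairwise similarity.
    """
    n = len(similarity)
    if n < 2:
        raise ValueError("need at least two models in the pool")
    if weights is None:
        weights = [1.0] * n
    if len(weights) != n:
        raise ValueError("weights must have one entry per model")

    scores = []
    for i in range(n):
        # Exclude the model's similarity to itself.
        total_weight = sum(weights[j] for j in range(n) if j != i)
        if total_weight == 0:
            raise ValueError("weights of the other models sum to zero")
        weighted_sum = sum(
            weights[j] * similarity[i][j] for j in range(n) if j != i
        )
        scores.append(weighted_sum / total_weight)
    return scores


if __name__ == "__main__":
    # Toy pool of three models; model 2 is an outlier.
    sim = [
        [1.0, 0.8, 0.2],
        [0.8, 1.0, 0.4],
        [0.2, 0.4, 1.0],
    ]
    print(consensus_global_quality(sim))
    # Down-weighting the outlier raises the scores of models 0 and 1.
    print(consensus_global_quality(sim, weights=[1.0, 1.0, 0.0]))
```

In this toy pool the outlier pulls down the unweighted scores of the two mutually similar models, which is the failure mode the abstract describes when few models in the pool are good. Setting the outlier's weight to zero removes that effect.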

    SMOQ: A Tool for Predicting the Absolute Residue-Specific Quality of a Single Protein Model with Support Vector Machine

    Background: It is important to predict the quality of a protein structural model before its native structure is known. Methods that can predict the absolute local quality of individual residues in a single protein model are rare, yet particularly needed for using, ranking and refining protein models. Results: We developed a machine learning tool (SMOQ) that can predict the distance deviation of each residue in a single protein model. SMOQ uses support vector machines (SVM) with protein sequence and structural features (i.e., the basic feature set), including amino acid sequence, secondary structures, solvent accessibilities, and residue-residue contacts, to make predictions. We also trained an SVM model with two additional features (profiles and SOV scores) on 20 CASP8 targets and found that including them improves performance only when the real deviations between native and model are higher than 5Å. The released SMOQ tool uses the basic feature set trained on 85 CASP8 targets. Moreover, SMOQ implements a way to convert predicted local quality scores into a global quality score. SMOQ was tested on the 84 CASP9 single-domain targets. The average difference between the residue-specific distance deviation predicted by our method and the actual distance deviation on the test data is 2.637Å. The global quality prediction accuracy of the tool is comparable to other good tools on the same benchmark. Conclusions: SMOQ is a useful tool for protein single model quality assessment. Its source code and executable are available at: http://sysbio.rnet.missouri.edu/multicom_toolbox/

    Brain glucose metabolism is associated with hormone level in Cushing's disease: A voxel-based study using FDG-PET

    Chronic exposure to elevated levels of glucocorticoids can exert a neurotoxic effect in patients, possibly manifesting as molecular imaging alterations. The aim of this study was to investigate the potential association between brain metabolism and elevated hormone level using 18F-fluorodeoxyglucose positron emission tomography. We retrospectively enrolled 92 consecutive patients with a confirmed diagnosis of Cushing's disease. A voxel-based analysis was performed to investigate the association between cerebral 18F-fluorodeoxyglucose uptake and serum cortisol level. Relatively impaired metabolism in specific brain regions was found to correlate with serum cortisol level. Specifically, notable correlations were found in the hippocampus, amygdala, and cerebellum, regions considered to be involved in the regulation and central action of glucocorticoids. Moreover, some hormone-associated regions were found in the frontal and occipital cortex, possibly mediating the cognitive changes seen in Cushing's disease. Our findings link patterns of perturbed brain metabolism to individual hormone level, thus presenting a substrate for cognitive disturbances seen in Cushing's disease patients, as well as in other conditions with abnormal cortisol levels.

    Rethinking Similarity Search: Embracing Smarter Mechanisms over Smarter Data

    In this vision paper, we propose a shift in perspective for improving the effectiveness of similarity search. Rather than focusing solely on enhancing the data quality, particularly machine learning-generated embeddings, we advocate for a more comprehensive approach that also enhances the underpinning search mechanisms. We highlight three novel avenues that call for a redefinition of the similarity search problem: exploiting implicit data structures and distributions, engaging users in an iterative feedback loop, and moving beyond a single query vector. These novel pathways have gained relevance in emerging applications such as large-scale language models, video clip retrieval, and data labeling. We discuss the corresponding research challenges posed by these new problem areas and share insights from our preliminary discoveries.

    Clinical PET Imaging of Microglial Activation: Implications for Microglial Therapeutics in Alzheimer’s Disease

    In addition to extracellular β-amyloid plaques and intracellular neurofibrillary tangles, neuroinflammation has been identified as a key pathological characteristic of Alzheimer’s disease (AD). Once activated, neuroinflammatory cells called microglia acquire different activation phenotypes. At the early stage of AD, activated microglia are mainly dominated by the neuroprotective and anti-inflammatory M2 phenotype. Conversely, in the later stage of AD, the excessive activation of microglia is considered detrimental and pro-inflammatory, turning into the M1 phenotype. Therapeutic strategies targeting the modulation of microglia may regulate their specific phenotype. Fortunately, with the rapid development of in vivo imaging methodologies, visualization of microglial activation has been well explored. In this review, we summarize the critical role of activated microglia during the pathogenesis of AD and current studies concerning imaging of microglial activation in AD patients. We explore the possibilities for identifying activated microglial phenotypes with imaging techniques and highlight promising therapies that regulate the microglial phenotype in AD mice.

    Linkage between surface energy balance non‐closure and horizontal asymmetric turbulent transport

    A number of studies have reported that the traditional eddy covariance (EC) method generally underestimates vertical turbulent fluxes, leading to an outstanding non-closure problem of the surface energy balance (SEB). Although it is recognized that an enlarged surface energy imbalance frequently coincides with increasing wind shear, the role of large eddies in affecting the SEB remains unclear. On analyzing data collected by an EC array, considerable horizontal inhomogeneity of kinematic heat flux is observed. The results show that the combined EC method that incorporates the spatial flux contribution increases the kinematic heat flux by 21% relative to the traditional EC method, improving the SEB closure. Additionally, spectral analysis indicates that large eddies with scales ranging from 0.0005 to 0.01 (in the normalized frequency) mainly account for the horizontal inhomogeneity of kinematic heat flux. Under unstable conditions, this process operates on large eddies characterized by enlarged asymmetric turbulent flux transport. With enhanced wind shear, the increment of flux contribution associated with sweeps and ejections becomes disproportionate, contributing to the horizontal inhomogeneity of kinematic heat flux, and thus may explain the increased SEB non-closure.
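
For readers unfamiliar with the quantity discussed in the abstract above, the traditional single-point EC estimate of kinematic heat flux is the time-averaged covariance of the fluctuations of vertical wind speed and temperature. The sketch below shows only that standard definition. The function name and the sample values are hypothetical, and the spatial flux term of the paper's combined EC method is not specified in the abstract, so it is not reproduced here.

```python
from typing import Sequence


def kinematic_heat_flux(w: Sequence[float], temp: Sequence[float]) -> float:
    """Return the mean of w' * T' over one averaging period.

    w    : vertical wind speed samples in m/s
    temp : temperature samples in K, taken at the same instants
    The primes denote deviations from the period mean (Reynolds
    decomposition), so the result is in K m/s.
    """
    if len(w) != len(temp) or len(w) == 0:
        raise ValueError("w and temp must be non-empty and equally long")
    n = len(w)
    w_mean = sum(w) / n
    t_mean = sum(temp) / n
    # Population covariance of the two series.
    return sum((wi - w_mean) * (ti - t_mean) for wi, ti in zip(w, temp)) / n


if __name__ == "__main__":
    # Made-up samples: updrafts carry warm air, downdrafts carry cool air,
    # so the flux is positive (upward).
    w = [1.0, -1.0, 1.0, -1.0]
    temp = [301.0, 299.0, 301.0, 299.0]
    print(kinematic_heat_flux(w, temp))
```

A positive value corresponds to upward heat transport, as expected under unstable conditions.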

    Targeted next-generation sequencing of dedifferentiated chondrosarcoma in the skull base reveals combined TP53 and PTEN mutations with increased proliferation index, an implication for pathogenesis

    Dedifferentiated chondrosarcoma (DDCS) is a rare disease with a dismal prognosis. DDCS consists of two morphologically distinct components: the cartilaginous and noncartilaginous components. Whether the two components originate from the same progenitor cells has been controversial. Recurrent DDCS commonly displays increased proliferation compared with the primary tumor. However, there is no conclusive explanation for this mechanism. In this paper, we present two DDCSs in the sellar region. Patient 1 exclusively exhibited a noncartilaginous component with a TP53 frameshift mutation in the pathological specimens from the first surgery. The tumor recurred after radiation therapy with an exceedingly increased proliferation index. Targeted next-generation sequencing (NGS) revealed the presence of both a TP53 mutation and a PTEN deletion in the cartilaginous and the noncartilaginous components of the recurrent tumor. Fluorescence in situ hybridization and immunostaining confirmed reduced DNA copy number and protein levels of the PTEN gene as a result of the PTEN deletion. Patient 2 exhibited both cartilaginous and noncartilaginous components in the surgical specimens. Targeted NGS of cells from both components showed neither TP53 nor PTEN mutations, making Patient 2 a naïve TP53 and PTEN control for comparison. In conclusion, additional PTEN loss in the background of the TP53 mutation could be the cause of increased proliferation capacity in the recurrent tumor.