19 research outputs found
Visualizing research impact through citation data
Research impact plays a critical role in evaluating the research quality and influence of a scholar, a journal, or a conference. Many researchers have attempted to quantify research impact by introducing different types of metrics based on citation data, such as h-index, citation count, and impact factor. These metrics are widely used in the academic community. However, quantitative metrics are highly aggregated in most cases and sometimes biased, which can result in the loss of impact details that are important for comprehensively understanding research impact. For example, in which research areas does a researcher have great impact? How does the research impact change over time? How do collaborators affect the research impact of an individual? Simple quantitative metrics can hardly help answer such questions, since more detailed exploration of the citation data is needed. Previous work on visualizing citation data usually shows only limited aspects of research impact and may suffer from other problems, including visual clutter and scalability issues. To fill this gap, we propose an interactive visualization tool, ImpactVis, for better exploration of research impact through citation data. Case studies and in-depth expert interviews are conducted to demonstrate the effectiveness of ImpactVis.
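As a concrete illustration of how aggregated such metrics are, the following minimal sketch computes the h-index for two citation records. The function name `h_index` and both citation lists are hypothetical examples, not data from the paper; the definition assumed is the standard one (the largest h such that h papers have at least h citations each).

```python
def h_index(citations):
    """Return the largest h such that at least h papers have >= h citations each."""
    ranked = sorted(citations, reverse=True)  # most-cited paper first
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Two hypothetical citation records (illustrative numbers only).
record_a = [25, 8, 5, 3, 3, 0]
record_b = [3, 3, 3]

for name, record in (("A", record_a), ("B", record_b)):
    print(name, "h-index:", h_index(record), "total citations:", sum(record))
```

Both records yield the same h-index even though their citation totals and distributions differ, which is the kind of detail a single aggregated number hides.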
Mixed Distillation Helps Smaller Language Model Better Reasoning
While large language models (LLMs) have demonstrated exceptional performance
in recent natural language processing (NLP) tasks, their deployment poses
substantial challenges due to high computational and memory demands in
real-world applications. Recent studies have focused on enhancing smaller
models through knowledge distillation from LLMs, yielding promising results.
However, these models often struggle to match the performance of LLMs,
especially in tasks that require reasoning. In this work, we introduce the Mixed
Distillation (MD) framework, which capitalizes on the strengths of Program of
Thought (PoT) and Chain of Thought (CoT) capabilities within LLMs, combining
multiple prompting techniques and distilling these capabilities into smaller
models. Our experimental results show that MD significantly enhances the
single-path and multi-path reasoning ability of smaller models in various
tasks. In terms of accuracy and generality on reasoning tasks, the model
produced by MD exceeds the overall performance of the two individually
distilled models. Notably, LLaMA2-7B and CodeLlama-7B using MD achieved
remarkable improvements of (84.5%) and (85.5%), respectively, outperforming
GPT-3.5-Turbo by (2.5%) and (3.5%), on the SVAMP benchmark.
Comment: Work in progress, 17 pages, 16 figures
Suppression of ILC2 differentiation from committed T cell precursors by E protein transcription factors
Current models propose that group 2 innate lymphoid cells (ILC2s) are generated in the bone marrow. Here, we demonstrate that subsets of these cells can differentiate from multipotent progenitors and committed T cell precursors in the thymus, both in vivo and in vitro. These thymic ILC2s exit the thymus, circulate in the blood, and home to peripheral tissues. Ablation of E protein transcription factors greatly promotes the ILC fate while impairing B and T cell development. Consistently, a transcriptional network centered on the ZBTB16 transcription factor and IL-4 signaling pathway is highly up-regulated due to E protein deficiency. Our results show that ILC2s can still arise from what are normally considered to be committed T cell precursors, and that this alternative cell fate is restrained by high levels of E protein activity in these cells. Thymus-derived lung ILC2s of E protein-deficient mice show different transcriptomes, proliferative properties, and cytokine responses from wild-type counterparts, suggesting potentially distinct functions.
Human DNA Exonuclease TREX1 Is Also an Exoribonuclease That Acts on Single-Stranded RNA
3' repair exonuclease 1 (TREX1) is a known DNA exonuclease involved in autoimmune disorders and the antiviral response. In this work, we show that TREX1 is also an RNA exonuclease. Purified TREX1 displays robust exoribonuclease activity that degrades single-stranded, but not double-stranded, RNA. TREX1-D200N, an Aicardi-Goutières syndrome disease-causing mutant, is defective in degrading RNA. TREX1 activity is strongly inhibited by a stretch of pyrimidine residues, as is that of a bacterial homolog, RNase T. Kinetic measurements indicate that the apparent Km of TREX1 for RNA is higher than that for DNA. Like RNase T, human TREX1 is active in degrading native tRNA substrates. Previously reported TREX1 crystal structures have revealed that the substrate binding sites are open enough to accommodate the extra hydroxyl group in RNA, further supporting our conclusion that TREX1 acts on RNA. These findings indicate that its RNase activity needs to be taken into account when evaluating the physiological role of TREX1.
Generalized Born Models in Calculations of Host-Guest Binding Affinity
Binding free energy determines binding affinity and is therefore central to the design of ligands that bind specific proteins with high affinity. While ever-growing experimental technologies can yield reliable binding affinities, experiments are usually costly. Free energy methods based on computer modeling are powerful tools to predict binding affinity and guide experimental efforts. However, computational methods are still not accurate enough to be relied on by themselves, and can sometimes be computationally demanding. In this thesis, I focused on how to calculate binding affinity with a fast and accurate approach. I started by evaluating the performance of fast Generalized Born (GB) implicit solvent models in binding free energy calculations with host-guest systems that capture most of the intramolecular and intermolecular interactions of protein-ligand binding complexes. The results from these calculations then motivated a second study exploring the possibility of improving the accuracy of such calculations through small adjustments in GB parameters. As a result, this work offers guidance for fast and accurate free energy calculations and the potential optimization of GB models.
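The link between binding free energy and binding affinity mentioned above can be sketched with the standard thermodynamic relation ΔG° = RT ln(Kd / c°). The snippet below is a minimal illustration, not code from the thesis: the function names are hypothetical, a 1 M standard concentration is assumed, and the example Kd value is made up.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy in kcal/mol from a dissociation constant in M.

    Assumes a 1 M standard concentration, so Kd / c0 is numerically Kd.
    """
    return R_KCAL * temperature_k * math.log(kd_molar)


def dissociation_constant(dg_kcal, temperature_k=298.15):
    """Inverse relation: dissociation constant in M from a free energy in kcal/mol."""
    return math.exp(dg_kcal / (R_KCAL * temperature_k))


# Hypothetical micromolar binder.
dg = binding_free_energy(1e-6)
print(f"Kd = 1e-6 M  ->  dG = {dg:.2f} kcal/mol")
print(f"dG = {dg:.2f} kcal/mol  ->  Kd = {dissociation_constant(dg):.1e} M")
```

At room temperature, a tenfold change in Kd corresponds to roughly 1.4 kcal/mol, so even modest errors in a computed free energy translate into large errors in predicted affinity.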
A fast, convenient, polarizable electrostatic model for molecular dynamics
We present an efficient polarizable electrostatic model, utilizing typed, atom-centered polarizabilities and the fast direct approximation, designed for efficient use in molecular dynamics (MD) simulations. The model provides two convenient approaches to assigning partial charges in the context of the atomic polarizabilities. One is a generalization of RESP, called RESP-dPol, and the other, AM1-BCC-dPol, is an adaptation of the widely used AM1-BCC method. Both are designed to accurately replicate gas-phase QM electrostatic potentials. Benchmarks of this polarizable electrostatic model against gas-phase dipole moments, molecular polarizabilities, bulk liquid densities, and static dielectric constants of organic liquids show good agreement with the reference values. Of note, the model yields markedly more accurate dielectric constants of organic liquids, relative to a matched non-polarizable force field. MD simulations with this method, which is currently parameterized for molecules containing elements C, N, O, and H, run only about 3.6-fold slower than fixed charge force fields, while simulations with the self-consistent mutual polarization average 4.5-fold slower. Our results suggest that RESP-dPol and AM1-BCC-dPol afford improved accuracy, relative to fixed charge force fields, and are good starting points for developing general, affordable, and transferable polarizable force fields. The software implementing these approaches has been designed to utilize the force field fitting frameworks developed and maintained by the Open Force Field Initiative, setting the stage for further exploration of this approach to polarizable force field development.
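To make the "direct approximation" concrete, here is a minimal sketch of direct polarization as it is commonly defined: each induced dipole responds only to the field of the fixed partial charges, μ_i = α_i E_i, so no self-consistent iteration over mutually interacting dipoles is needed. This is an illustration under stated assumptions, not the authors' implementation: the function name is hypothetical, polarizabilities are taken as isotropic, atomic units are used, and intramolecular exclusions and damping are ignored.

```python
import math


def direct_polarization(positions, charges, alphas):
    """Induced dipoles and polarization energy in the direct approximation.

    Each site i feels only the field E_i of the permanent charges on the
    other sites (atomic units, no exclusions or damping; a simplification).
    Returns (dipoles, energy) with mu_i = alpha_i * E_i and
    energy = -1/2 * sum_i mu_i . E_i.
    """
    dipoles = []
    energy = 0.0
    for i, (ri, alpha) in enumerate(zip(positions, alphas)):
        field = [0.0, 0.0, 0.0]
        for j, (rj, qj) in enumerate(zip(positions, charges)):
            if i == j:
                continue
            d = [a - b for a, b in zip(ri, rj)]  # vector from site j to site i
            dist = math.sqrt(sum(c * c for c in d))
            for k in range(3):
                field[k] += qj * d[k] / dist**3  # Coulomb field of charge j
        mu = [alpha * f for f in field]
        dipoles.append(mu)
        energy -= 0.5 * sum(m * f for m, f in zip(mu, field))
    return dipoles, energy


# Hypothetical two-site system: a unit charge and a neutral polarizable site.
dipoles, energy = direct_polarization(
    positions=[(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    charges=[1.0, 0.0],
    alphas=[0.0, 1.0],
)
print(dipoles, energy)
```

Because the dipoles are obtained in a single pass, the cost per step stays close to that of a fixed-charge model, which is plausibly why the abstract reports a smaller slowdown than for self-consistent mutual polarization.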
A Fast, Convenient, Polarizable Electrostatic Model for Molecular Dynamics
We present an efficient polarizable electrostatic model, utilizing typed, atom-centered polarizabilities and the fast direct approximation, designed for efficient use in molecular dynamics (MD) simulations. The model provides two convenient approaches for assigning partial charges in the context of atomic polarizabilities. One is a generalization of RESP, called RESP-dPol, and the other, AM1-BCC-dPol, is an adaptation of the widely used AM1-BCC method. Both are designed to accurately replicate gas-phase quantum mechanical electrostatic potentials. Benchmarks of this polarizable electrostatic model against gas-phase dipole moments, molecular polarizabilities, bulk liquid densities, and static dielectric constants of organic liquids show good agreement with the reference values. Of note, the model yields markedly more accurate dielectric constants of organic liquids, relative to a matched nonpolarizable force field. MD simulations with this method, which is currently parametrized for molecules containing elements C, N, O, and H, run only about 3.6-fold slower than fixed charge force fields, while simulations with the self-consistent mutual polarization average 4.5-fold slower. Our results suggest that RESP-dPol and AM1-BCC-dPol afford improved accuracy relative to fixed charge force fields and are good starting points for developing general, affordable, and transferable polarizable force fields. The software implementing these approaches has been designed to utilize the force field fitting frameworks developed and maintained by the Open Force Field Initiative, setting the stage for further exploration of this approach to polarizable force field development.