Data-Efficiency with a Single GPU: An Exploration of Transfer Methods for Small Language Models
Multi-task learning (MTL), instruction tuning, and prompting have recently
been shown to improve the generalizability of large language models to new
tasks. However, the benefits of such methods are less well-documented in
smaller language models, with some studies finding contradictory results. In
this work, we explore and isolate the effects of (i) model size, (ii) general-purpose MTL, (iii) in-domain MTL, (iv) instruction tuning, and (v) few-shot
fine-tuning for models with fewer than 500 million parameters. Our experiments
in the zero-shot setting demonstrate that models gain a 31% relative improvement, on average, from general-purpose MTL, with an additional 37.6% relative gain from in-domain MTL. Contrary to prior work on large models, we find that instruction tuning provides only a modest 2% performance improvement for small models.
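A minimal arithmetic sketch of how the reported gains would compose if the two relative improvements stack multiplicatively; whether the 37.6% figure is measured against the general-purpose MTL model or the original baseline is an assumption here, not something the abstract states.

```python
# Hedged sketch: composing the reported relative gains, assuming they stack
# multiplicatively (an assumption; the abstract reports them separately).
baseline = 1.00
after_general_mtl = baseline * 1.31              # +31% relative improvement
after_in_domain_mtl = after_general_mtl * 1.376  # additional +37.6% relative gain

print(f"combined gain over baseline: {after_in_domain_mtl - 1:.1%}")  # roughly +80%
```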
STOP: A dataset for Spoken Task Oriented Semantic Parsing
End-to-end spoken language understanding (SLU) predicts intent directly from
audio using a single model. It promises to improve the performance of assistant
systems by leveraging acoustic information lost in the intermediate textual
representation and preventing cascading errors from Automatic Speech
Recognition (ASR). Further, having one unified model has efficiency advantages
when deploying assistant systems on-device. However, the limited number of
public audio datasets with semantic parse labels hinders the research progress
in this area. In this paper, we release the Spoken Task-Oriented Semantic Parsing (STOP) dataset, the largest and most complex publicly available SLU dataset. Additionally, we define low-resource splits to establish a benchmark
for improving SLU when limited labeled data is available. In addition to the human-recorded audio, we are releasing a TTS-generated version
to benchmark the performance for low-resource domain adaptation of end-to-end
SLU systems. Initial experiments show end-to-end SLU models performing slightly worse than their cascaded counterparts, which we hope encourages future work in this direction.
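For readers unfamiliar with the distinction drawn above, a minimal sketch contrasting the cascaded and end-to-end SLU architectures; `asr_model`, `text_parser`, and `e2e_slu_model` are hypothetical placeholders, not the STOP baseline systems.

```python
# Hypothetical sketch of the two SLU architectures; `asr_model`, `text_parser`,
# and `e2e_slu_model` are placeholders, not the STOP baselines.

def cascaded_slu(audio, asr_model, text_parser):
    # Cascaded pipeline: ASR errors propagate to the parser, and acoustic
    # information is lost in the intermediate transcript.
    transcript = asr_model.transcribe(audio)
    return text_parser.parse(transcript)

def end_to_end_slu(audio, e2e_slu_model):
    # A single model maps audio directly to a semantic parse (intent + slots).
    return e2e_slu_model.parse(audio)
```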
Development and Application of a Reduced Order Mathematical Framework to Unravel the Complexity of Trauma-Induced Coagulopathy
Trauma is the leading cause of death and disability in the United States for both children and adults. In response to trauma, the body unleashes a set of coupled programs that affect the functioning of the vascular, immune and autonomic nervous systems. In pathological cases, the integrated output of these programs can result in coagulopathy, systemic inflammatory response syndrome (SIRS), multiple organ dysfunction syndrome (MODS) and potentially even death. Nearly 35%-40% of trauma deaths occur due to uncontrolled hemorrhage resulting from trauma-induced coagulopathy (TIC). TIC also plays an important role in modulating inflammation, organ dysfunction and increased susceptibility to sepsis. Clinical trials for treatment strategies targeting TIC have met with limited success. The interlinked nature of coagulant and inflammatory responses, along with patient-specific physiological variability, makes the treatment of TIC challenging. Understanding TIC requires an integrated multi-scale modeling framework which describes the relevant biochemical networks within the context of the whole body. Given their complexity and size, embedding large, non-linear models of biochemical networks into a whole-body model creates a significant computational challenge.
Thus, an objective of this work is to develop a framework that reduces the complexity of high-dimensional mathematical models. We apply this framework to model biochemical networks that are important in TIC. We first investigate the dynamics of coagulation and understand the impact of the protein C pathway on thrombin generation. Thereafter, we use this reduced order modeling technique to model complement and fibrinolysis. We identify targets of therapeutic importance in complement and mechanisms that control clot degradation in fibrinolysis. We show that we can capture the dynamics of these complex but varied systems using the reduced order modeling framework.
In addition, we address the problem of training high-dimensional, non-linear models of biological systems. Traditional gradient-based methods often fail due to convergence to a local optimum or to the lack of gradient information. We present a novel optimization method based on evolutionary algorithms that obtains near-optimal parameters within a limited number of function evaluations. We demonstrate that this method obtains optimal solutions on a wide array of non-linear models faster than existing metaheuristic methods. Taken together, this work provides a methodology to rapidly investigate complex biochemical systems by simplifying the model design and experimentation processes.
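As a rough illustration of the gradient-free parameter-estimation setting described above (not the thesis's specific algorithm), a sketch that fits a toy two-state ODE model to synthetic data with SciPy's differential evolution, a generic evolutionary optimizer; the model, bounds, and data are illustrative only.

```python
# Hedged sketch of gradient-free parameter estimation for a nonlinear ODE model.
# Differential evolution stands in as a generic evolutionary optimizer; it is
# not the thesis's specific method. The two-state model and data are toy examples.
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import differential_evolution

t_obs = np.linspace(0, 10, 25)
y_obs = 1.0 - np.exp(-0.4 * t_obs)            # synthetic "measurements"

def model(t, y, k_on, k_off):
    a, b = y
    return [-k_on * a + k_off * b, k_on * a - k_off * b]

def residual(params):
    sol = solve_ivp(model, (0, 10), [1.0, 0.0], args=tuple(params), t_eval=t_obs)
    return np.sum((sol.y[1] - y_obs) ** 2)    # fit the second state to the data

result = differential_evolution(residual, bounds=[(0.01, 5.0), (0.0, 5.0)], seed=0)
print("estimated parameters:", result.x)
```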
Dynamic Modeling of the Human Coagulation Cascade Using Reduced Order Effective Kinetic Models
In this study, we present a novel modeling approach which combines ordinary differential equation (ODE) modeling with logical rules to simulate an archetype biochemical network, the human coagulation cascade. The model consisted of five differential equations augmented with several logical rules describing regulatory connections between model components and unmodeled interactions in the network. This formulation was more than an order of magnitude smaller than current coagulation models, because many of the mechanistic details of coagulation were encoded as logical rules. We estimated an ensemble of likely model parameters (N = 20) from in vitro extrinsic coagulation data sets, with and without inhibitors, by minimizing the residual between model simulations and experimental measurements using particle swarm optimization (PSO). Each parameter set in our ensemble corresponded to a unique particle in the PSO. We then validated the model ensemble using thrombin data sets that were not used during training. The ensemble predicted thrombin trajectories for conditions not used for model training, including thrombin generation for normal and hemophilic coagulation in the presence of platelets (a significant unmodeled component). We then used flux analysis to understand how the network operated in a variety of conditions, and global sensitivity analysis to identify which parameters controlled the performance of the network. Taken together, the hybrid approach produced a surprisingly predictive model given its small size, suggesting the proposed framework could also be used to dynamically model other biochemical networks, including intracellular metabolic networks, gene expression programs, or potentially even cell-free metabolic systems.
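A minimal sketch of the hybrid ODE-plus-logical-rule idea described above, using a toy two-species system in which a rule gates production once a trigger species crosses a threshold; it is illustrative only and not the published five-equation coagulation model.

```python
# Hedged sketch of the ODE + logical-rule formulation: a rate term is gated by
# a simple rule instead of being modeled mechanistically. Toy system only;
# not the published five-equation coagulation model.
import numpy as np
from scipy.integrate import solve_ivp

def logic_gate(trigger, threshold=0.1):
    # Logical rule standing in for unmodeled upstream mechanisms: thrombin-like
    # production switches on once the trigger species exceeds a threshold.
    return 1.0 if trigger > threshold else 0.0

def rhs(t, y, k_act, k_gen, k_inh):
    trigger, thrombin = y
    d_trigger = k_act * (1.0 - trigger)
    d_thrombin = k_gen * logic_gate(trigger) - k_inh * thrombin
    return [d_trigger, d_thrombin]

sol = solve_ivp(rhs, (0, 60), [0.0, 0.0], args=(0.05, 1.0, 0.1),
                t_eval=np.linspace(0, 60, 121))
print("peak thrombin-like species:", sol.y[1].max())
```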
Dynamic Modeling of Cell-Free Biochemical Networks Using Effective Kinetic Models
Cell-free systems offer many advantages for the study, manipulation and modeling of metabolism compared to in vivo processes. Many of the challenges confronting genome-scale kinetic modeling can potentially be overcome in a cell-free system. For example, there is no complex transcriptional regulation to consider, transient metabolic measurements are easier to obtain, and we no longer have to consider cell growth. Thus, cell-free operation holds several significant advantages for model development, identification and validation. Theoretically, genome-scale cell-free kinetic models may be possible for industrially important organisms, such as E. coli, if a simple, tractable framework for integrating allosteric regulation with enzyme kinetics can be formulated. Toward this unmet need, we present an effective biochemical network modeling framework for building dynamic cell-free metabolic models. The key innovation of our approach is the integration of simple effective rules encoding complex allosteric regulation with traditional kinetic pathway modeling. We tested our approach by modeling the time evolution of several hypothetical cell-free metabolic networks. First, we found that simple effective rules, when integrated with traditional enzyme kinetic expressions, captured complex allosteric patterns such as ultrasensitivity or non-competitive inhibition in the absence of mechanistic information. Second, when integrated into network models, these rules captured classic regulatory patterns such as product-induced feedback inhibition. Lastly, we showed, at least for the network architectures considered here, that we could simultaneously estimate kinetic parameters and allosteric connectivity from synthetic data, starting from an unbiased collection of possible allosteric structures, using particle swarm optimization. However, when starting with an initial population that was heavily enriched with incorrect structures, our particle swarm approach could converge to an incorrect structure. While only an initial proof of concept, the framework presented here could be an important first step toward genome-scale cell-free kinetic modeling of the biosynthetic capacity of industrially important organisms.
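A minimal sketch of the effective-rule idea described above: a classical Michaelis-Menten rate scaled by a hill-type control function standing in for allosteric inhibition. All parameter values are illustrative, not taken from the paper.

```python
# Hedged sketch of an "effective rule" regulating a kinetic rate: a classical
# Michaelis-Menten expression is scaled by a hill-type control function that
# stands in for allosteric inhibition. Parameter values are illustrative.
def hill_inhibition(inhibitor, k_half=0.5, n=2.0):
    # Control variable in [0, 1]: ~1 with no inhibitor, -> 0 as inhibitor rises.
    return k_half ** n / (k_half ** n + inhibitor ** n)

def effective_rate(substrate, inhibitor, v_max=10.0, k_m=1.0):
    mm_rate = v_max * substrate / (k_m + substrate)   # unregulated kinetics
    return mm_rate * hill_inhibition(inhibitor)       # rule-based regulation

print(effective_rate(substrate=2.0, inhibitor=0.0))   # uninhibited flux
print(effective_rate(substrate=2.0, inhibitor=2.0))   # strongly inhibited flux
```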
Reduced order modeling and analysis of the human complement system
Complement is an important pathway in innate immunity, inflammation, and many disease processes. However, despite its importance, there are few validated mathematical models of complement activation. In this study, we developed an ensemble of experimentally validated reduced order complement models. We combined ordinary differential equations with logical rules to produce a compact yet predictive model of complement activation. The model, which described the lectin and alternative pathways, was an order of magnitude smaller than comparable models in the literature. We estimated an ensemble of model parameters from in vitro dynamic measurements of the C3a and C5a complement proteins. Subsequently, we validated the model on unseen C3a and C5a measurements not used for model training. Despite its small size, the model was surprisingly predictive. Global sensitivity and robustness analysis suggested complement was robust to any single therapeutic intervention. Only the simultaneous knockdown of both C3 and C5 consistently reduced C3a and C5a formation from all pathways. Taken together, we developed a validated mathematical model of complement activation that was computationally inexpensive, and could easily be incorporated into pre-existing or new pharmacokinetic models of immune system function. The model described experimental data, and predicted the need for multiple points of therapeutic intervention to fully disrupt complement activation.
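A minimal sketch of the kind of in-silico knockdown comparison described above; `simulate` is a hypothetical stand-in that maps initial protein levels to peak C3a and C5a, since the complement model itself is not reproduced here.

```python
# Hedged sketch of an in-silico knockdown screen. `simulate` is a hypothetical
# placeholder for the complement model: it maps a dict of initial protein
# levels to (peak C3a, peak C5a). The model itself is not reproduced here.
from itertools import combinations

def knockdown_screen(simulate, baseline_levels, targets=("C3", "C5")):
    results = {}
    for r in range(1, len(targets) + 1):
        for combo in combinations(targets, r):
            levels = dict(baseline_levels)
            for species in combo:
                levels[species] = 0.0          # full knockdown of the target(s)
            results[combo] = simulate(levels)  # e.g. (peak C3a, peak C5a)
    return results
```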