A Diffusion-Model of Joint Interactive Navigation
Simulation of autonomous vehicle systems requires that simulated traffic
participants exhibit diverse and realistic behaviors. The use of prerecorded
real-world traffic scenarios in simulation ensures realism, but the rarity of
safety-critical events makes large-scale collection of driving scenarios
expensive. In this paper, we present DJINN, a diffusion-based method for
generating traffic scenarios. Our approach jointly diffuses the trajectories of
all agents, conditioned on a flexible set of state observations from the past,
present, or future. On popular trajectory forecasting datasets, we report
state-of-the-art performance on joint trajectory metrics. In addition, we demonstrate
how DJINN flexibly enables direct test-time sampling from a variety of valuable
conditional distributions including goal-based sampling, behavior-class
sampling, and scenario editing. Comment: 10 pages, 4 figures
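The flexible conditioning the abstract describes can be illustrated with a toy replacement-style conditioning loop, one common way to condition a diffusion sampler on known states (this is an illustrative sketch, not necessarily DJINN's actual mechanism); the denoiser below is a hand-written placeholder standing in for a learned joint model:

```python
import numpy as np

rng = np.random.default_rng(0)

A, T, D = 3, 10, 2                      # agents, timesteps, state dimension
obs_mask = np.zeros((A, T, D), bool)
obs_mask[:, 0] = True                   # condition on every agent's first state
obs = np.zeros((A, T, D))
obs[:, 0] = rng.normal(size=(A, D))     # the observed (conditioning) states

def denoise_step(x, t):
    """Placeholder for a learned joint denoiser over all agents' trajectories;
    here it just shrinks the sample toward zero with a little fresh noise."""
    return 0.9 * x + 0.1 * rng.normal(0, t / 50.0, x.shape)

def sample_conditional(steps=50):
    """Replacement-style conditioning: after every reverse-diffusion step,
    overwrite the observed entries with their known values so the joint
    sample stays consistent with the conditioning set (past, present, or
    future states can all be clamped this way)."""
    x = rng.normal(size=(A, T, D))
    for t in range(steps, 0, -1):
        x = denoise_step(x, t)
        x[obs_mask] = obs[obs_mask]     # clamp the observations
    return x

traj = sample_conditional()
```

Because the mask is arbitrary, the same loop supports goal-based sampling (clamp final states) or scenario editing (clamp everything except one agent's trajectory).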
Video Killed the HD-Map: Predicting Driving Behavior Directly From Drone Images
The development of algorithms that learn behavioral driving models using
human demonstrations has led to increasingly realistic simulations. In general,
such models learn to jointly predict trajectories for all controlled agents by
exploiting road context information such as drivable lanes obtained from
manually annotated high-definition (HD) maps. Recent studies show that these
models can greatly benefit from increasing the amount of human data available
for training. However, the manual annotation of HD maps which is necessary for
every new location puts a bottleneck on efficiently scaling up human traffic
datasets. We propose a drone birdview image-based map (DBM) representation that
requires minimal annotation and provides rich road context information. We
evaluate multi-agent trajectory prediction using the DBM by incorporating it
into a differentiable driving simulator as an image-texture-based
differentiable rendering module. Our results demonstrate competitive
multi-agent trajectory prediction performance when using our DBM representation
as compared to models trained with rasterized HD maps.
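The image-texture-based rendering module described above relies on differentiable texture lookups into the birdview image. A minimal sketch of one such operation, bilinear sampling of a texture at continuous agent positions, is shown below; it is illustrative only and does not reproduce the paper's rendering module:

```python
import numpy as np

def bilinear_sample(texture, xy):
    """Bilinear lookup of a birdview texture at continuous (x, y) positions
    in pixel coordinates. Bilinear interpolation is smooth in xy, which is
    what makes such lookups usable inside a differentiable simulator."""
    H, W, C = texture.shape
    x = np.clip(xy[:, 0], 0, W - 1.001)
    y = np.clip(xy[:, 1], 0, H - 1.001)
    x0, y0 = np.floor(x).astype(int), np.floor(y).astype(int)
    fx, fy = x - x0, y - y0
    tl = texture[y0, x0]       # top-left neighbours
    tr = texture[y0, x0 + 1]   # top-right
    bl = texture[y0 + 1, x0]   # bottom-left
    br = texture[y0 + 1, x0 + 1]
    top = tl * (1 - fx)[:, None] + tr * fx[:, None]
    bot = bl * (1 - fx)[:, None] + br * fx[:, None]
    return top * (1 - fy)[:, None] + bot * fy[:, None]

tex = np.arange(16.0).reshape(4, 4, 1)   # toy 4x4 single-channel "birdview"
pts = np.array([[1.5, 1.5], [0.0, 0.0]])
vals = bilinear_sample(tex, pts)         # [[7.5], [0.0]]
```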
On Weighted Trigonometric Regression for Suboptimal Designs in Circadian Biology Studies
Studies in circadian biology often use trigonometric regression to model
phenomena over time. Ideally, protocols in these studies would collect samples
at evenly distributed and equally spaced time points over a 24-hour period.
This sample collection protocol is known as an equispaced design, which is
considered the optimal experimental design for trigonometric regression under
multiple statistical criteria. However, implementing equispaced designs in
studies involving individuals is logistically challenging, and failure to
employ an equispaced design could cause a loss of statistical power when
performing hypothesis tests with an estimated model. This paper is motivated by
the potential loss of statistical power during hypothesis testing, and
considers a weighted trigonometric regression as a remedy. Specifically, the
weights for this regression are the normalized reciprocals of estimates derived
from a kernel density estimator for sample collection time, which inflates the
weight of samples collected at underrepresented time points. A search procedure
is also introduced to identify the concentration hyperparameter for kernel
density estimation that maximizes the Hessian of weighted squared loss, which
relates to both maximizing the D-optimality criterion from the experimental
design literature and minimizing the generalized variance. Simulation studies
consistently demonstrate that this weighted regression mitigates variability in
inferences produced by an estimated model. Illustrations with three real
circadian biology data sets further indicate that this weighted regression
consistently yields larger test statistics than its unweighted counterpart for
first-order trigonometric regression, or cosinor regression, which is prevalent
in circadian biology studies.
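The weighting scheme described above can be sketched directly: the weights are the normalized reciprocals of a kernel density estimate over the sample collection times, plugged into weighted least squares for a first-order (cosinor) model. The function names, and the choice of a von Mises kernel with concentration `kappa` for the circular 24-hour domain, are illustrative assumptions rather than the authors' code:

```python
import numpy as np
from scipy.special import i0

def vonmises_kde(t, samples, kappa):
    """Circular (von Mises) kernel density estimate over a 24 h period.
    `kappa` is the concentration hyperparameter; larger -> narrower kernel."""
    ang = 2 * np.pi * (t[:, None] - samples[None, :]) / 24.0
    return np.exp(kappa * np.cos(ang)).mean(axis=1) / (2 * np.pi * i0(kappa))

def weighted_cosinor(t, y, kappa=2.0):
    """Cosinor regression y = M + A*cos(2*pi*t/24) + B*sin(2*pi*t/24), with
    weights equal to the normalized reciprocals of the KDE over collection
    times, inflating samples from underrepresented time points."""
    dens = vonmises_kde(t, t, kappa)
    w = 1.0 / dens
    w /= w.sum()                       # normalized reciprocal-density weights
    ang = 2 * np.pi * t / 24.0
    X = np.column_stack([np.ones_like(t), np.cos(ang), np.sin(ang)])
    W = np.diag(w)
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    return beta                        # [mesor, cosine coeff, sine coeff]

# Example: uneven sampling confined to daytime hours
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(8, 20, 40))    # night-time points are missing entirely
y = 10 + 3 * np.cos(2 * np.pi * t / 24) + rng.normal(0, 0.5, t.size)
mesor, a, b = weighted_cosinor(t, y)
```

The search procedure from the abstract would then tune `kappa` to maximize a D-optimality-style criterion on the weighted information matrix `X.T @ W @ X`.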
Critic Sequential Monte Carlo
We introduce CriticSMC, a new algorithm for planning as inference built from
a novel composition of sequential Monte Carlo with learned soft-Q function
heuristic factors. This algorithm is structured so as to allow using large
numbers of putative particles, leading to efficient utilization of computational
resources and effective discovery of high-reward trajectories even in
environments with difficult reward surfaces such as those arising from hard
constraints. Relative to prior art, our approach is notably still compatible
with model-free reinforcement learning in the sense that the implicit policy we
produce can be used at test time in the absence of a world model. Our
experiments on self-driving car collision avoidance in simulation demonstrate
improvements against baselines in terms of infraction minimization relative to
computational effort while maintaining diversity and realism of found
trajectories. Comment: 20 pages, 3 figures
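A minimal sketch of the idea, with the learned soft-Q critic replaced by a hand-written heuristic and a toy 1-D state space (every name and number here is illustrative, not the authors' implementation): each particle proposes several putative successors, the critic scores them, one is kept per particle, and the population is resampled in proportion to the critic scores, so particles violating the hard constraint are eliminated.

```python
import numpy as np

rng = np.random.default_rng(1)

def heuristic_value(x):
    """Stand-in for a learned soft-Q critic: distance-to-goal score plus a
    large penalty for violating a hard constraint (an 'obstacle' in (3, 4))."""
    penalty = np.where((x > 3) & (x < 4), -1e3, 0.0)
    return -np.abs(8.0 - x) + penalty

def critic_smc(n_particles=256, horizon=20, k_putative=8):
    """SMC planning with critic heuristic factors: many cheap putative
    proposals per particle, softmax selection by critic score, then global
    resampling to concentrate on high-reward, constraint-satisfying states."""
    x = np.zeros(n_particles)
    for _ in range(horizon):
        # propose k putative successors per particle
        props = x[:, None] + rng.normal(0.5, 0.5, (n_particles, k_putative))
        logw = heuristic_value(props)
        # pick one putative successor per particle via softmax over the critic
        p = np.exp(logw - logw.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        choice = (p.cumsum(axis=1) > rng.random((n_particles, 1))).argmax(axis=1)
        x = props[np.arange(n_particles), choice]
        # resample the population in proportion to critic scores
        v = heuristic_value(x)
        gw = np.exp(v - v.max())
        x = x[rng.choice(n_particles, n_particles, p=gw / gw.sum())]
    return x

final = critic_smc()   # particles cluster near the goal, none in the obstacle
```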
A Flexible Learning Infrastructure for Proteomics (ASMS 2017)
Developing a scoring model for MS/MS sequence-spectrum matches has typically been considered part of the software development process and is rarely a user-serviceable component. Models are usually created from scratch and hard-coded for each project or fragmentation method, which hinders adaptation and requires significant source code modifications to process new data types. In addition, advances such as increased resolving power and new dissociation methods result in new types of spectra in which different fragment ions (e.g. due to neutral and side chain losses) can be considered for scoring enhancements. Here we introduce the Flexible Learning Infrastructure for Proteomics (FLIP). As in machine learning, FLIP treats training/learning as an ongoing process, allowing rapid customization over time.

FLIP accepts as input raw MS/MS data, true-positive sequence matches, and true-negative sequence matches in community-standard data formats for both spectra and identifications. A classifier is used to weight the fragment ion features, such as mass error, isotopic fit, and intensity, that best separate the true-positive and true-negative training data. By default, it supports both logistic regression and support vector machine models through the Accord.NET machine learning framework. Cross-validated feature reduction is used to select relevant fragment ions. The trained model is written to a tab-separated file, which serves as input for MS/MS scoring in a search engine. FLIP also provides figures such as score histograms and ROC curves for performance evaluation of the trained model.

The FLIP framework uses a modular design and the dependency injection software design pattern. It is divided into five customizable core modules: parsing, for reading raw spectra and identifications; pre-processing, which performs deconvolution and spectrum filtering; modelling, which selects features from the data for training; learning, which runs a machine learning classifier; and cross-validation. This allows users to substitute any of these modules with their own code without recompiling the FLIP framework, providing flexibility for new implementations such as support for custom data formats, features to train on, learning models, or cross-validation metrics.

FLIP does not require the user to manually determine which types of fragment ions to train on. It starts with a very large set of possible fragment ions and performs multiple rounds of 10-fold cross-validation, each round reducing the number of fragment ions used for training. The area under the ROC curve is calculated each round to determine the optimal set of fragment ions.

Because our goal is to make this framework flexible across data types, we have used it to train scoring models on peptide and intact-protein samples from both HCD and ultraviolet photodissociation (UVPD) dissociation methods. We collected bottom-up and top-down MS/MS spectra on a Thermo Q Exactive for 6 bacterial organisms, which resulted in over 100,000 unique peptides for training and testing the scoring algorithms. The scoring models were evaluated with a HeLa lysate sample using our database search tool, MSPathFinder. We observed that FLIP was able to define effective models, with significant differences in the numbers and types of fragment ions found for each of these data types, increasing the number of confident identifications (<1% false discovery rate) by at least 14% compared to a peak-counting scoring model.
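The cross-validated, AUC-driven feature-reduction loop described above can be sketched as follows. FLIP itself is built on Accord.NET; this Python/scikit-learn version with synthetic stand-in data is purely illustrative, and the greedy drop-one strategy is an assumption about how the per-round reduction might work:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Synthetic stand-in for fragment-ion features (mass error, isotopic fit,
# intensity, ...): the first two features are informative, the rest are noise.
n, n_feat = 400, 6
X = rng.normal(size=(n, n_feat))
y = rng.integers(0, 2, n)          # 1 = true-positive sequence match
X[:, 0] += 1.5 * y
X[:, 1] += 1.0 * y

def reduce_features(X, y, min_feats=2):
    """Greedy cross-validated feature reduction: each round, drop the feature
    whose removal hurts 10-fold ROC AUC the least, and keep track of the
    best-scoring feature set seen so far."""
    feats = list(range(X.shape[1]))
    clf = LogisticRegression(max_iter=1000)
    best = (cross_val_score(clf, X, y, cv=10, scoring="roc_auc").mean(),
            feats[:])
    while len(feats) > min_feats:
        scores = []
        for f in feats:
            kept = [g for g in feats if g != f]
            auc = cross_val_score(clf, X[:, kept], y, cv=10,
                                  scoring="roc_auc").mean()
            scores.append((auc, kept))
        auc, feats = max(scores)       # remove the least useful feature
        if auc >= best[0]:
            best = (auc, feats[:])
    return best

best_auc, best_feats = reduce_features(X, y)
```

In FLIP's setting the columns would be candidate fragment-ion types rather than synthetic features, and the surviving set would be written out with the trained model for use by the search engine.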