A New Step-down Procedure for Simultaneous Hypothesis Testing Under Dependence
In this article, we consider the problem of simultaneous testing of
hypotheses when the individual test statistics are not necessarily independent.
Specifically, we consider the problem of simultaneous testing of point null
hypotheses against two-sided alternatives for the mean parameters of normally
distributed random variables. We assume that conditionally given the vector of
means, these random variables jointly follow a multivariate normal distribution
with a known but arbitrary covariance matrix. We consider a Bayesian framework
where each unknown mean parameter is modeled through a two-component "spike and
slab" mixture prior. This way, unconditionally the test statistics jointly have
a mixture of multivariate normal distributions. A new testing procedure is
developed that uses the dependence among the test statistics and works in a
"step-down" manner. This procedure is general enough to be applied for
non-normal data. A decision theoretic justification in favor of the proposed
testing procedure has been provided by showing that unlike many traditional
p-value based stepwise procedures, this new method possesses a certain
"convexity property" which makes it admissible with respect to a vector risk
function that captures the risks for the individual testing problems. An
alternative representation of the proposed test statistics has also been
established resulting in great simplification in the computational complexity.
It is demonstrated through extensive simulations that for various forms of
dependence and a wide range of sparsity levels, the proposed testing procedure
compares quite favorably with several existing multiple testing procedures
available in the literature in terms of overall misclassification probability.
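The hierarchical model described in this abstract can be sketched as follows. The notation is ours, not the authors': the abstract fixes no symbols, and the independent Gaussian slab with variance $\tau^2$ and common inclusion probability $p$ are assumptions made only for illustration.

```latex
% Sampling model: known but arbitrary covariance \Sigma
X \mid \mu \sim N_m(\mu, \Sigma),
\qquad
% Spike-and-slab prior on each mean (Gaussian slab assumed)
\mu_i \overset{\text{ind}}{\sim} (1-p)\,\delta_0 + p\,N(0, \tau^2),
\quad i = 1, \dots, m,

% Hypotheses tested simultaneously
H_{0i}: \mu_i = 0 \quad \text{versus} \quad H_{1i}: \mu_i \neq 0,

% Marginal law of X: a mixture over inclusion patterns \gamma \in \{0,1\}^m
X \sim \sum_{\gamma \in \{0,1\}^m}
  p^{|\gamma|} (1-p)^{m-|\gamma|}\,
  N_m\!\bigl(0,\; \Sigma + \tau^2 \operatorname{diag}(\gamma)\bigr).
```

Under these assumptions the last line is the "mixture of multivariate normal distributions" mentioned in the abstract: conditioning on which means are nonzero gives a zero-mean normal whose covariance adds $\tau^2$ on the diagonal entries of the active coordinates.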
Fast search for Dirichlet process mixture models
Dirichlet process (DP) mixture models provide a flexible Bayesian framework
for density estimation. Unfortunately, their flexibility comes at a cost:
inference in DP mixture models is computationally expensive, even when
conjugate distributions are used. In the common case when one seeks only a
maximum a posteriori assignment of data points to clusters, we show that search
algorithms provide a practical alternative to expensive MCMC and variational
techniques. When a true posterior sample is desired, the solution found by
search can serve as a good initializer for MCMC. Experimental results show that
using these techniques it is possible to apply DP mixture models to very large
data sets.
A Visual Interaction Framework for Dimensionality Reduction Based Data Exploration
Dimensionality reduction is a common method for analyzing and visualizing
high-dimensional data. However, reasoning dynamically about the results of a
dimensionality reduction is difficult. Dimensionality-reduction algorithms use
complex optimizations to reduce the number of dimensions of a dataset, but
these new dimensions often lack a clear relation to the initial data
dimensions, thus making them difficult to interpret. Here we propose a visual
interaction framework to improve dimensionality-reduction based exploratory
data analysis. We introduce two interaction techniques, forward projection and
backward projection, for dynamically reasoning about dimensionally reduced
data. We also contribute two visualization techniques, prolines and feasibility
maps, to facilitate the effective use of the proposed interactions. We apply
our framework to PCA and autoencoder-based dimensionality reductions. Through
data-exploration examples, we demonstrate how our visual interactions can
improve the use of dimensionality reduction in exploratory data analysis.
Comment: CHI'18. arXiv admin note: text overlap with arXiv:1707.0428
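For the PCA case, one plausible reading of the two interactions is sketched below. It is an assumption on our part, since the abstract does not define them: forward projection recomputes a point's 2-D position after a feature value is edited, and backward projection maps an edited 2-D position back to feature space. The function names, the toy basis and the minimum-norm reconstruction are all hypothetical choices, not the authors' implementation.

```python
def forward_project(x, mean, components):
    """Map a feature vector x to reduced coordinates: y = W (x - mean).

    `components` holds one orthonormal principal direction per row.
    """
    return [sum(w * (xi - m) for w, xi, m in zip(row, x, mean))
            for row in components]


def backward_project(y, mean, components):
    """Map reduced coordinates back to feature space: x = mean + W^T y.

    This is the minimum-norm reconstruction; other choices are possible.
    """
    return [mean[j] + sum(y[i] * components[i][j] for i in range(len(components)))
            for j in range(len(mean))]


# Toy, hand-picked orthonormal basis (assumption: a real one comes from fitting PCA).
mean = [0.0, 0.0, 0.0]
components = [[0.6, 0.8, 0.0],
              [0.0, 0.0, 1.0]]

x = [1.0, 2.0, 3.0]
y = forward_project(x, mean, components)

# "What if feature 0 were larger?": edit the feature, then re-project.
x_edited = [2.0, 2.0, 3.0]
y_edited = forward_project(x_edited, mean, components)

# "What if the point sat elsewhere in the plot?": move it, then map back.
x_back = backward_project([0.0, 3.0], mean, components)
```

Because PCA is linear, both directions are a single matrix product here; for an autoencoder the same roles would be played by the encoder and the decoder.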
An Algorithm for Supervised Driving of Cooperative Semi-Autonomous Vehicles (Extended)
Before reaching full autonomy, vehicles will gradually be equipped with more
and more advanced driver assistance systems (ADAS), effectively rendering them
semi-autonomous. However, current ADAS technologies seem unable to handle
complex traffic situations, notably when dealing with vehicles arriving from
the sides, either at intersections or when merging on highways. The high rate
of accidents in these settings proves that they constitute difficult driving
situations. Moreover, intersections and merging lanes are often the source of
important traffic congestion and, sometimes, deadlocks. In this article, we
propose a cooperative framework to safely coordinate semi-autonomous vehicles
in such settings, removing the risk of collision or deadlocks while remaining
compatible with human driving. More specifically, we present a supervised
coordination scheme that overrides control inputs from human drivers when they
would result in an unsafe or blocked situation. To avoid unnecessary
intervention and remain compatible with human driving, overriding only occurs
when collisions or deadlocks are imminent. In this case, safe overriding
controls are chosen while ensuring they deviate minimally from those originally
requested by the drivers. Simulation results based on a realistic physics
simulator show that our approach is scalable to real-world scenarios, and
computations can be performed in real-time on a standard computer for up to a
dozen simultaneous vehicles.
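The override rule described above (intervene only when needed, and then deviate as little as possible from the drivers' request) can be illustrated with a deliberately simplified sketch. Everything in it is hypothetical: the finite candidate set, the `is_safe` predicate and the squared-distance measure are stand-ins, and the paper's actual safety check and optimization are not specified in the abstract.

```python
def choose_control(desired, candidates, is_safe):
    """Return the drivers' requested control if it is safe; otherwise the
    safe candidate closest to it in squared Euclidean distance.

    desired    -- tuple of requested controls, one entry per vehicle
    candidates -- iterable of alternative control tuples (hypothetical grid)
    is_safe    -- predicate deciding whether a control tuple is acceptable
    """
    if is_safe(desired):
        return desired  # no intervention when the request is already safe
    safe = [u for u in candidates if is_safe(u)]
    if not safe:
        raise ValueError("no safe candidate control available")
    return min(safe, key=lambda u: sum((a - b) ** 2 for a, b in zip(u, desired)))


# Toy safety rule (assumption): vehicle 0 must not accelerate above 1.0.
def toy_is_safe(u):
    return u[0] <= 1.0


candidates = [(1.0, 0.5), (0.0, 0.5), (1.0, -1.0)]
override = choose_control((2.0, 0.5), candidates, toy_is_safe)
untouched = choose_control((0.5, 0.5), candidates, toy_is_safe)
```

The early return is what keeps the scheme compatible with human driving in this sketch: a safe request is passed through unchanged.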
Precautionary Effect and Variations of the Value of Information
For a sequential, two-period decision problem with uncertainty and under broad conditions (non-finite sample set, endogenous risk, active learning and stochastic dynamics), a general sufficient condition is provided to compare the optimal initial decisions with or without information arrival in the second period. More generally, the condition enables the comparison of optimal decisions related to different information structures. It also ties together and clarifies many conditions for the so-called irreversibility effect that are scattered in the environmental economics literature. A numerical illustration with an integrated assessment model of climate-change economics is provided.
Keywords: Value of Information, Uncertainty, Irreversibility effect, Climate change
A Reactive Tabu Search Algorithm for Stimuli Generation in Psycholinguistics
The generation of meaningless "words" matching certain statistical and/or
linguistic criteria is frequently needed for experimental purposes in
Psycholinguistics. Such stimuli receive the name of pseudowords or nonwords in
the Cognitive Neuroscience literature. The process for building nonwords
sometimes has to be based on linguistic units such as syllables or morphemes,
resulting in a numerical explosion of combinations when the size of the
nonwords is increased. In this paper, a reactive tabu search scheme is proposed
to generate nonwords of variable size. The approach builds pseudowords by
using a modified metaheuristic algorithm based on a local search procedure
enhanced by a feedback-based scheme. Experimental results show that the new
algorithm is a practical and effective tool for nonword generation.
Comment: Artificial Intelligence in Science and Technology AISAT 2004
Conference. 8 pages, 5 figures, 3 tables
Monitoring COVID-19 social distancing with person detection and tracking via fine-tuned YOLO v3 and Deepsort techniques
The rampant coronavirus disease 2019 (COVID-19) has brought global crisis
with its deadly spread to more than 180 countries, and about 3,519,901
confirmed cases along with 247,630 deaths globally as of May 4, 2020. The
absence of any active therapeutic agents and the lack of immunity against
COVID-19 increase the vulnerability of the population. Since there are no
vaccines available, social distancing is the only feasible approach to fight
against this pandemic. Motivated by this notion, this article proposes a deep
learning based framework for automating the task of monitoring social
distancing using surveillance video. The proposed framework utilizes the YOLO
v3 object detection model to segregate humans from the background and Deepsort
approach to track the identified people with the help of bounding boxes and
assigned IDs. The results of the YOLO v3 model are further compared with other
popular state-of-the-art models, e.g. faster region-based CNN (convolutional
neural network) and single shot detector (SSD), in terms of mean average
precision (mAP), frames per second (FPS) and loss values defined by object
classification and localization. Later, the pairwise vectorized L2 norm is
computed based on the three-dimensional feature space obtained by using the
centroid coordinates and dimensions of the bounding box. The violation index
term is proposed to quantify non-adoption of the social distancing protocol.
From the experimental analysis, it is observed that the YOLO v3 with Deepsort
tracking scheme displayed the best results, with balanced mAP and FPS scores, for
monitoring social distancing in real time.
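The pairwise-distance step mentioned in this abstract can be sketched as follows. The abstract only says that the three features come from the centroid coordinates and the box dimensions, so using the box diagonal as the third feature is our assumption, as are the function names and the pixel threshold. Plain loops replace the vectorized computation to keep the example dependency-free, and the paper's violation index is not reproduced because the abstract does not define it.

```python
import math
from itertools import combinations


def box_features(box):
    """Turn a bounding box (x1, y1, x2, y2) into a 3-D feature:
    centroid x, centroid y, and box diagonal (assumed third feature)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2, math.hypot(x2 - x1, y2 - y1))


def close_pairs(boxes, threshold):
    """Return index pairs (i, j), i < j, whose features lie closer than
    `threshold` in L2 norm. The threshold is a hypothetical tuning value."""
    feats = [box_features(b) for b in boxes]
    return [(i, j)
            for (i, fi), (j, fj) in combinations(enumerate(feats), 2)
            if math.dist(fi, fj) < threshold]


# Three detections in pixel coordinates: two people side by side, one far away.
boxes = [(0, 0, 10, 20), (5, 0, 15, 20), (100, 0, 110, 20)]
pairs = close_pairs(boxes, threshold=50.0)
```

Including a size term alongside the centroid means that boxes of very different scale, which usually indicates different depths in the scene, are less likely to be flagged as close.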
Likelihood-based semi-supervised model selection with applications to speech processing
In conventional supervised pattern recognition tasks, model selection is
typically accomplished by minimizing the classification error rate on a set of
so-called development data, subject to ground-truth labeling by human experts
or some other means. In the context of speech processing systems and other
large-scale practical applications, however, such labeled development data are
typically costly and difficult to obtain. This article proposes an alternative
semi-supervised framework for likelihood-based model selection that leverages
unlabeled data by using trained classifiers representing each model to
automatically generate putative labels. The errors that result from this
automatic labeling are shown to be amenable to results from robust statistics,
which in turn provide for minimax-optimal censored likelihood ratio tests that
recover the nonparametric sign test as a limiting case. This approach is then
validated experimentally using a state-of-the-art automatic speech recognition
system to select between candidate word pronunciations using unlabeled speech
data that only potentially contain instances of the words under test. Results
provide supporting evidence for the utility of this approach, and suggest that
it may also find use in other applications of machine learning.
Comment: 11 pages, 2 figures; submitted for publication
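The link between a censored likelihood ratio test and the sign test can be illustrated with a small sketch. The symmetric clipping of per-observation log-likelihood ratios at a level `c` is our simplifying assumption; the paper's censoring limits and decision thresholds are not given in the abstract, and the function names are hypothetical.

```python
def censored_llr(llrs, c):
    """Sum of per-observation log-likelihood ratios, each clipped to [-c, c].

    Clipping bounds the influence of any single (possibly mislabeled) sample.
    """
    return sum(max(-c, min(c, l)) for l in llrs)


def sign_statistic(llrs):
    """Sign-test statistic: (+1 per positive LLR) + (-1 per negative LLR)."""
    return sum((l > 0) - (l < 0) for l in llrs)


# Made-up log-likelihood ratios comparing two candidate models.
llrs = [2.5, -0.3, 0.7, -4.0, 1.1]

moderate = censored_llr(llrs, c=1.0)      # large values are censored at +/-1
tight = censored_llr(llrs, c=0.1) / 0.1   # every term saturates at +/-c
signs = sign_statistic(llrs)
```

Once `c` falls below the smallest absolute log-likelihood ratio, every term equals plus or minus `c`, so the rescaled statistic coincides with the sign statistic. That is the limiting case the abstract refers to.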