Threshold Regression for Survival Analysis: Modeling Event Times by a Stochastic Process Reaching a Boundary
Many researchers have investigated first hitting times as models for survival
data. First hitting times arise naturally in many types of stochastic
processes, ranging from Wiener processes to Markov chains. In a survival
context, the state of the underlying process represents the strength of an item
or the health of an individual. The item fails or the individual experiences a
clinical endpoint when the process reaches an adverse threshold state for the
first time. The time scale can be calendar time or some other operational
measure of degradation or disease progression. In many applications, the
process is latent (i.e., unobservable). Threshold regression refers to
first-hitting-time models with regression structures that accommodate covariate
data. The parameters of the process, threshold state and time scale may depend
on the covariates. This paper reviews aspects of this topic and discusses
fruitful avenues for future research.
Comment: Published at http://dx.doi.org/10.1214/088342306000000330 in
Statistical Science (http://www.imstat.org/sts/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
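As an illustration of the idea described in this abstract, here is a minimal sketch of a threshold regression model, assuming the latent process is a Wiener process that starts at y0 > 0 with drift mu and variance sigma^2, and that the event occurs when it first reaches zero. The closed-form distribution used is the standard first-hitting-time (inverse Gaussian type) result for that process. The link functions (log link for y0, linear link for mu), the fixed sigma = 1, and all function and parameter names are assumptions made for this example and are not taken from the abstract.

```python
import math


def norm_cdf(x):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def fht_cdf(t, y0, mu, sigma=1.0):
    """P(T <= t) for the first time a Wiener process started at y0 > 0,
    with drift mu and variance sigma**2, reaches the threshold 0."""
    if t <= 0:
        return 0.0
    s = sigma * math.sqrt(t)
    first = norm_cdf(-(y0 + mu * t) / s)
    second = math.exp(-2.0 * y0 * mu / sigma ** 2) * norm_cdf((mu * t - y0) / s)
    return first + second


def threshold_regression_cdf(t, z, gamma, beta):
    """Hypothetical regression structure: covariates z set the initial
    health level through a log link and the drift through a linear link."""
    y0 = math.exp(sum(zi * gi for zi, gi in zip(z, gamma)))
    mu = sum(zi * bi for zi, bi in zip(z, beta))
    return fht_cdf(t, y0, mu)


# Example: intercept plus one binary covariate (illustrative values only).
z = [1.0, 1.0]
gamma = [0.5, 0.2]   # assumed coefficients for log(y0)
beta = [-0.3, -0.1]  # assumed coefficients for the drift
survival_at_5 = 1.0 - threshold_regression_cdf(5.0, z, gamma, beta)
```

With a positive drift the process may never reach the threshold, so the distribution is improper: the hitting probability tends to exp(-2 * y0 * mu / sigma**2) rather than 1 as t grows.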
Jointly Modeling Embedding and Translation to Bridge Video and Language
Automatically describing video content with natural language is a fundamental
challenge of multimedia. Recurrent Neural Networks (RNNs), which model sequence
dynamics, have attracted increasing attention for visual interpretation. However,
most existing approaches generate each word locally, given the previous words and
the visual content, while the relationship between sentence semantics and
visual content is not holistically exploited. As a result, the generated
sentences may be contextually correct while their semantics (e.g., subjects, verbs
or objects) are wrong.
This paper presents a novel unified framework, named Long Short-Term Memory
with visual-semantic Embedding (LSTM-E), which simultaneously learns the
LSTM and the visual-semantic embedding. The former aims to locally
maximize the probability of generating the next word given previous words and
visual content, while the latter creates a visual-semantic embedding space
for enforcing the relationship between the semantics of the entire sentence and
visual content. Our proposed LSTM-E consists of three components: 2-D and/or
3-D deep convolutional neural networks for learning a powerful video
representation, a deep RNN for generating sentences, and a joint embedding
model for exploring the relationships between visual content and sentence
semantics. Experiments on the YouTube2Text dataset show that LSTM-E achieves
the best performance reported to date in generating natural
sentences: 45.3% and 31.0% in terms of BLEU@4 and METEOR, respectively. We also
demonstrate that LSTM-E is superior to several state-of-the-art techniques in
predicting Subject-Verb-Object (SVO) triplets.
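The two objectives described in this abstract can be sketched as a single combined loss. This is a minimal illustration, assuming the sentence-level term is a squared distance between video and sentence vectors in the shared embedding space, the word-level term is the negative log-likelihood of the generated words, and the two are mixed by a hypothetical trade-off weight `lam`. The abstract does not give the exact formulation, so every name and the weighting scheme here are assumptions.

```python
import math


def relevance_loss(video_emb, sentence_emb):
    """Squared Euclidean distance between the video and sentence vectors
    in the shared visual-semantic embedding space (assumed form)."""
    return sum((v - s) ** 2 for v, s in zip(video_emb, sentence_emb))


def coherence_loss(word_probs):
    """Negative log-likelihood of the sentence, where word_probs[t] is the
    probability the model gave to the correct word at step t."""
    return -sum(math.log(p) for p in word_probs)


def joint_loss(video_emb, sentence_emb, word_probs, lam=0.5):
    """Combination of both terms; lam is a hypothetical trade-off weight
    in [0, 1] between sentence-level and word-level objectives."""
    return ((1.0 - lam) * relevance_loss(video_emb, sentence_emb)
            + lam * coherence_loss(word_probs))


# Toy values standing in for model outputs (not real data).
video_emb = [0.2, 0.9, 0.1]
sentence_emb = [0.1, 0.8, 0.3]
word_probs = [0.6, 0.4, 0.7]
loss = joint_loss(video_emb, sentence_emb, word_probs, lam=0.7)
```

Setting `lam` to 1 recovers a purely local, word-by-word objective, while smaller values give more weight to agreement between the whole sentence and the video.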
- …