Deep Multimodal Speaker Naming
Automatic speaker naming is the problem of localizing as well as identifying
each speaking character in a TV/movie/live show video. This is a challenging
problem, mainly because of its multimodal nature: the face cue alone is
insufficient to achieve good performance. Previous multimodal approaches to
this problem usually process the data of different modalities individually and
merge them using handcrafted heuristics. Such approaches work well for simple
scenes, but fail to achieve high performance for speakers with large appearance
variations. In this paper, we propose a novel convolutional neural network
(CNN) based learning framework to automatically learn the fusion function of
both face and audio cues. We show that without using face tracking, facial
landmark localization or subtitle/transcript, our system with robust multimodal
feature extraction is able to achieve state-of-the-art speaker naming
performance evaluated on two diverse TV series. The dataset and implementation
of our algorithm are publicly available online.
Look, Listen and Learn - A Multimodal LSTM for Speaker Identification
Speaker identification refers to the task of localizing the face of a person
who has the same identity as the ongoing voice in a video. This task requires
not only collective perception over both visual and auditory signals but also
robustness to severe quality degradations and unconstrained content
variations. In this paper, we describe a novel
multimodal Long Short-Term Memory (LSTM) architecture which seamlessly unifies
both visual and auditory modalities from the beginning of each sequence input.
The key idea is to extend the conventional LSTM by not only sharing weights
across time steps, but also sharing weights across modalities. We show that
modeling the temporal dependency across face and voice can significantly
improve the robustness to content quality degradations and variations. We also
found that our multimodal LSTM is robust to distractors, namely the
non-speaking identities. We applied our multimodal LSTM to The Big Bang Theory
dataset and showed that our system outperforms the state-of-the-art systems in
speaker identification with lower false alarm rate and higher recognition
accuracy. Comment: The 30th AAAI Conference on Artificial Intelligence (AAAI-16)
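The weight-sharing idea in this abstract can be illustrated with a minimal sketch. It takes the simplest possible reading, in which one set of LSTM parameters is reused across time steps and across both modality streams. The class and function names are hypothetical, the face and audio features are assumed to have already been projected to the same dimension, and the architecture in the paper may share only part of the parameters.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class SharedLSTMCell:
    """One set of LSTM weights, reused for every time step and every modality.

    Hypothetical illustration, not the architecture from the paper.
    """

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        cols = input_size + hidden_size
        # One weight matrix and bias per gate: input (i), forget (f),
        # output (o), and candidate cell state (c).
        self.W = {g: [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                      for _ in range(hidden_size)] for g in "ifoc"}
        self.b = {g: [0.0] * hidden_size for g in "ifoc"}

    def step(self, x, h, c):
        """Standard LSTM update for one time step."""
        z = list(x) + list(h)
        pre = {g: [sum(w * v for w, v in zip(row, z)) + bias
                   for row, bias in zip(self.W[g], self.b[g])]
               for g in "ifoc"}
        i_gate = [sigmoid(v) for v in pre["i"]]
        f_gate = [sigmoid(v) for v in pre["f"]]
        o_gate = [sigmoid(v) for v in pre["o"]]
        cand = [math.tanh(v) for v in pre["c"]]
        c_new = [f * ck + i * g for f, ck, i, g in zip(f_gate, c, i_gate, cand)]
        h_new = [o * math.tanh(ck) for o, ck in zip(o_gate, c_new)]
        return h_new, c_new

    def run(self, sequence):
        """Run the cell over a sequence and return the final hidden state."""
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        for x in sequence:
            h, c = self.step(x, h, c)
        return h


def encode_face_and_voice(cell, face_seq, audio_seq):
    """Encode both modalities with the SAME cell and concatenate the results."""
    return cell.run(face_seq) + cell.run(audio_seq)
```

Because both streams pass through the same parameters, the two encodings live in a common space and can be compared or fused directly; a classifier on top of the concatenated vector would be one way to decide whether a face matches the ongoing voice.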
Bias-correction and Test for Mark-point Dependence with Replicated Marked Point Processes
Mark-point dependence plays a critical role in research problems that can be
fitted into the general framework of marked point processes. In this work, we
focus on adjusting for mark-point dependence when estimating the mean and
covariance functions of the mark process, given independent replicates of the
marked point process. We assume that the mark process is a Gaussian process and
the point process is a log-Gaussian Cox process, where the mark-point
dependence is generated through the dependence between two latent Gaussian
processes. Under this framework, naive local linear estimators ignoring the
mark-point dependence can be severely biased. We show that this bias can be
corrected using a local linear estimator of the cross-covariance function and
establish uniform convergence rates of the bias-corrected estimators.
Furthermore, we propose a test statistic based on local linear estimators for
mark-point independence, which is shown to converge to an asymptotic normal
distribution at a parametric $\sqrt{n}$-convergence rate. Model diagnostics
tools are developed for key model assumptions and a robust functional
permutation test is proposed for a more general class of mark-point processes.
The effectiveness of the proposed methods is demonstrated using extensive
simulations and applications to two real data examples.
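A short derivation, under assumed notation, suggests why a naive mean estimator is biased in this setting and why the cross-covariance is the natural correction term. The symbols below are illustrative and not taken from the abstract; the only ingredients are joint Gaussianity of the two latent processes and Stein's lemma.

```latex
% Assumed notation: X(t) is the mark process with mean \mu(t),
% \lambda(t) = \exp\{Z(t)\} is the random intensity of the Cox process,
% (X, Z) are jointly Gaussian, and C_{XZ}(t,t) = \mathrm{Cov}\{X(t), Z(t)\}.
\begin{align*}
\mathbb{E}\{X(t) \mid \text{a point at } t\}
  &= \frac{\mathbb{E}\left[X(t)\, e^{Z(t)}\right]}{\mathbb{E}\left[e^{Z(t)}\right]} \\
  &= \frac{\mu(t)\,\mathbb{E}\left[e^{Z(t)}\right]
           + \mathrm{Cov}\left\{X(t),\, e^{Z(t)}\right\}}{\mathbb{E}\left[e^{Z(t)}\right]} \\
  &= \mu(t) + C_{XZ}(t,t).
\end{align*}
% The last step uses Stein's lemma for jointly Gaussian (X, Z):
% \mathrm{Cov}\{X, g(Z)\} = \mathrm{Cov}(X, Z)\, \mathbb{E}\{g'(Z)\}, with g = \exp.
```

Under these assumptions, marks observed at the points are centered at \mu(t) + C_{XZ}(t,t) rather than \mu(t), so subtracting an estimate of the cross-covariance removes the bias, and the bias vanishes exactly when the two latent processes are uncorrelated.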
Group Network Hawkes Process
In this work, we study the event occurrences of individuals interacting in a
network. To characterize the dynamic interactions among the individuals, we
propose a group network Hawkes process (GNHP) model whose network structure is
observed and fixed. In particular, we introduce a latent group structure among
individuals to account for the heterogeneous user-specific characteristics. A
maximum likelihood approach is proposed to simultaneously cluster individuals
in the network and estimate model parameters. A fast EM algorithm is
subsequently developed by utilizing the branching representation of the
proposed GNHP model. Theoretical properties of the resulting estimators of
group memberships and model parameters are investigated under both settings
when the number of latent groups is over-specified or correctly specified.
A data-driven criterion that can consistently identify the true number of latent groups under mild
conditions is derived. Extensive simulation studies and an application to a
data set collected from Sina Weibo are used to illustrate the effectiveness of
the proposed methodology. Comment: 35 pages
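The branching representation mentioned in this abstract can be sketched for the simplest case. The code below simulates a univariate Hawkes process with an exponential kernel, which is a deliberate simplification: the GNHP model itself is a network model with latent groups. The function names, the kernel form, and the parameters `mu`, `alpha`, and `beta` are assumptions made for illustration.

```python
import random


def sample_poisson(rng, lam):
    """Draw a Poisson(lam) count by counting unit-rate arrivals before time lam."""
    count, t = 0, rng.expovariate(1.0)
    while t < lam:
        count += 1
        t += rng.expovariate(1.0)
    return count


def simulate_hawkes_branching(mu, alpha, beta, horizon, seed=0):
    """Simulate a univariate Hawkes process on [0, horizon] via its branching form.

    Assumed intensity: lambda(t) = mu + sum_{t_i < t} alpha * beta * exp(-beta * (t - t_i)).
    Immigrants arrive as a Poisson process with rate mu; every event independently
    produces Poisson(alpha) children at Exp(beta) delays after it.
    Returns the sorted event times and the number of immigrants.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha (the branching ratio) must lie in [0, 1)")
    rng = random.Random(seed)
    n_immigrants = sample_poisson(rng, mu * horizon)
    queue = [rng.uniform(0.0, horizon) for _ in range(n_immigrants)]
    events = []
    while queue:
        parent = queue.pop()
        events.append(parent)
        for _ in range(sample_poisson(rng, alpha)):
            child = parent + rng.expovariate(beta)
            # Children past the horizon, and all their descendants, are unobserved.
            if child <= horizon:
                queue.append(child)
    return sorted(events), n_immigrants


if __name__ == "__main__":
    times, n_imm = simulate_hawkes_branching(mu=1.0, alpha=0.5, beta=2.0, horizon=20.0)
    print(len(times), "events, of which", n_imm, "are immigrants")
```

In this representation the parent of each event is a latent variable. EM algorithms for Hawkes processes typically treat that parent label as missing data, which is presumably the structure the fast EM algorithm exploits.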
Second order semi-parametric inference for multivariate log Gaussian Cox processes
This paper introduces a new approach to inferring the second order properties
of a multivariate log Gaussian Cox process (LGCP) with a complex intensity
function. We assume a semi-parametric model for the multivariate intensity
function containing an unspecified complex factor common to all types of
points. Given this model we exploit the availability of several types of points
to construct a second-order conditional composite likelihood to infer the pair
correlation and cross pair correlation functions of the LGCP. Crucially this
likelihood does not depend on the unspecified part of the intensity function.
We also introduce a cross validation method for model selection and an
algorithm for regularized inference that can be used to obtain sparse models
for cross pair correlation functions. The methodology is applied to simulated
data as well as data examples from microscopy and criminology. This shows how
the new approach outperforms existing alternatives where the intensity
functions are estimated non-parametrically. Comment: 32 pages including appendi
Reducing series resistance in Cu2ZnSn(S,Se)4 nanoparticle ink solar cells on flexible molybdenum foil substrates
Earth abundant Cu2ZnSnS4 nanoparticle inks were deposited on molybdenum foil substrates and subsequently converted to high quality thin film Cu2ZnSn(S,Se)4 photovoltaic absorbers. Integration of these absorbers within a thin film solar cell device structure yields a solar energy conversion efficiency which is comparable to identical devices processed on rigid glass substrates. Importantly, this is only achieved when a thin layer of molybdenum is first applied directly to the foil. The layer limits the formation of a thick Mo(S,Se)x layer resulting in a substantially reduced series resistance.
XGBoostPP: Tree-based Estimation of Point Process Intensity Functions
We propose a novel tree-based ensemble method, named XGBoostPP, to nonparametrically estimate the intensity of a point process as a function of covariates. It extends the use of gradient-boosted regression trees (Chen & Guestrin, 2016) to the point process literature via two carefully designed loss functions. The first loss is based on the Poisson likelihood, working for general point processes. The second loss is based on the weighted Poisson likelihood, where spatially dependent weights are introduced to further improve the estimation efficiency for clustered processes. An efficient greedy search algorithm is developed for model estimation, and the effectiveness of the proposed method is demonstrated through extensive simulation studies and two real data analyses. In particular, we report that XGBoostPP achieves superior performance to existing approaches when the dimension of the covariate space is high, revealing the advantages of tree-based ensemble methods in estimating complex intensity functions.
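The Poisson likelihood that the first loss builds on can be written down in a few lines. The sketch below evaluates the standard Poisson point process log-likelihood, the sum of log-intensities at the observed points minus the integrated intensity, with the integral approximated on a grid. The function names, the log link, and the midpoint quadrature are assumptions for illustration and are not claimed to match the XGBoostPP implementation.

```python
import math


def poisson_log_likelihood(log_intensity, points, quadrature):
    """Poisson point process log-likelihood with a numerically approximated integral.

    log L = sum_i log lambda(x_i) - sum_j w_j * lambda(u_j),
    where (u_j, w_j) are quadrature nodes and weights covering the window.
    `log_intensity` maps a location to log lambda at that location.
    """
    point_term = sum(log_intensity(x) for x in points)
    integral_term = sum(w * math.exp(log_intensity(u)) for u, w in quadrature)
    return point_term - integral_term


def unit_square_grid(n_per_side):
    """Midpoint-rule nodes and equal weights on the unit square (assumed window)."""
    w = 1.0 / (n_per_side ** 2)
    return [(((i + 0.5) / n_per_side, (j + 0.5) / n_per_side), w)
            for i in range(n_per_side) for j in range(n_per_side)]


if __name__ == "__main__":
    pts = [(0.1, 0.2), (0.4, 0.9), (0.5, 0.5), (0.7, 0.3), (0.8, 0.8)]
    grid = unit_square_grid(10)
    for c in (4.0, 5.0, 6.0):
        value = poisson_log_likelihood(lambda u: math.log(c), pts, grid)
        print(c, value)
```

For a constant intensity on a window of unit area the log-likelihood is n log c - c, which is maximized at c = n, so the printed value is largest for c = 5 with the five points above. In a boosting setting, the negative of this quantity would serve as the loss, with the log-intensity supplied by the tree ensemble evaluated at the covariates of each location.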
Association of tissue lineage and gene expression: conservatively and differentially expressed genes define common and special functions of tissues
Background: Embryogenesis is the process by which the embryo is formed, develops, and establishes developmental hierarchies of tissues. The recent advance in microarray technology made it possible to investigate the tissue specific patterns of gene expression and their relationship with tissue lineages. This study is focused on how tissue specific functions, tissue lineage, and cell differentiation are correlated, which is essential to understand embryonic development and organism complexity. Results: We performed individual gene and gene set based analysis on multiple tissue expression data, in association with the classic topology of mammalian fate maps of embryogenesis. For each sub-group of tissues on the fate map, conservatively, differentially and correlatively expressed genes or gene sets were identified. Tissue distance was found to correlate with gene expression divergence. Tissues of the ectoderm or mesoderm origins from the same segments on the fate map shared more similar expression pattern than those from different origins. Conservatively expressed genes or gene sets define common functions in a tissue group and are related to tissue specific diseases, which is supported by results from Gene Ontology and KEGG pathway analysis. Gene expression divergence is larger in certain human tissues than in the mouse homologous tissues. Conclusion: The results from tissue lineage and gene expression analysis indicate that common function features of neighbor tissue groups were defined by the conservatively expressed genes and were related to tissue specific diseases, and differentially expressed genes contribute to the functional divergence of tissues. The difference of gene expression divergence in human and mouse homologous tissues reflected the organism complexity, i.e. distinct neural development levels and different body sizes.
Modeling Social Media User Content Generation Using Interpretable Point Process Models
In this article, we study the activity patterns of modern social media users
on platforms such as Twitter and Facebook. To characterize the complex patterns
we observe in users' interactions with social media, we describe a new class of
point process models. The components in the model have straightforward
interpretations and can thus provide meaningful insights into user activity
patterns. A composite likelihood approach and a composite EM estimation
procedure are developed to overcome the challenges that arise in parameter
estimation. Using the proposed method, we analyze Donald Trump's Twitter data
and study if and how his tweeting behavior evolved before, during and after the
presidential campaign. Additionally, we analyze a large-scale social media data set
from Sina Weibo and identify interesting groups of users with distinct
behaviors; in this analysis, we also discuss the effect of social ties on a
user's online content generating behavior.