PersonRank: Detecting Important People in Images
In images of events such as presentations, basketball games, or speeches, some
individuals are more important or draw more attention than others. However, it is
challenging to find the important people among all individuals in an image
directly from their spatial or appearance information, because pose, action,
appearance, and occasion vary widely. We overcome this difficulty by constructing a multiple
Hyper-Interaction Graph that treats each individual in an image as a node, and by
inferring the most active node from interactions estimated from various
types of cues. We model pairwise interactions between persons as the edge
message communicated between nodes, resulting in a bidirectional
pairwise-interaction graph. To enrich the person-person interaction estimation,
we further introduce a unidirectional hyper-interaction graph that models the
consensus of interaction between a focal person and any person in a local
region around. Finally, we modify the PageRank algorithm to infer the
activeness of persons on the multiple Hybrid-Interaction Graph (HIG), the union
of the pairwise-interaction and hyper-interaction graphs, and we call our
algorithm PersonRank. To provide public datasets for
evaluation, we contribute a new dataset called the Multi-scene Important
People Image Dataset and gather an NCAA Basketball Image Dataset from sports
game sequences. We demonstrate that the proposed PersonRank outperforms
related methods clearly and substantially.
Comment: 8 pages, conferenc
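The ranking step can be illustrated with a minimal sketch. It runs the standard PageRank power iteration, not the authors' modified algorithm, on a weighted directed graph. The function name `rank_persons`, the damping value, and the toy weight matrix are assumptions made for illustration only.

```python
import numpy as np

def rank_persons(weights, damping=0.85, iters=100, tol=1e-10):
    """Standard PageRank power iteration on a weighted directed graph.

    weights[i, j] is a hypothetical interaction strength from person i
    to person j (an assumption for this sketch, not the paper's estimate).
    """
    w = np.asarray(weights, dtype=float)
    n = w.shape[0]
    row_sums = w.sum(axis=1, keepdims=True)
    has_out = row_sums > 0
    # Normalise each row; rows with no outgoing weight spread uniformly.
    transition = np.where(has_out, w / np.where(has_out, row_sums, 1.0), 1.0 / n)
    scores = np.full(n, 1.0 / n)
    for _ in range(iters):
        updated = (1 - damping) / n + damping * scores @ transition
        converged = np.abs(updated - scores).sum() < tol
        scores = updated
        if converged:
            break
    return scores

# Toy graph: persons 0 and 1 both direct their interaction at person 2.
toy = np.array([[0.0, 0.0, 1.0],
                [0.0, 0.0, 1.0],
                [0.5, 0.5, 0.0]])
scores = rank_persons(toy)
```

In this toy graph the node that receives the most interaction weight gets the highest score, which is the intuition behind treating the most active node as the important person.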
Joint Uncertainty Decoding with Unscented Transform for Noise Robust Subspace Gaussian Mixture Models
Common noise compensation techniques use a vector Taylor series (VTS) to approximate the mismatch function. Recent work shows that the approximation accuracy may be improved by sampling. One such sampling technique is the unscented transform (UT), which draws samples deterministically from the clean speech and noise models to derive the noise-corrupted speech parameters. This paper applies UT to noise compensation of the subspace Gaussian mixture model (SGMM). Since UT requires a relatively small number of samples for accurate estimation, it has a significantly lower computational cost than random sampling techniques. However, the number of surface Gaussians in an SGMM is typically very large, which makes direct application of UT to compensate individual Gaussian components computationally impractical. In this paper, we avoid the computational burden by employing UT in the framework of joint uncertainty decoding (JUD), which groups all the Gaussian components into a small number of classes that share compensation parameters by class. We evaluate the JUD-UT technique for an SGMM system using the Aurora 4 corpus. Experimental results indicate that UT can lead to increased accuracy compared to the VTS approximation if the JUD phase factor is untuned, and to similar accuracy if the phase factor is tuned empirically.
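As a rough illustration of the deterministic sampling described above, the following sketch implements a basic unscented transform with 2d + 1 sigma points. The function name, the `kappa` scaling parameter, and the log-domain mismatch function are assumptions for illustration; the paper's actual mismatch function and JUD grouping are not reproduced here.

```python
import numpy as np

def unscented_transform(mean, cov, func, kappa=1.0):
    """Propagate a Gaussian (mean, cov) through func using 2d + 1 sigma points."""
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    d = mean.size
    # Columns of the Cholesky factor of (d + kappa) * cov are the offsets.
    offsets = np.linalg.cholesky((d + kappa) * cov).T
    points = np.vstack([mean, mean + offsets, mean - offsets])
    weights = np.full(2 * d + 1, 1.0 / (2 * (d + kappa)))
    weights[0] = kappa / (d + kappa)
    mapped = np.array([func(p) for p in points])
    out_mean = weights @ mapped
    centered = mapped - out_mean
    out_cov = (weights[:, None] * centered).T @ centered
    return out_mean, out_cov

# Hypothetical log-domain mismatch y = log(exp(x) + exp(n)), with z = [x, n].
mismatch = lambda z: np.logaddexp(z[:1], z[1:])
y_mean, y_cov = unscented_transform([2.0, 0.0], np.diag([0.5, 0.2]), mismatch)
```

A stacked two-dimensional input needs only five function evaluations here, which is why the cost stays low per Gaussian but grows quickly when applied to every component of a large model.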
Learning Dynamic Feature Selection for Fast Sequential Prediction
We present paired learning and inference algorithms for significantly
reducing computation and increasing speed of the vector dot products in the
classifiers that are at the heart of many NLP components. This is accomplished
by partitioning the features into a sequence of templates which are ordered
such that high confidence can often be reached using only a small fraction of
all features. Parameter estimation is arranged to maximize accuracy and early
confidence in this sequence. Our approach is simpler and better suited to NLP
than other related cascade methods. We present experiments in left-to-right
part-of-speech tagging, named entity recognition, and transition-based
dependency parsing. On typical benchmark datasets we preserve POS
tagging accuracy above 97% and parsing LAS above 88.5%, both with over a
five-fold reduction in run-time, and NER F1 above 88 with more than a 2x increase
in speed.
Comment: Appears in The 53rd Annual Meeting of the Association for
Computational Linguistics, Beijing, China, July 201
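The inference side of this idea can be sketched as follows. Template scores are added one at a time and prediction stops once the gap between the two best classes reaches a margin. The function name, the margin-based stopping rule as written, and the toy weights are assumptions for illustration; the training procedure that orders the templates is not shown.

```python
import numpy as np

def predict_with_early_stop(template_features, template_weights, margin=1.0):
    """Accumulate per-template class scores and stop once confident.

    template_features: list of 1-D feature vectors, one per template, in order.
    template_weights: list of (num_classes, dim) matrices aligned with them.
    Returns the predicted class index and the number of templates used.
    Assumes at least two classes.
    """
    scores = np.zeros(template_weights[0].shape[0])
    used = 0
    for feats, weights in zip(template_features, template_weights):
        scores += weights @ feats  # dot products for this template only
        used += 1
        second, best = np.sort(scores)[-2:]
        if best - second >= margin:
            break  # confident enough; skip the remaining templates
    return int(np.argmax(scores)), used

# Toy example with two classes and two single-feature templates.
feats = [np.array([1.0]), np.array([1.0])]
weights = [np.array([[3.0], [0.0]]), np.array([[0.0], [10.0]])]
early = predict_with_early_stop(feats, weights, margin=1.0)
full = predict_with_early_stop(feats, weights, margin=5.0)
```

A larger margin trades speed for caution: in the toy example the stricter margin forces the second template to be evaluated, and the prediction changes.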
Scientific Information Extraction with Semi-supervised Neural Tagging
This paper addresses the problem of extracting keyphrases from scientific
articles and categorizing them as corresponding to a task, process, or
material. We cast the problem as sequence tagging and introduce semi-supervised
methods to a neural tagging model, which builds on recent advances in named
entity recognition. Since annotated training data is scarce in this domain, we
introduce a graph-based semi-supervised algorithm together with a data
selection scheme to leverage unannotated articles. Both inductive and
transductive semi-supervised learning strategies exceed state-of-the-art
information extraction performance on the 2017 SemEval Task 10 ScienceIE task.
Comment: accepted by EMNLP 201