Towards Robust Curve Text Detection with Conditional Spatial Expansion
It is challenging to detect curve texts due to their irregular shapes and
varying sizes. In this paper, we first investigate the deficiency of the
existing curve detection methods and then propose a novel Conditional Spatial
Expansion (CSE) mechanism to improve the performance of curve text detection.
Instead of regarding the curve text detection as a polygon regression or a
segmentation problem, we treat it as a region expansion process. Our CSE starts
with a seed arbitrarily initialized within a text region and progressively
merges neighboring regions based on local features extracted by a CNN and
contextual information of merged regions. The CSE is highly parameterized and
can be seamlessly integrated into existing object detection frameworks.
Enhanced by the data-dependent CSE mechanism, our curve text detection system
provides robust instance-level text region extraction with minimal
post-processing. Our analysis experiments show that CSE can handle texts
with various shapes, sizes, and orientations, and can effectively suppress the
false-positives coming from text-like textures or unexpected texts included in
the same RoI. Compared with existing curve text detection algorithms, our
method is more robust and has a simpler processing flow. It also sets a
new state of the art on curve text benchmarks, with an F-score of up to
78.4.
Comment: This paper has been accepted by IEEE International Conference on
Computer Vision and Pattern Recognition (CVPR 2019)
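The region-expansion idea in the abstract above (start from a seed inside the text region, then progressively merge neighboring regions) can be illustrated with a generic region-growing sketch. This is not the paper's CSE: the learned CNN features and the context of merged regions are replaced here by a fixed per-cell score grid and a hypothetical `threshold`, and the function name `expand_region` is an assumption made for illustration only.

```python
from collections import deque


def expand_region(score, seed, threshold=0.5):
    """Grow a region from `seed` over a 2D grid of per-cell text scores.

    Stand-in for a learned merge decision: a 4-connected neighbor is merged
    when its score is at least `threshold` (hypothetical rule, not CSE).
    """
    rows, cols = len(score), len(score[0])
    region = {seed}
    frontier = deque([seed])
    while frontier:
        r, c = frontier.popleft()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region and score[nr][nc] >= threshold:
                region.add((nr, nc))
                frontier.append((nr, nc))
    return region


# Toy score map: a curved high-score strip plus one isolated high-score cell.
grid = [
    [0.1, 0.9, 0.8, 0.1],
    [0.1, 0.1, 0.9, 0.1],
    [0.9, 0.1, 0.7, 0.1],
]
region = expand_region(grid, seed=(0, 1))
print(sorted(region))
```

In this toy grid the isolated high-score cell at (2, 0) is never merged because it is not connected to the seed, which loosely mirrors how an expansion process can ignore text-like responses elsewhere in the same RoI.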
MLPerf Inference Benchmark
Machine-learning (ML) hardware and software system demand is burgeoning.
Driven by ML applications, the number of different ML inference systems has
exploded. Over 100 organizations are building ML inference chips, and the
systems that incorporate existing models span at least three orders of
magnitude in power consumption and five orders of magnitude in performance;
they range from embedded devices to data-center solutions. Fueling the hardware
are a dozen or more software frameworks and libraries. The myriad combinations
of ML hardware and ML software make assessing ML-system performance in an
architecture-neutral, representative, and reproducible manner challenging.
There is a clear need for industry-wide standard ML benchmarking and evaluation
criteria. MLPerf Inference answers that call. In this paper, we present our
benchmarking method for evaluating ML inference systems. Driven by more than 30
organizations as well as more than 200 ML engineers and practitioners, MLPerf
prescribes a set of rules and best practices to ensure comparability across
systems with wildly differing architectures. The first call for submissions
garnered more than 600 reproducible inference-performance measurements from 14
organizations, representing over 30 systems that showcase a wide range of
capabilities. The submissions attest to the benchmark's flexibility and
adaptability.
Comment: ISCA 202
Dynamic Graph Generation Network: Generating Relational Knowledge from Diagrams
In this work, we introduce a new algorithm for analyzing a diagram, which
contains visual and textual information in an abstract and integrated way.
Whereas diagrams contain richer information compared with individual
image-based or language-based data, proper solutions for automatically
understanding them have not been proposed due to their innate characteristics
of multi-modality and arbitrariness of layouts. To tackle this problem, we
propose a unified diagram-parsing network for generating knowledge from
diagrams based on an object detector and a recurrent neural network designed
for a graphical structure. Specifically, we propose a dynamic graph-generation
network that is based on dynamic memory and graph theory. We explore the
dynamics of information in a diagram with activation of gates in gated
recurrent unit (GRU) cells. On publicly available diagram datasets, our model
demonstrates a state-of-the-art result that outperforms other baselines.
Moreover, further experiments on question answering show the potential of the
proposed method for various applications.
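For readers unfamiliar with the gates mentioned in the abstract above, here is a minimal sketch of a single GRU update with a scalar hidden state. It shows only the standard update and reset gates, not the paper's dynamic graph-generation network. The weight names in `p` are hypothetical, and the blending convention `(1 - z) * h_prev + z * h_cand` is one of two common conventions (some formulations swap the roles of `z` and `1 - z`).

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(x, h_prev, p):
    """One scalar GRU update; `p` is a dict of hypothetical weights."""
    z = sigmoid(p["wz"] * x + p["uz"] * h_prev + p["bz"])  # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h_prev + p["br"])  # reset gate
    # Candidate state: the reset gate scales how much of h_prev is used.
    h_cand = math.tanh(p["wh"] * x + p["uh"] * (r * h_prev) + p["bh"])
    # Update gate blends the previous state with the candidate state.
    h_new = (1.0 - z) * h_prev + z * h_cand
    return h_new, {"update": z, "reset": r}


# With all weights zero, both gates sit at 0.5 and the candidate is 0.
params = dict.fromkeys(
    ["wz", "uz", "bz", "wr", "ur", "br", "wh", "uh", "bh"], 0.0
)
h, gates = gru_step(x=1.0, h_prev=1.0, p=params)
print(h, gates)
```

Inspecting the returned gate values over a sequence of inputs is one simple way to see how much of the previous state is kept or overwritten at each step.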