EnsNet: Ensconce Text in the Wild
A new method is proposed for removing text from natural images. The challenge
is to first accurately localize text on the stroke-level and then replace it
with a visually plausible background. Unlike previous methods that require
image patches to erase scene text, our method, namely ensconce network
(EnsNet), can operate end-to-end on a single image without any prior knowledge.
The overall structure is an end-to-end trainable FCN-ResNet-18 network with a
conditional generative adversarial network (cGAN). The feature of the former is
first enhanced by a novel lateral connection structure and then refined by four
carefully designed losses: multiscale regression loss and content loss, which
capture the global discrepancy of different level features; texture loss and
total variation loss, which primarily target filling the text region and
preserving the reality of the background. The latter is a novel local-sensitive
GAN, which attentively assesses the local consistency of the text erased
regions. Both qualitative and quantitative sensitivity experiments on synthetic
images and the ICDAR 2013 dataset demonstrate that each component of the EnsNet
is essential to achieve a good performance. Moreover, our EnsNet can
significantly outperform previous state-of-the-art methods in terms of all
metrics. In addition, a qualitative experiment conducted on the SMBNet dataset
further demonstrates that the proposed method can also perform well on general
object (such as pedestrians) removal tasks. EnsNet is extremely fast, running
at 333 fps on an i5-8600 CPU device.
Comment: 8 pages, 8 figures, 2 tables, accepted to appear in AAAI 201
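As a rough illustration of the total variation loss named in the abstract above, the sketch below computes a generic anisotropic total variation for a small grayscale image. The abstract does not give the paper's exact formulation, so the function name, the absolute-difference form, and the toy images are assumptions made for illustration only.

```python
def total_variation(image):
    """Anisotropic total variation of a 2-D grayscale image (list of rows).

    Sums the absolute differences between each pixel and its lower and
    right-hand neighbours. Generic textbook form, not taken from the paper.
    """
    rows, cols = len(image), len(image[0])
    tv = 0.0
    for r in range(rows):
        for c in range(cols):
            if r + 1 < rows:  # vertical neighbour
                tv += abs(image[r + 1][c] - image[r][c])
            if c + 1 < cols:  # horizontal neighbour
                tv += abs(image[r][c + 1] - image[r][c])
    return tv


# Hypothetical toy inputs: a perfectly smooth patch and a noisy checkerboard.
flat = [[0.5] * 4 for _ in range(4)]
checker = [[float((r + c) % 2) for c in range(4)] for r in range(4)]

print(total_variation(flat), total_variation(checker))
```

A smooth fill scores low and a noisy one scores high, which is why a penalty of this kind is commonly used to discourage artifacts in inpainted regions.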
Towards Robust Visual Information Extraction in Real World: New Dataset and Novel Solution
Visual information extraction (VIE) has attracted considerable attention
recently owing to its various advanced applications such as document
understanding, automatic marking and intelligent education. Most existing works
decouple this problem into several independent sub-tasks of text spotting
(text detection and recognition) and information extraction, which completely
ignores the high correlation among them during optimization. In this paper, we
propose a robust visual information extraction system (VIES) towards real-world
scenarios, which is a unified end-to-end trainable framework for simultaneous
text detection, recognition and information extraction by taking a single
document image as input and outputting the structured information.
Specifically, the information extraction branch collects abundant visual and
semantic representations from text spotting for multimodal feature fusion and
conversely, provides higher-level semantic clues to contribute to the
optimization of text spotting. Moreover, regarding the shortage of public
benchmarks, we construct a fully-annotated dataset called EPHOIE
(https://github.com/HCIILAB/EPHOIE), which is the first Chinese benchmark for
both text spotting and visual information extraction. EPHOIE consists of 1,494
images of examination paper heads with complex layouts and backgrounds, including
a total of 15,771 Chinese handwritten or printed text instances. Compared with
the state-of-the-art methods, our VIES shows significantly superior performance
on the EPHOIE dataset and achieves a 9.01% F-score gain on the widely used
SROIE dataset under the end-to-end scenario.
Comment: 8 pages, 5 figures, to be published in AAAI 202
Study on Overburden Fracture and Structural Distribution Evolution Characteristics of Coal Seam Mining in Deep Large Mining Height Working Face
Coal mining has gradually entered the deep mining era, and large-height mining is an important way to mine thick coal seams at depth. The high coal wall inevitably makes the distribution of the overburden structure at the coal mining face more complicated, and the large buried depth also causes more intense mine pressure. Studying the distribution and evolution of the overburden structure and stress at the mining site can provide theoretical guidance for safe mining. In this work, a physical similarity modeling test was carried out based on the physical–mechanical parameters of overburden rock and similarity theory, taking the mining of a deep, large-height working face in Pingdingshan Coal Mine as an example. The results show that the deformation and breakage of overburden rock in deep, large-height workings persist throughout mining rather than occurring only over a short period. The breakage form of the overburden can be categorized into two types based on the deformation characteristics: (I) non-separation-induced type, and (II) separation-induced type. Among these, the breakage induced by separation can be divided into two categories: (i) dominated by self-weight stress, and (ii) affected by shear cracks. This work also summarizes the form of the overburden structure and the structural morphology of the stope. The overburden structure shows a “combined cantilever beam structure-articulated rock-slab structure-non-articulated rock-slab structure”. Among these, the periodic breakage of the upper cantilever beam evolved into articulated and non-articulated rock-slab structures in the lower part, which weakened the supporting effect of the lower gangue and further aggravated the breakage of the upper overburden rock. The shape of the main structure of the stope depends mainly on the fracture line from the advancing coal wall to the upper overburden: it changes from a rectangular shape before collapse, to a trapezoidal shape at the initial stage of collapse, to a multi-step trapezoidal shape after the main roof collapses.
Modeling and Optimization of Impedance Balancing Technique for Common Mode Noise Attenuation in DC-DC Boost Converters
As an effective means of suppressing electromagnetic interference (EMI) noise, the impedance balancing technique has been adopted in the literature. By suppressing the noise source, this technique can theoretically reduce the noise to zero. Nevertheless, its effect is limited in practice, and it also suffers from noise spikes. Therefore, this paper introduces an accurate frequency modeling method to investigate the degree of noise-source attenuation and to redesign the impedance selection accordingly in order to improve the noise reduction capability. Based on a conventional boost converter, the common mode (CM) noise model was first built by identifying the noise source and propagation paths. The noise source model was then extracted by capturing the switching voltage waveform in the time domain and calculating its Fourier series in the frequency domain. After that, the conventional boost converter was modified with the known impedance balancing techniques. This balanced circuit was analyzed with the introduced modeling method, and the equivalent noise source was precisely estimated by combining the noise spectra and impedance information. Furthermore, two optimized schemes with redesigned impedances were proposed to deal with the resonance problem. A hardware circuit was designed and built to experimentally validate the proposed concepts. The experimental results demonstrate the feasibility and effectiveness of the proposed schemes.
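The noise-source extraction step described above (time-domain switching waveform to Fourier series) can be sketched numerically as follows. This is only an illustration under stated assumptions: the idealized rectangular waveform, the 100 V amplitude, the 50% duty cycle, and the sample count are hypothetical and do not come from the paper, which works from a captured waveform.

```python
import cmath
import math


def harmonic_amplitudes(samples, n_harmonics):
    """Single-sided amplitudes of harmonics 1..n_harmonics.

    `samples` must cover exactly one switching period, uniformly sampled.
    Each complex Fourier coefficient is approximated by a direct DFT sum.
    """
    n = len(samples)
    amps = []
    for k in range(1, n_harmonics + 1):
        coeff = sum(
            v * cmath.exp(-2j * math.pi * k * i / n)
            for i, v in enumerate(samples)
        ) / n
        amps.append(2 * abs(coeff))
    return amps


# Hypothetical idealized switching-node voltage (assumed values).
V_SWITCH = 100.0   # volts, assumed amplitude
DUTY = 0.5         # assumed duty cycle
N_SAMPLES = 1000   # samples per switching period

waveform = [V_SWITCH if i < DUTY * N_SAMPLES else 0.0 for i in range(N_SAMPLES)]
amps = harmonic_amplitudes(waveform, 5)

for order, amplitude in enumerate(amps, start=1):
    print(order, round(amplitude, 3))
```

For an ideal rectangular wave the n-th harmonic amplitude is (2V/(nπ))·|sin(nπD)|, so with a 50% duty cycle the even harmonics vanish. A measured waveform with finite rise and fall times would roll off faster at high frequencies.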