65 research outputs found

    STEFANN: Scene Text Editor using Font Adaptive Neural Network

    Full text link
    Textual information in a captured scene plays an important role in scene interpretation and decision making. Though there exist methods that can successfully detect and interpret complex text regions present in a scene, to the best of our knowledge, there is no significant prior work that aims to modify the textual information in an image. The ability to edit text directly on images has several advantages including error correction, text restoration and image reusability. In this paper, we propose a method to modify text in an image at character-level. We approach the problem in two stages. At first, the unobserved character (target) is generated from an observed character (source) being modified. We propose two different neural network architectures - (a) FANnet to achieve structural consistency with source font and (b) Colornet to preserve source color. Next, we replace the source character with the generated character maintaining both geometric and visual consistency with neighboring characters. Our method works as a unified platform for modifying text in images. We present the effectiveness of our method on COCO-Text and ICDAR datasets both qualitatively and quantitatively.Comment: Accepted in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 202

    Towards Closing the Programmability-Efficiency Gap using Software-Defined Hardware

    Full text link
    The past decade has seen the breakdown of two important trends in the computing industry: Moore’s law, an observation that the number of transistors in a chip roughly doubles every eighteen months, and Dennard scaling, that enabled the use of these transistors within a constant power budget. This has caused a surge in domain-specific accelerators, i.e. specialized hardware that deliver significantly better energy efficiency than general-purpose processors, such as CPUs. While the performance and efficiency of such accelerators are highly desirable, the fast pace of algorithmic innovation and non-recurring engineering costs have deterred their widespread use, since they are only programmable across a narrow set of applications. This has engendered a programmability-efficiency gap across contemporary platforms. A practical solution that can close this gap is thus lucrative and is likely to engender broad impact in both academic research and the industry. This dissertation proposes such a solution with a reconfigurable Software-Defined Hardware (SDH) system that morphs parts of the hardware on-the-fly to tailor to the requirements of each application phase. This system is designed to deliver near-accelerator-level efficiency across a broad set of applications, while retaining CPU-like programmability. The dissertation first presents a fixed-function solution to accelerate sparse matrix multiplication, which forms the basis of many applications in graph analytics and scientific computing. The solution consists of a tiled hardware architecture, co-designed with the outer product algorithm for Sparse Matrix-Matrix multiplication (SpMM), that uses on-chip memory reconfiguration to accelerate each phase of the algorithm. A proof-of-concept is then presented in the form of a prototyped 40 nm Complimentary Metal-Oxide Semiconductor (CMOS) chip that demonstrates energy efficiency and performance per die area improvements of 12.6x and 17.1x over a high-end CPU, and serves as a stepping stone towards a full SDH system. The next piece of the dissertation enhances the proposed hardware with reconfigurability of the dataflow and resource sharing modes, in order to extend acceleration support to a set of common parallelizable workloads. This reconfigurability lends the system the ability to cater to discrete data access and compute patterns, such as workloads with extensive data sharing and reuse, workloads with limited reuse and streaming access patterns, among others. Moreover, this system incorporates commercial cores and a prototyped software stack for CPU-level programmability. The proposed system is evaluated on a diverse set of compute-bound and memory-bound kernels that compose applications in the domains of graph analytics, machine learning, image and language processing. The evaluation shows average performance and energy-efficiency gains of 5.0x and 18.4x over the CPU. The final part of the dissertation proposes a runtime control framework that uses low-cost monitoring of hardware performance counters to predict the next best configuration and reconfigure the hardware, upon detecting a change in phase or nature of data within the application. In comparison to prior work, this contribution targets multicore CGRAs, uses low-overhead decision tree based predictive models, and incorporates reconfiguration cost-awareness into its policies. Compared to the best-average static (non-reconfiguring) configuration, the dynamically reconfigurable system achieves a 1.6x improvement in performance-per-Watt in the Energy-Efficient mode of operation, or the same performance with 23% lower energy in the Power-Performance mode, for SpMM across a suite of real-world inputs. The proposed reconfiguration mechanism itself outperforms the state-of-the-art approach for dynamic runtime control by up to 2.9x in terms of energy-efficiency.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/169859/1/subh_1.pd

    A CNN Based Framework for Unistroke Numeral Recognition in Air-Writing

    Full text link
    Air-writing refers to virtually writing linguistic characters through hand gestures in three-dimensional space with six degrees of freedom. This paper proposes a generic video camera-aided convolutional neural network (CNN) based air-writing framework. Gestures are performed using a marker of fixed color in front of a generic video camera, followed by color-based segmentation to identify the marker and track the trajectory of the marker tip. A pre-trained CNN is then used to classify the gesture. The recognition accuracy is further improved using transfer learning with the newly acquired data. The performance of the system varies significantly on the illumination condition due to color-based segmentation. In a less fluctuating illumination condition, the system is able to recognize isolated unistroke numerals of multiple languages. The proposed framework has achieved 97.7%, 95.4% and 93.7% recognition rates in person independent evaluations on English, Bengali and Devanagari numerals, respectively.Comment: Accepted in The International Conference on Frontiers of Handwriting Recognition (ICFHR) 201

    Effects of Degradations on Deep Neural Network Architectures

    Full text link
    Recently, image classification methods based on capsules (groups of neurons) and a novel dynamic routing protocol are proposed. The methods show promising performances than the state-of-the-art CNN-based models in some of the existing datasets. However, the behavior of capsule-based models and CNN-based models are largely unknown in presence of noise. So it is important to study the performance of these models under various noises. In this paper, we demonstrate the effect of image degradations on deep neural network architectures for image classification task. We select six widely used CNN architectures to analyse their performances for image classification task on datasets of various distortions. Our work has three main contributions: 1) we observe the effects of degradations on different CNN models; 2) accordingly, we propose a network setup that can enhance the robustness of any CNN architecture for certain degradations, and 3) we propose a new capsule network that achieves high recognition accuracy. To the best of our knowledge, this is the first study on the performance of CapsuleNet (CapsNet) and other state-of-the-art CNN architectures under different types of image degradations. Also, our datasets and source code are available publicly to the researchers.Comment: Journa

    Performance Analysis of Three Routing Protocols in MANET Using the NS-2 and ANOVA Test with Varying Speed of Nodes

    Get PDF
    In this chapter, we analyzed ad hoc on demand distance vector (AODV), dynamic source routing (DSR), and destination-sequenced distance vector (DSDV) routing protocols using different parameters of QoS metrics such as packet delivery ratio (PDR), normalize routing overhead, throughput, and jitter. The aim of this chapter is to determine a difference between routing protocol performance when operating in a large-area MANET with high-speed mobile nodes. After the simulations, we use AWK to analyze the data and then Xgraph to plot the performance metric. After that we use one-way ANOVA tools to confirm the correctness of the result. We use NS-2 for the simulation work. The comparison analysis of these protocols will be carrying out and in the last, we conclude that which routing protocol is the best one for mobile ad hoc networks

    On-spot biosensing device for organophosphate pesticide residue detection in fruits and vegetables

    Get PDF
    Acetylcholinesterase (AChE), a widely used enzyme for inhibition-based biosensors in pesticide residues detection, lags due to multiple-step operation, time-consuming incubation and reactivation/regeneration steps. Herein, this endeavour reports the development of Organophosphate Hydrolase (OPH), which has functional superiority over the AChE and explored in on-spot biosensing device for organophosphate pesticide residue detection in fruits and vegetables. The organophosphate degrading enzyme OPH is expressed from the ‘opd’ gene through biotechnological tools. The OPH exhibited its best activity at pH 8.0 and subsequently thermal inactivation over 37 °C. The activity of the purified OPH enzyme was found 2.75 U mL−1 at λmax 410 nm. Furthermore, the developed OPH is integrated into 96 well plate format with our previously reported UIISScan 1.1, an advanced imaging array technology based field-portable high-throughput sensory system. The developed biosensor revealed a linear range from 100 ng mL−1 to 0.1 ng mL−1 for detection of organophosphate pesticide residues with a negative slope i.e. y = 235.678x (ng mL−1) – 62.8725 with R2 = 0.99991 and n = 23. Moreover, the applicability of the developed biosensor was tested for market available fruits and vegetables. This is the first-ever reported OPH mediated on-spot biosensing device for pesticide residue detection in fruits and vegetables to the best of our knowledge. © 2021 The Author(s
    corecore