
    PBFormer: Capturing Complex Scene Text Shape with Polynomial Band Transformer

    We present PBFormer, an efficient yet powerful scene text detector that unifies a transformer with a novel text shape representation, the Polynomial Band (PB). The representation uses four polynomial curves to fit a text instance's top, bottom, left, and right sides, so it can capture complexly shaped text by varying the polynomial coefficients. PB has appealing properties compared with conventional representations: 1) it can model different curvatures with a fixed number of parameters, whereas polygon-points-based methods need a varying number of points; 2) it can distinguish adjacent or overlapping texts, since their curve coefficients differ clearly, whereas segmentation-based or points-based methods suffer from adhesive spatial positions. PBFormer combines PB with a transformer, which can directly generate smooth text contours sampled from the predicted curves without interpolation. A parameter-free cross-scale pixel attention (CPA) module highlights the feature map at the most suitable scale while suppressing the others. This simple operation helps detect small-scale texts and is compatible with the one-stage DETR framework, which requires no NMS post-processing. Furthermore, PBFormer is trained with a shape-contained loss, which not only enforces piecewise alignment between the ground-truth and predicted curves but also keeps the curves' positions and shapes consistent with each other. Without bells and whistles such as text pre-training, our method outperforms previous state-of-the-art text detectors on arbitrary-shaped text datasets.
    Comment: 9 pages, 8 figures, accepted by ACM MM 202
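    The Polynomial Band idea can be illustrated with a minimal sketch (illustrative only, not the authors' code; the function names and the parametric-curve convention are assumptions): each side of a text instance is a low-degree parametric polynomial with a fixed coefficient count, and a closed contour is obtained by directly sampling the four curves, with no interpolation step.

```python
import numpy as np

def poly_curve(coeffs_x, coeffs_y, n):
    """Sample n points from a parametric polynomial curve (x(t), y(t)), t in [0, 1]."""
    t = np.linspace(0.0, 1.0, n)
    return np.stack([np.polyval(coeffs_x, t), np.polyval(coeffs_y, t)], axis=1)

def sample_pb_contour(top, right, bottom, left, n=20):
    """Build a closed contour from a Polynomial Band.

    Each side is a (coeffs_x, coeffs_y) pair of polynomial coefficients
    (highest degree first, as np.polyval expects). The same coefficient
    count models straight or strongly curved text, which is the point of
    the fixed-size representation. The contour walks top -> right ->
    reversed bottom -> reversed left to stay consistently oriented.
    """
    return np.concatenate([
        poly_curve(*top, n),
        poly_curve(*right, n),
        poly_curve(*bottom, n)[::-1],
        poly_curve(*left, n)[::-1],
    ])

# Degenerate example: an axis-aligned unit square (all sides straight).
square = sample_pb_contour(
    top=([1.0, 0.0], [0.0]),       # x(t) = t, y(t) = 0
    right=([1.0], [1.0, 0.0]),     # x(t) = 1, y(t) = t
    bottom=([1.0, 0.0], [1.0]),    # x(t) = t, y(t) = 1
    left=([0.0], [1.0, 0.0]),      # x(t) = 0, y(t) = t
    n=5,
)
```

    Curved text would simply use higher-order coefficients for the same four sides, e.g. a quadratic `y(t)` term on the top and bottom curves, without changing the parameter count or the sampling code.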

    The rice ERF transcription factor OsERF922 negatively regulates resistance to Magnaporthe oryzae and salt tolerance

    Rice OsERF922, encoding an APETALA2/ethylene response factor (AP2/ERF) type transcription factor, is rapidly and strongly induced by abscisic acid (ABA) and salt treatments, as well as by both virulent and avirulent pathovars of Magnaporthe oryzae, the causal agent of rice blast disease. OsERF922 is localized to the nucleus, binds specifically to the GCC box sequence, and acts as a transcriptional activator in plant cells. Knockdown of OsERF922 by means of RNAi enhanced resistance against M. oryzae. The elevated disease resistance of the RNAi plants was associated with increased expression of PR, PAL, and other genes encoding phytoalexin biosynthetic enzymes, even without M. oryzae infection. In contrast, OsERF922-overexpressing plants showed reduced expression of these defence-related genes and enhanced susceptibility to M. oryzae. In addition, the OsERF922-overexpressing lines exhibited decreased tolerance to salt stress with an increased Na+/K+ ratio in the shoots. ABA levels were increased in the overexpressing lines and decreased in the RNAi plants. Expression of the ABA biosynthesis-related genes 9-cis-epoxycarotenoid dioxygenase (NCED) 3 and 4 was upregulated in the OsERF922-overexpressing plants, and NCED4 was downregulated in the RNAi lines. These results suggest that OsERF922 is integrated into the cross-talk between biotic and abiotic stress-signalling networks, perhaps through modulation of ABA levels.

    Functional Source Separation for EEG-fMRI Fusion: Application to Steady-State Visual Evoked Potentials

    Neurorobotics is one of the most ambitious fields in robotics, driving integration of interdisciplinary data and knowledge. One of the most productive areas of interdisciplinary research in this area has been the implementation of biologically-inspired mechanisms in the development of autonomous systems. Specifically, enabling such systems to display adaptive behavior, such as learning from good and bad outcomes, has been achieved by quantifying and understanding the neural mechanisms of the brain networks mediating adaptive behaviors in humans and animals. For example, associative learning from aversive or dangerous outcomes is crucial for an autonomous system to avoid dangerous situations in the future. A body of neuroscience research has suggested that the neurocomputations in the human brain during associative learning involve re-shaping of sensory responses. The nature of these adaptive changes in sensory processing during learning, however, is not yet well enough understood to be readily implemented into on-board algorithms for robotics applications. Toward this overall goal, we recorded simultaneous electroencephalogram (EEG) and functional magnetic resonance imaging (fMRI) data, characterizing one candidate mechanism, i.e., large-scale brain oscillations. The present report examines the use of Functional Source Separation (FSS) as an optimization step in EEG-fMRI fusion that harnesses timing information to constrain the solutions to those that satisfy physiological assumptions. We applied this approach to the voxel-wise correlation of steady-state visual evoked potential (ssVEP) amplitude and blood oxygen level-dependent (BOLD) signal, across both time series. The results showed the benefit of FSS for the extraction of robust ssVEP signals during simultaneous EEG-fMRI recordings.
    Applied to data from a 3-phase aversive conditioning paradigm, the correlation maps across the three phases (habituation, acquisition, extinction) show converging results, notably major overlapping areas in both primary and extended visual cortical regions, including the calcarine sulcus, lingual cortex, and cuneus. In addition, during the acquisition phase, when aversive learning occurs, we observed additional correlations between ssVEP and BOLD in the anterior cingulate cortex (ACC) as well as the precuneus and superior temporal gyrus.
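    The voxel-wise correlation step described above can be sketched in a few lines (a minimal illustration, not the study's pipeline; it assumes the ssVEP amplitude envelope has already been extracted, e.g. via FSS, and resampled to the BOLD TR grid, and the function name is a placeholder):

```python
import numpy as np

def voxelwise_correlation(ssvep_amp, bold):
    """Pearson correlation between one ssVEP amplitude time course
    and every voxel's BOLD time series.

    ssvep_amp: (T,) ssVEP amplitude envelope on the TR grid
    bold:      (T, V) BOLD signal, one column per voxel
    returns:   (V,) correlation map, one value per voxel
    """
    x = ssvep_amp - ssvep_amp.mean()          # center the regressor
    y = bold - bold.mean(axis=0)              # center each voxel's series
    num = x @ y                               # covariance numerator per voxel
    den = np.linalg.norm(x) * np.linalg.norm(y, axis=0)
    return num / den
```

    Thresholding the resulting map (with an appropriate multiple-comparisons correction) would then yield regions, such as the visual cortical areas named above, whose BOLD fluctuations track the ssVEP amplitude.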

    Reality3DSketch: Rapid 3D Modeling of Objects from Single Freehand Sketches

    The emerging trend of AR/VR places great demands on 3D content. However, most existing software requires expertise and is difficult for novice users. In this paper, we aim to create sketch-based modeling tools for user-friendly 3D modeling. We introduce Reality3DSketch, a novel application offering an immersive 3D modeling experience, in which a user captures the surrounding scene with a monocular RGB camera and draws a single sketch of an object in the real-time reconstructed 3D scene. A 3D object is then generated and placed in the desired location, enabled by our novel neural network that takes a single sketch as input. The network predicts the pose of the drawing and turns the single sketch into a 3D model with view and structural awareness, addressing the challenges of sparse sketch input and view ambiguity. We conducted extensive experiments on synthetic and real-world datasets and achieved state-of-the-art (SOTA) results in both sketch view estimation and 3D modeling performance. According to our user study, our method of performing 3D modeling in a scene is more than 5x faster than conventional methods, and users are more satisfied with the generated 3D models than with the results of existing methods.
    Comment: IEEE Transactions on Multimedia