2,881 research outputs found

    LayoutDETR: Detection Transformer Is a Good Multimodal Layout Designer

    Graphic layout designs play an essential role in visual communication. Yet handcrafting layout designs is skill-demanding, time-consuming, and non-scalable to batch production. Generative models emerge to make design automation scalable but it remains non-trivial to produce designs that comply with designers' multimodal desires, i.e., constrained by background images and driven by foreground content. We propose LayoutDETR that inherits the high quality and realism from generative modeling, while reformulating content-aware requirements as a detection problem: we learn to detect in a background image the reasonable locations, scales, and spatial relations for multimodal foreground elements in a layout. Our solution sets a new state-of-the-art performance for layout generation on public benchmarks and on our newly-curated ad banner dataset. We integrate our solution into a graphical system that facilitates user studies, and show that users prefer our designs over baselines by significant margins. Our code, models, dataset, graphical system, and demos are available at https://github.com/salesforce/LayoutDETR

    An Enumeration of Graphical Designs

    Let $\Psi(t,k)$ denote the set of pairs $(v,\lambda)$ for which there exists a graphical $t$-$(v,k,\lambda)$ design. Most results on graphical designs have gone to show the finiteness of $\Psi(t,k)$ when $t$ and $k$ satisfy certain conditions. The exact determination of $\Psi(t,k)$ for specified $t$ and $k$ is a hard problem, and only $\Psi(2,3)$, $\Psi(2,4)$, $\Psi(3,4)$, $\Psi(4,5)$, and $\Psi(5,6)$ have been determined. In this paper, we determine completely the sets $\Psi(2,5)$ and $\Psi(3,5)$. As a result, we find more than 270000 inequivalent graphical designs, and more than 8000 new parameter sets for which there exists a graphical design. Prior to this, graphical designs were known for only 574 parameter sets. Comment: 16 pages

    FPGA-Based Processor Acceleration for Image Processing Applications

    FPGA-based embedded image processing systems offer considerable computing resources but present programming challenges when compared to software systems. The paper describes an approach based on an FPGA-based soft processor called Image Processing Processor (IPPro), which can operate at up to 337 MHz on a high-end Xilinx FPGA family, and gives details of the dataflow-based programming environment. The approach is demonstrated for a k-means clustering operation and a traffic sign recognition application, both of which have been prototyped on an Avnet Zedboard that has a Xilinx Zynq-7000 system-on-chip (SoC). A number of parallel dataflow mapping options were explored, giving a speed-up of 8 times for the k-means clustering using 16 IPPro cores, and a speed-up of 9.6 times for the morphology filter operation of the traffic sign recognition using 16 IPPro cores, compared to their equivalent ARM-based software implementations. We show that for k-means clustering, the 16 IPPro cores implementation is 57, 28 and 1.7 times more power efficient (fps/W) than the ARM Cortex-A7 CPU, NVIDIA GeForce GTX980 GPU and ARM Mali-T628 embedded GPU, respectively.
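
    For readers unfamiliar with the k-means operation benchmarked above, the following is a minimal software sketch of the standard Lloyd iteration (assign each point to its nearest centroid, then recompute centroids). It is a plain reference version for illustration only, not the IPPro dataflow implementation described in the abstract; the function names `kmeans` and `nearest`, the seeded random initialisation, and the sample pixel values are all assumptions.

```python
import random


def nearest(point, centroids):
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(point, centroids[i])),
    )


def kmeans(points, k, iterations=20, seed=0):
    """Lloyd's algorithm on a list of equal-length numeric tuples.

    Returns (centroids, labels). Initial centroids are drawn at random
    from the input points (an assumption, other schemes exist).
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        # Assignment step: group points by nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its members;
        # an empty cluster keeps its previous centroid.
        updated = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if updated == centroids:  # converged
            break
        centroids = updated
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Hypothetical RGB pixels: three dark, three bright.
pixels = [(0, 0, 0), (10, 10, 10), (5, 5, 5),
          (250, 250, 250), (240, 240, 240), (255, 255, 255)]
centroids, labels = kmeans(pixels, k=2)
```

    The assignment step is independent per pixel, which is the kind of data parallelism that a multi-core mapping can exploit.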

    Improving perception accuracy in bar charts with internal contrast and framing enhancements

    Bar charts are among the most commonly used visualization graphs. Their main goal is to communicate quantities that can be visually compared. Since they are easy to produce and interpret, they are found in any situation where quantitative data needs to be conveyed (websites, newspapers, etc.). However, depending on the layout, the perceived values can vary substantially. For instance, previous research has shown that the positioning of bars (e.g. stacked vs separate) may influence the accuracy in bar ratio length estimation. Other works have studied the effects of embellishments on the perception of encoded quantities. However, to the best of the authors’ knowledge, the effect of perceptual elements used to reinforce the quantity depicted within the bars, such as contrast and inner lines, has not been studied in depth. In this research we present a study that analyzes the effect of several internal contrast and framing enhancements with respect to the use of basic solid bars. Our results show that the addition of minimal visual elements that are easy to implement with current technology can help users to better recognize the amounts depicted by the bar charts. Peer Reviewed. Postprint (author's final draft)

    Learning Material-Aware Local Descriptors for 3D Shapes

    Material understanding is critical for design, geometric modeling, and analysis of functional objects. We enable material-aware 3D shape analysis by employing a projective convolutional neural network architecture to learn material-aware descriptors from view-based representations of 3D points for point-wise material classification or material-aware retrieval. Unfortunately, only a small fraction of shapes in 3D repositories are labeled with physical materials, posing a challenge for learning methods. To address this challenge, we crowdsource a dataset of 3080 3D shapes with part-wise material labels. We focus on furniture models which exhibit interesting structure and material variability. In addition, we also contribute a high-quality expert-labeled benchmark of 115 shapes from Herman-Miller and IKEA for evaluation. We further apply a mesh-aware conditional random field, which incorporates rotational and reflective symmetries, to smooth our local material predictions across neighboring surface patches. We demonstrate the effectiveness of our learned descriptors for automatic texturing, material-aware retrieval, and physical simulation. The dataset and code will be publicly available. Comment: 3DV 201

    WODIS: Water Obstacle Detection Network based on Image Segmentation for Autonomous Surface Vehicles in Maritime Environments

    A reliable obstacle detection system is crucial for Autonomous Surface Vehicles (ASVs) to realise fully autonomous navigation without human intervention. However, current detection methods have particular drawbacks, such as poor detection of small objects, low estimation accuracy caused by water surface reflection, and a high rate of false positives under water-sky interference. Therefore, we propose a new encoder-decoder structured deep semantic segmentation network, the Water Obstacle Detection network based on Image Segmentation (WODIS), to solve the above-mentioned problems. The first design feature of WODIS is an encoder network that extracts high-level data based on different sampling rates. In order to improve obstacle detection at sea-sky-line areas, an Attention Refine Module (ARM) activated by both global average pooling and max pooling to capture high-level information has been designed and integrated into WODIS. In addition, a Feature Fusion Module (FFM) is introduced to help concatenate the multi-dimensional high-level features in the decoder network. WODIS is tested and cross-validated on four different types of maritime datasets; the results show that it achieves superior segmentation of sea-level obstacles, with mIoU values as high as 91.3
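
    The mIoU figure quoted above is the mean intersection-over-union across segmentation classes. A minimal sketch of how that metric is commonly computed is given below; the function name `mean_iou`, the flat label lists, and the three-class labelling (0 = water, 1 = sky, 2 = obstacle) are illustrative assumptions and are not taken from the WODIS paper.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes.

    `pred` and `target` are equal-length flat sequences of integer class
    labels, one per pixel. Classes absent from both are skipped.
    """
    ious = []
    for c in range(num_classes):
        intersection = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical 6-pixel example: 0 = water, 1 = sky, 2 = obstacle.
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)  # per-class IoU: 1/3, 2/3, 1/2
```

    Because each class contributes equally regardless of its pixel count, small obstacles weigh as much in the score as the much larger water and sky regions.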

    DEVELOPMENT OF A MODULAR AGRICULTURAL ROBOTIC SPRAYER

    Precision Agriculture (PA) increases farm productivity, reduces pollution, and minimizes input costs. However, the wide adoption of existing PA technologies for complex field operations, such as spraying, is slow due to high acquisition costs, low adaptability, and slow operating speed. In this study, we designed, built, optimized, and tested a Modular Agrochemical Precision Sprayer (MAPS), a robotic sprayer with an intelligent machine vision system (MVS). Our work focused on identifying and spraying on the targeted plants with low cost, high speed, and high accuracy in a remote, dynamic, and rugged environment. We first researched and benchmarked combinations of one-stage convolutional neural network (CNN) architectures with embedded or mobile hardware systems. Our analysis revealed that TensorRT-optimized SSD-MobilenetV1 on an NVIDIA Jetson Nano provided sufficient plant detection performance with low cost and power consumption. We also developed an algorithm to determine the maximum operating velocity of a chosen CNN and hardware configuration through modeling and simulation. Based on these results, we developed a CNN-based MVS for real-time plant detection and velocity estimation. We implemented Robot Operating System (ROS) to integrate each module for easy expansion. We also developed a robust dynamic targeting algorithm to synchronize the spray operation with the robot motion, which will increase productivity significantly. The research proved to be successful. We built a MAPS with three independent vision and spray modules. In the lab test, the sprayer recognized and hit all targets with only 2% wrong sprays. In the field test with an unstructured crop layout, such as a broadcast-seeded soybean field, the MAPS also successfully sprayed all targets with only a 7% incorrect spray rate
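
    The abstract does not spell out the dynamic targeting algorithm, so the following is only a simplified timing sketch of the general idea of synchronising a spray with robot motion. It assumes a nozzle mounted a fixed distance behind the point where the camera detects a plant, a constant forward speed, and a known processing latency; the names `spray_delay` and `max_speed` and all numeric values are hypothetical.

```python
def spray_delay(offset_m, speed_m_s, latency_s):
    """Seconds to wait after a detection before opening the nozzle.

    offset_m:  assumed distance from the detection point to the nozzle
    speed_m_s: forward speed of the robot
    latency_s: assumed time spent on image capture and inference
    """
    if speed_m_s <= 0:
        raise ValueError("speed must be positive")
    travel_time = offset_m / speed_m_s  # time for the plant to reach the nozzle
    delay = travel_time - latency_s
    if delay < 0:
        raise ValueError("robot is moving too fast for this latency")
    return delay


def max_speed(offset_m, latency_s):
    """Highest speed at which the detection still arrives before the plant
    passes the nozzle, under the same simplifying assumptions."""
    return offset_m / latency_s


# Hypothetical values: nozzle 0.5 m behind the camera, 0.2 s latency.
delay = spray_delay(offset_m=0.5, speed_m_s=1.0, latency_s=0.2)
limit = max_speed(offset_m=0.5, latency_s=0.2)
```

    Under these assumptions the speed limit follows directly from the latency, which illustrates why the choice of CNN and hardware bounds the operating velocity.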