764 research outputs found

    Structure from Motion with Higher-level Environment Representations

    Computer vision is an important area focused on understanding, extracting, and using information from vision-based sensors. It has many applications, such as vision-based 3D reconstruction, simultaneous localization and mapping (SLAM), and data-driven understanding of the real world. Vision is a fundamental sensing modality in many fields of application. While traditional structure from motion mostly uses sparse point-based features, this thesis explores the use of higher-order feature representations. It starts with a joint work that uses straight lines as the feature representation and performs bundle adjustment with a straight-line parameterization. We then try an even higher-order representation, using Bézier splines for the parameterization. We start with a simple case in which all contours lie on a plane, use Bézier splines to parameterize the curves in the background, and optimize over both the camera positions and the Bézier splines. As an application, we present a complete end-to-end pipeline that produces meaningful dense 3D models from natural data of a 3D object: the target object is placed on a structured but unknown planar background that is modeled with splines. The data is captured using only a hand-held monocular camera. Because this application is limited to a planar scenario, we then extend the parameterization to full 3D. Following the potential of this idea, we introduce a more flexible higher-order extension of points that provides a general model for structural edges in the environment, whether straight or curved. Our model relies on linked Bézier curves, whose geometric intuition proves highly beneficial during parameter initialization and regularization. We present the first fully automatic pipeline that is able to generate spline-based representations without any human supervision.
Besides a full graphical formulation of the problem, we introduce both geometric and photometric cues as well as higher-level concepts such as overall curve visibility and viewing angle restrictions to automatically manage the correspondences in the graph. Results show that curve-based structure from motion with splines is able to outperform state-of-the-art sparse feature-based methods, as well as to model curved edges in the environment.

    A complete hand-drawn sketch vectorization framework

    Vectorizing hand-drawn sketches is a challenging task that is of paramount importance for creating vectorized CAD versions for fashion and creative workflows. This paper proposes a complete framework that automatically transforms noisy and complex hand-drawn sketches with different stroke types into a precise, reliable, and highly simplified vectorized model. The proposed framework includes a novel line extraction algorithm based on a multi-resolution application of Pearson's cross-correlation and a new unbiased thinning algorithm that can get rid of scribbles and variable-width strokes to obtain clean 1-pixel lines. Other contributions include variants of pruning, merging, and edge-linking procedures to post-process the obtained paths. Finally, a modification of Schneider's original vectorization algorithm is designed to obtain fewer control points in the resulting Bézier splines. All the proposed steps of the framework have been extensively tested and compared with state-of-the-art algorithms, and they are shown to outperform them both qualitatively and quantitatively.
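The abstract does not spell out its line extraction step, so the following is only a rough sketch of the underlying idea: scoring an image patch against a line-shaped template with Pearson's correlation coefficient, which is insensitive to brightness and contrast changes. The function names, the Gaussian line profile, and the set of widths are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def pearson_correlation(patch, template):
    """Pearson correlation coefficient between two equally sized arrays."""
    a = np.asarray(patch, dtype=float).ravel()
    b = np.asarray(template, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    # A constant patch has no variance, so the coefficient is undefined; report 0.
    return 0.0 if denom == 0.0 else float(a @ b / denom)

def line_template(size, sigma):
    """Hypothetical vertical-line template: Gaussian across columns, constant along rows."""
    x = np.arange(size) - (size - 1) / 2.0
    profile = np.exp(-x**2 / (2.0 * sigma**2))
    return np.tile(profile, (size, 1))

def best_line_response(patch, sigmas=(1.0, 2.0, 4.0)):
    """Score a square patch against templates of several widths (assumed scales)."""
    size = np.asarray(patch).shape[0]
    return max(pearson_correlation(patch, line_template(size, s)) for s in sigmas)
```

Trying several template widths is one simple way to respond to strokes of different thickness; a full multi-resolution scheme would also rescale the image and rotate the template.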

    Efficient Representation of the Outline of Arabic Words by Combining the Active Contour Model with Corner Point Detection

    Fitting graphical curves and surfaces is an active area of research and application, with uses in artistic applications, data analysis, and encoding. Outline capture of digital word images is important in most desktop publishing systems. The shapes of the characters are stored in computer memory in terms of their outlines, and the outlines are expressed as Bezier curves. Existing methods for Arabic font outline description suffer from low fitting accuracy and efficiency. In our research, we developed a new method for outlining shapes using Bezier curves with a minimal set of curve points. A distinguishing characteristic of our method is that it combines the active contour method (snake) with corner detection to obtain an initial set of points that is as close to the shape's boundaries as possible. The method links these points (snake + corner) into a compound Bezier curve and iteratively improves the fit of the curve to the actual boundaries of the shape. We implemented and tested our method using MATLAB. Test cases covered simple, moderate, and high levels of shape complexity, depending on factors such as boundary concavities, the number of corner points, and interior openings. Results show that our method achieved an average accuracy of 86% when measured relative to the true shape boundary. When compared to other similar methods (Masood & Sarfraz, 2009; Sarfraz & Khan, 2002; Ferdous A Sohel, Karmakar, Dooley, & Bennamoun, 2010), our method performed comparatively well. Keywords: Bezier curves, shape descriptor, curvature, corner points, control points, Active Contour Model.
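As a minimal sketch of the curve-fitting step only, the following fits a single cubic Bezier segment to an ordered run of boundary points by linear least squares with chord-length parameterization. This is a generic textbook technique written in Python for illustration; it is not the authors' MATLAB implementation, it omits the snake and corner detection stages, and all function names are hypothetical.

```python
import numpy as np

def bernstein_cubic(t):
    """Cubic Bernstein basis evaluated at parameters t; returns an (n, 4) matrix."""
    t = np.asarray(t, dtype=float)
    return np.stack(
        [(1 - t) ** 3, 3 * t * (1 - t) ** 2, 3 * t**2 * (1 - t), t**3], axis=1
    )

def fit_cubic_bezier(points):
    """Least-squares fit of four control points to ordered boundary points."""
    pts = np.asarray(points, dtype=float)
    # Chord-length parameterization: t grows with the distance walked along the points.
    chord = np.linalg.norm(np.diff(pts, axis=0), axis=1)
    t = np.concatenate([[0.0], np.cumsum(chord)]) / chord.sum()
    control, *_ = np.linalg.lstsq(bernstein_cubic(t), pts, rcond=None)
    return control, t

def max_fit_error(points, control, t):
    """Largest distance between each data point and the curve at its parameter."""
    residual = bernstein_cubic(t) @ control - np.asarray(points, dtype=float)
    return float(np.max(np.linalg.norm(residual, axis=1)))
```

An iterative scheme in the spirit of the abstract could split a segment wherever `max_fit_error` exceeds a tolerance and refit each half, with detected corner points serving as natural split locations.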

    DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting

    End-to-end text spotting aims to integrate scene text detection and recognition into a unified framework. Handling the relationship between the two sub-tasks plays a pivotal role in designing effective spotters. Although transformer-based methods eliminate heuristic post-processing, they still suffer from the synergy issue between the sub-tasks and from low training efficiency. In this paper, we present DeepSolo, a simple detection transformer baseline that lets a single Decoder with Explicit Points Solo for text detection and recognition simultaneously. Technically, for each text instance, we represent the character sequence as ordered points and model them with learnable explicit point queries. After passing through a single decoder, the point queries have encoded the requisite text semantics and locations and thus can be further decoded to the center line, boundary, script, and confidence of text via very simple prediction heads in parallel, solving the sub-tasks in text spotting in a unified framework. We also introduce a text-matching criterion to deliver more accurate supervisory signals, thus enabling more efficient training. Quantitative experiments on public benchmarks demonstrate that DeepSolo outperforms previous state-of-the-art methods and achieves better training efficiency. In addition, DeepSolo is compatible with line annotations, which cost much less to produce than polygons. The code will be released. Comment: The code will be available at https://github.com/ViTAE-Transformer/DeepSol

    A variational model for data fitting on manifolds by minimizing the acceleration of a Bézier curve

    We derive a variational model to fit a composite Bézier curve to a set of data points on a Riemannian manifold. The resulting curve is obtained in such a way that its mean squared acceleration is minimal, in addition to remaining close to the data points. We approximate the acceleration by discretizing the squared second-order derivative along the curve. We derive a closed-form, numerically stable, and efficient algorithm to compute the gradient of a Bézier curve on manifolds with respect to its control points, expressed as a concatenation of so-called adjoint Jacobi fields. Several examples illustrate the capabilities and validity of this approach for both interpolation and approximation. The examples also illustrate that the approach outperforms previous works tackling this problem.
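To make the discretized acceleration term concrete, here is a minimal sketch restricted to flat Euclidean space, which is a simplifying assumption: on a Riemannian manifold the straight-line interpolation and differences below would be replaced by geodesics and logarithmic maps. The function names and the sample count are illustrative and do not come from the paper.

```python
import numpy as np

def de_casteljau(control_points, t):
    """Evaluate a Bezier curve at parameter t by repeated linear interpolation."""
    pts = np.asarray(control_points, dtype=float)
    while len(pts) > 1:
        pts = (1.0 - t) * pts[:-1] + t * pts[1:]
    return pts[0]

def discrete_squared_acceleration(control_points, n_samples=101):
    """Approximate the integral of |B''(t)|^2 over [0, 1] with central differences."""
    ts = np.linspace(0.0, 1.0, n_samples)
    h = ts[1] - ts[0]
    curve = np.array([de_casteljau(control_points, t) for t in ts])
    # Second-order central difference at every interior sample.
    second_diff = (curve[2:] - 2.0 * curve[1:-1] + curve[:-2]) / h**2
    # Riemann sum of the squared norm over the interior samples.
    return float(np.sum(second_diff**2) * h)
```

A straight, evenly parameterized segment gives a value near zero, while a bent control polygon gives a positive value; the variational model described above trades this term off against closeness to the data points.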

    Feature-Based Textures

    This paper introduces feature-based textures, a new image representation that combines features and texture samples for high-quality texture mapping. Features identify boundaries within a texture where samples change discontinuously. They can be extracted from vector graphics representations, or explicitly added to raster images to improve sharpness. Texture lookups are then interpolated from samples while respecting these boundaries. We present results from a software implementation of this technique demonstrating quality, efficiency, and low memory overhead.

    Deep Forward and Inverse Perceptual Models for Tracking and Prediction

    We consider the problems of learning forward models that map state to high-dimensional images and inverse models that map high-dimensional images to state in robotics. Specifically, we present a perceptual model for generating video frames from state with deep networks, and provide a framework for its use in tracking and prediction tasks. We show that our proposed model greatly outperforms standard deconvolutional methods and GANs for image generation, producing clear, photo-realistic images. We also develop a convolutional neural network model for state estimation and compare the result to an Extended Kalman Filter to estimate robot trajectories. We validate all models on a real robotic system. Comment: 8 pages, International Conference on Robotics and Automation (ICRA) 201