Structure from Motion with Higher-level Environment Representations
Computer vision is an important area focused on understanding,
extracting, and using the information from vision-based sensors. It
has many applications, such as vision-based 3D reconstruction,
simultaneous localization and mapping (SLAM), and data-driven
understanding of the real world. Vision is a fundamental sensing
modality in many different fields of application.
While traditional structure from motion mostly uses sparse
point-based features, this thesis explores the possibility
of using higher-order feature representations. It starts with a
joint work that uses straight lines for feature representation
and performs bundle adjustment with a straight-line
parameterization. We then try an even higher-order
representation that uses Bézier splines for parameterization.
We start with a simple case in which all contours lie on a
plane, use Bézier splines to parametrize the curves in the
background, and optimize over both the camera positions and the
Bézier splines. As an application, we present a complete end-to-end
pipeline that produces meaningful dense 3D models from natural
data of a 3D object: the target object is placed on a structured
but unknown planar background that is modeled with splines. The
data is captured using only a hand-held monocular camera.
However, this application is limited to a planar scenario, so we
extend the parameterizations to full 3D. Following the
potential of this idea, we introduce a more flexible higher-order
extension of points that provides a general model for structural
edges in the environment, whether straight or curved. Our
model relies on linked Bézier curves, whose geometric intuition
proves highly beneficial during parameter initialization
and regularization. We present the
first fully automatic pipeline that is able to generate
spline-based representations without any human supervision.
Besides a full graphical formulation of the problem, we introduce
both geometric and photometric cues as well as higher-level
concepts such as overall curve visibility and viewing angle
restrictions to automatically manage the correspondences in the
graph. Results show that curve-based structure from motion with
splines is able to outperform state-of-the-art sparse
feature-based methods, as well as to model curved edges in the
environment.
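To make the idea of linked Bézier curves concrete, the following minimal sketch evaluates a Bézier segment with de Casteljau's algorithm and samples a chain of segments that share endpoints. It is an illustration only, not the thesis implementation: the function names `de_casteljau` and `sample_linked_curve`, the tuple-based point type, and the sampling density are all assumptions introduced here.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def de_casteljau(control_points: Sequence[Point], t: float) -> Point:
    """Evaluate a Bezier curve of any degree at parameter t in [0, 1]."""
    pts = [tuple(p) for p in control_points]
    # Repeatedly interpolate between neighbouring points until one remains.
    while len(pts) > 1:
        pts = [
            tuple((1.0 - t) * a + t * b for a, b in zip(p, q))
            for p, q in zip(pts[:-1], pts[1:])
        ]
    return pts[0]


def sample_linked_curve(
    segments: Sequence[Sequence[Point]], samples_per_segment: int = 10
) -> List[Point]:
    """Sample a chain of Bezier segments; each must start where the previous ends."""
    for prev, nxt in zip(segments[:-1], segments[1:]):
        if tuple(prev[-1]) != tuple(nxt[0]):
            raise ValueError("consecutive segments must share an endpoint")
    points: List[Point] = []
    for i, seg in enumerate(segments):
        # Skip the first sample of later segments so joints are not duplicated.
        start = 0 if i == 0 else 1
        for k in range(start, samples_per_segment + 1):
            points.append(de_casteljau(seg, k / samples_per_segment))
    return points
```

Because neighbouring segments share a control point, moving that one point moves both segments together, which is one way to read the geometric intuition the abstract mentions for initialization and regularization.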
A complete hand-drawn sketch vectorization framework
Vectorizing hand-drawn sketches is a challenging task of paramount
importance for creating vectorized CAD versions for fashion and creative
workflows. This paper proposes a complete framework that automatically
transforms noisy and complex hand-drawn sketches with different stroke types into
a precise, reliable, and highly simplified vectorized model. The proposed
framework includes a novel line extraction algorithm based on a
multi-resolution application of Pearson's cross correlation and a new unbiased
thinning algorithm that can get rid of scribbles and variable-width strokes to
obtain clean 1-pixel lines. Other contributions include variants of pruning,
merging and edge linking procedures to post-process the obtained paths.
Finally, a modification of Schneider's original vectorization algorithm is
designed to obtain fewer control points in the resulting Bézier splines. All
the proposed steps of the framework have been extensively tested and compared
with state-of-the-art algorithms, which they outperform both qualitatively and
quantitatively.
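As a rough illustration of correlation-based line extraction, the sketch below slides a small line template over a grayscale image and records Pearson's correlation coefficient at each position. It is a single-scale simplification under stated assumptions: the paper describes a multi-resolution application, and the function names `pearson` and `correlation_map`, the list-of-lists image format, and the template shape are hypothetical choices made here.

```python
import math
from typing import List, Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0  # a constant window carries no line evidence
    return cov / math.sqrt(var_a * var_b)


def correlation_map(
    image: Sequence[Sequence[float]], template: Sequence[Sequence[float]]
) -> List[List[float]]:
    """Correlate the template with every fully contained image window."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    flat_template = [v for row in template for v in row]
    out: List[List[float]] = []
    for r in range(ih - th + 1):
        row_out: List[float] = []
        for c in range(iw - tw + 1):
            window = [image[r + dr][c + dc] for dr in range(th) for dc in range(tw)]
            row_out.append(pearson(window, flat_template))
        out.append(row_out)
    return out
```

Because the coefficient is normalized by the variance of both the window and the template, the response does not depend on overall brightness or contrast, which is a plausible reason to prefer it over a raw dot product on noisy scans.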
Efficient Representation of the Outlines of Arabic Words by Combining the Active Contour Model with Corner Point Detection
Fitting graphical curves and surfaces is an active area of research with applications in art, analysis, and encoding. Outline capture of digital word images is important in most desktop publishing systems. The shapes of the characters are stored in computer memory in terms of their outlines, and the outlines are expressed as Bézier curves. Existing methods for Arabic font outline description suffer from low fitting accuracy and efficiency. In our research, we developed a new method for outlining shapes using Bézier curves with a minimal set of curve points. A distinguishing characteristic of our method is that it combines the active contour method (snake) with corner detection to obtain an initial set of points that is as close to the shape's boundaries as possible. The method links these points (snake + corner) into a compound Bézier curve and iteratively improves the fit of the curve to the actual boundaries of the shape. We implemented and tested our method using MATLAB. Test cases covered simple, moderate, and high levels of shape complexity, depending on factors such as boundary concavities, number of corners, and interior openings. Results show that our method achieved an average accuracy of 86% when measured relative to the true shape boundary. When compared to other similar methods (Masood & Sarfraz, 2009; Sarfraz & Khan, 2002; Sohel, Karmakar, Dooley, & Bennamoun, 2010), our method performed comparatively well. Keywords: Bézier curves, shape descriptor, curvature, corner points, control points, Active Contour Model.
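The fitting step can be illustrated with a minimal sketch that fits one cubic Bézier segment to an ordered run of boundary points by least squares. This is not the authors' MATLAB method: it omits the snake and corner detection stages and the iterative refinement, and it assumes chord-length parameterization with the two endpoints held fixed. The names `chord_length_params` and `fit_cubic_bezier` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point2 = Tuple[float, float]


def chord_length_params(points: Sequence[Point2]) -> List[float]:
    """Assign each point a parameter in [0, 1] proportional to path length."""
    dists = [0.0]
    for p, q in zip(points[:-1], points[1:]):
        dists.append(dists[-1] + math.dist(p, q))
    total = dists[-1]
    if total == 0.0:
        raise ValueError("points must not all coincide")
    return [d / total for d in dists]


def fit_cubic_bezier(points: Sequence[Point2]) -> List[Point2]:
    """Least-squares cubic Bezier with endpoints pinned to the first and last point."""
    p0, p3 = points[0], points[-1]
    ts = chord_length_params(points)
    a11 = a12 = a22 = 0.0
    rhs1 = [0.0, 0.0]
    rhs2 = [0.0, 0.0]
    for t, p in zip(ts, points):
        # Cubic Bernstein basis values at t.
        b0 = (1 - t) ** 3
        b1 = 3 * t * (1 - t) ** 2
        b2 = 3 * t * t * (1 - t)
        b3 = t ** 3
        a11 += b1 * b1
        a12 += b1 * b2
        a22 += b2 * b2
        for d in range(2):
            # Part of the data point not explained by the fixed endpoints.
            r = p[d] - b0 * p0[d] - b3 * p3[d]
            rhs1[d] += b1 * r
            rhs2[d] += b2 * r
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("need at least four well-spread points")
    # Solve the 2x2 normal equations for the two inner control points.
    p1 = tuple((a22 * rhs1[d] - a12 * rhs2[d]) / det for d in range(2))
    p2 = tuple((a11 * rhs2[d] - a12 * rhs1[d]) / det for d in range(2))
    return [tuple(p0), p1, p2, tuple(p3)]
```

In a compound curve, a routine like this would be applied once per run of boundary points between consecutive break points, with corners as natural places to split.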
DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting
End-to-end text spotting aims to integrate scene text detection and
recognition into a unified framework. Dealing with the relationship between the
two sub-tasks plays a pivotal role in designing effective spotters. Although
transformer-based methods eliminate the heuristic post-processing, they still
suffer from the synergy issue between the sub-tasks and low training
efficiency. In this paper, we present DeepSolo, a simple detection transformer
baseline that lets a single Decoder with Explicit Points Solo for text
detection and recognition simultaneously. Technically, for each text instance,
we represent the character sequence as ordered points and model them with
learnable explicit point queries. After passing through a single decoder, the point
queries have encoded the requisite text semantics and locations and thus can be
further decoded to the center line, boundary, script, and confidence of text
via very simple prediction heads in parallel, solving the sub-tasks in text
spotting in a unified framework. Besides, we also introduce a text-matching
criterion to deliver more accurate supervisory signals, thus enabling more
efficient training. Quantitative experiments on public benchmarks demonstrate
that DeepSolo outperforms previous state-of-the-art methods and achieves better
training efficiency. In addition, DeepSolo is also compatible with line
annotations, which require much less annotation cost than polygons. The code
will be released. Comment: The code will be available at
https://github.com/ViTAE-Transformer/DeepSol
A variational model for data fitting on manifolds by minimizing the acceleration of a Bézier curve
We derive a variational model to fit a composite Bézier curve to a set of
data points on a Riemannian manifold. The resulting curve is obtained in such a
way that its mean squared acceleration is minimal in addition to remaining
close to the data points. We approximate the acceleration by discretizing the
squared second-order derivative along the curve. We derive a closed-form,
numerically stable, and efficient algorithm to compute the gradient of a
Bézier curve on manifolds with respect to its control points, expressed as a
concatenation of so-called adjoint Jacobi fields. Several examples illustrate
the capabilities and validity of this approach both for interpolation and
approximation. The examples also illustrate that the approach outperforms
previous works tackling this problem.
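The discretized acceleration term can be sketched in the flat Euclidean special case, where second-order derivatives reduce to ordinary central differences. This is an assumption made for illustration only: the paper works on Riemannian manifolds, and the names `bezier_point` and `discrete_acceleration_energy` and the uniform sampling are hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[float, ...]


def bezier_point(control_points: Sequence[Point], t: float) -> Point:
    """Evaluate a Bezier curve at t with de Casteljau's algorithm."""
    pts = [tuple(p) for p in control_points]
    while len(pts) > 1:
        pts = [
            tuple((1.0 - t) * a + t * b for a, b in zip(p, q))
            for p, q in zip(pts[:-1], pts[1:])
        ]
    return pts[0]


def discrete_acceleration_energy(control_points: Sequence[Point], n: int = 100) -> float:
    """Approximate the integral of the squared second derivative over [0, 1].

    Uses central second differences at the n - 1 interior sample points,
    each weighted by the step size h = 1 / n.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    h = 1.0 / n
    samples = [bezier_point(control_points, i * h) for i in range(n + 1)]
    energy = 0.0
    for prev, cur, nxt in zip(samples[:-2], samples[1:-1], samples[2:]):
        accel = [(a - 2.0 * b + c) / (h * h) for a, b, c in zip(prev, cur, nxt)]
        energy += h * sum(v * v for v in accel)
    return energy
```

A straight segment with evenly spaced control points has zero energy under this measure, so minimizing it alongside a data-fidelity term favours curves that bend only where the data requires it.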
Feature-Based Textures
This paper introduces feature-based textures, a new image
representation that combines features and texture samples for high-quality texture mapping. Features identify boundaries within a texture where samples change discontinuously. They can be extracted from vector graphics representations or explicitly added to raster images to improve sharpness. Texture lookups are then interpolated from samples while respecting these boundaries. We present results from a software implementation of this technique demonstrating quality, efficiency, and low memory overhead.
Deep Forward and Inverse Perceptual Models for Tracking and Prediction
We consider the problems of learning forward models that map state to
high-dimensional images and inverse models that map high-dimensional images to
state in robotics. Specifically, we present a perceptual model for generating
video frames from state with deep networks, and provide a framework for its use
in tracking and prediction tasks. We show that our proposed model greatly
outperforms standard deconvolutional methods and GANs for image generation,
producing clear, photo-realistic images. We also develop a convolutional neural
network model for state estimation and compare the result to an Extended Kalman
Filter to estimate robot trajectories. We validate all models on a real robotic
system. Comment: 8 pages, International Conference on Robotics and Automation (ICRA)
201