89 research outputs found

Human perception in segmentation of sketches

Author: D. Hoffmann
D.L. Jenkins
J. Pu
K. Tombre
L. Gennari
M. Cooper
S.E. Palmer
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

In this paper, we study the segmentation of sketched engineering drawings into a set of straight and curved segments. Our immediate objective is to produce a benchmarking method for segmentation algorithms. The criterion is to minimise the differences between what the algorithm detects and what human beings perceive. We have created a set of sketched drawings and have asked people to segment them. By analysis of the produced segmentations, we have obtained the number and locations of the segmentation points which people perceive. Evidence collected during our experiments supports useful hypotheses, for example that not all kinds of segmentation points are equally difficult to perceive. The resulting methodology can be repeated with other drawings to obtain a set of sketches and segmentation data which could be used as a benchmark for segmentation algorithms, to evaluate their capability to emulate human perception of sketches.

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositori Institucional de la Universitat Jaume I

Combining appearance and context for multi-domain sketch recognition

Author: Ouyang Tom Yu
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2012

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012. Cataloged from PDF version of thesis. Includes bibliographical references (p. 99-102). As our interaction with computing shifts away from the traditional desktop model (e.g., towards smartphones, tablets, touch-enabled displays), the technology that drives this interaction needs to evolve as well. Wouldn't it be great if we could talk, write, and draw to a computer just like we do with each other? This thesis addresses the drawing aspect of that vision: enabling computers to understand the meaning and semantics of free-hand diagrams. We present a novel framework for sketch recognition that seamlessly combines a rich representation of local visual appearance with a probabilistic graphical model for capturing higher level relationships. This joint model makes our system less sensitive to noise and drawing variations, improving accuracy and robustness. The result is a recognizer that is better able to handle the wide range of drawing styles found in messy freehand sketches. To preserve the fluid process of sketching on paper, our interface allows users to draw diagrams just as they would on paper, using the same notations and conventions. For the isolated symbol recognition task our method exceeds state-of-the-art performance in three domains: handwritten digits, PowerPoint shapes, and electrical circuit symbols. For the complete diagram recognition task it was able to achieve excellent performance on both chemistry and circuit diagrams, improving on the best previous results. Furthermore, in an on-line study our new interface was on average over twice as fast as the existing CAD-based method for authoring chemical diagrams, even for novice users who had little or no experience using a tablet.
This is one of the first direct comparisons that shows a sketch recognition interface significantly outperforming a professional industry-standard CAD-based tool. By Tom Yu Ouyang. Ph.D.

DSpace@MIT

CAD2Sketch: Generating Concept Sketches from CAD Sequences

Author: Bousseau Adrien
Hähnlein Felix
Li Changjian
Mitra Niloy
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2022

Concept sketches are ubiquitous in industrial design, as they allow designers to quickly depict imaginary 3D objects. To construct their sketches with accurate perspective, designers rely on longstanding drawing techniques, including the use of auxiliary construction lines to identify midpoints of perspective planes, to align points vertically and horizontally, and to project planar curves from one perspective plane to another. We present a method to synthesize such construction lines from CAD sequences. Importantly, our method balances the presence of construction lines with overall clutter, such that the resulting sketch is both well-constructed and readable, as professional designers are trained to do. In addition to generating sketches that are visually similar to real ones, we apply our method to synthesize a large quantity of paired sketches and normal maps, and show that the resulting dataset can be used to train a neural network to infer normals from concept sketches.

INRIA a CCSD electronic archive server

Feature Point Detection and Curve Approximation for Early Processing of Freehand Sketches

Author: Sezgin Tevfik Metin
Publication venue
Publication date: 01/01/2001

Freehand sketching is both a natural and crucial part of design, yet is unsupported by current design automation software. We are working to combine the flexibility and ease of use of paper and pencil with the processing power of a computer to produce a design environment that feels as natural as paper, yet is considerably smarter. One of the most basic steps in accomplishing this is converting the original digitized pen strokes in the sketch into the intended geometric objects using feature point detection and approximation. We demonstrate how multiple sources of information can be combined for feature detection in strokes and apply this technique using two approaches to signal processing, one using simple average-based thresholding and a second using scale space.

DSpace@MIT

Application of Machine Learning within Visual Content Production

Author: Giunchi Daniele
Publication venue: UCL (University College London)
Publication date: 28/07/2021

We are living in an era where digital content is being produced at a dazzling pace. The heterogeneity of contents and contexts is so varied that numerous applications have been created to respond to people and market demands. The visual content production pipeline is the generalisation of the process that allows a content editor to create and evaluate their product, such as a video, an image, a 3D model, etc. Such data is then displayed on one or more devices such as TVs, PC monitors, virtual reality head-mounted displays, tablets, mobiles, or even smartwatches. Content creation can be as simple as clicking a button to film a video and then share it on a social network, or as complex as managing a dense user interface full of parameters by using keyboard and mouse to generate a realistic 3D model for a VR game. In this second example, such sophistication results in a steep learning curve for beginner-level users. In contrast, expert users regularly need to refine their skills via expensive lessons, time-consuming tutorials, or experience. Thus, user interaction plays an essential role in the diffusion of content creation software, primarily when it is targeted to untrained people. In particular, with the fast spread of virtual reality devices into the consumer market, new opportunities for designing reliable and intuitive interfaces have been created. Such new interactions need to take a step beyond the point and click interaction typical of the 2D desktop environment. The interactions need to be smart, intuitive and reliable, to interpret 3D gestures and therefore, more accurate algorithms are needed to recognise patterns. In recent years, machine learning and in particular deep learning have achieved outstanding results in many branches of computer science, such as computer graphics and human-computer interface, outperforming algorithms that were considered state of the art; however, there are only fleeting efforts to translate this into virtual reality.
In this thesis, we seek to apply and take advantage of deep learning models to two different content production pipeline areas embracing the following subjects of interest: advanced methods for user interaction and visual quality assessment. First, we focus on 3D sketching to retrieve models from an extensive database of complex geometries and textures, while the user is immersed in a virtual environment. We explore both 2D and 3D strokes as tools for model retrieval in VR. Therefore, we implement a novel system for improving accuracy in searching for a 3D model. We contribute an efficient method to describe models through 3D sketch via an iterative descriptor generation, focusing both on accuracy and user experience. To evaluate it, we design a user study to compare different interactions for sketch generation. Second, we explore the combination of sketch input and vocal description to correct and fine-tune the search for 3D models in a database containing fine-grained variation. We analyse sketch and speech queries, identifying a way to incorporate both of them into our system's interaction loop. Third, in the context of the visual content production pipeline, we present a detailed study of visual metrics. We propose a novel method for detecting rendering-based artefacts in images. It exploits analogous deep learning algorithms used when extracting features from sketches.

UCL Discovery

Interfaces for creating quantitative conceptual diagrams

Author: Stewart Robin S. (Robin Scott)
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2008

Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Includes bibliographical references (p. 71-73). Modern chart-making, illustration, and mathematical tools poorly support the use of conceptual components in quantitative graphs such as Economics diagrams. The substantial time those tools require to achieve the desired results leads many people to sketch their graphs with pencil and paper instead of using a computer. In this thesis, I address the challenge of designing a software user interface that not only includes all features necessary to create a wide range of quantitative conceptual diagrams, but also is dramatically more efficient to use than existing programs. My design takes several important interaction techniques that previous applications used separately and comprehensively integrates them in order to create new, flexible capabilities. I have implemented this design as a desktop application called Graph Sketcher, and I present results of studies which show that my interface halves the time required to complete several common graph creation tasks. I also show that the 700 students, teachers, professionals, and hobbyists worldwide who choose to use Graph Sketcher in their everyday work find the interface intuitive, enjoyable, and empowering for generating many different types of graphs. By Robin S. Stewart. S.M.

DSpace@MIT

Style-driven Shape Analysis and Synthesis

Author: Lun Zhaoliang
Publication venue: ScholarWorks@UMass Amherst
Publication date: 01/11/2017

In this dissertation I will investigate algorithms that analyze stylistic properties of 3D shapes and automatically synthesize shapes given style specifications. I will start by introducing a structure-transcending method for style similarity evaluation between 3D shapes. Inspired by observations about style similarity in art history literature, we propose an algorithmically computed style similarity measure which identifies style related elements on the analyzed models and collates element-level geometric similarity measurements into an object-level style measure consistent with human perception. To achieve this consistency we employ crowdsourcing to learn the relative perceptual importance of a range of elementary shape distances and other parameters used in our measurement from participant answers to cross-structure style similarity queries. I will then describe an algorithm that utilizes this learned style similarity measure to synthesize 3D models of man-made shapes. The algorithm combines user-specified style, described via an exemplar shape, and functionality, encoded by a functionally different target shape. We transfer the exemplar style to the target via a sequence of compatible element-level operations where the compatibility is a learned metric that estimates the impact of each operation on the edited shape. We use this metric to cast style transfer as a tabu search, which incrementally updates the target shape using compatible operations, progressively increasing its style similarity to the exemplar while strictly maintaining its functionality at each step. Finally I will propose a method for reconstructing 3D shapes following style aspects of given 2D drawings. Our method takes line drawings as input and converts them into surface depth and normal maps from several output viewpoints via a deep convolutional neural network with multi-view encoder-decoder architecture. 
The multi-view maps are then consolidated into a dense coherent 3D point cloud by solving an optimization problem that fuses depth and normal information across all output viewpoints. The output point cloud is then converted into a polygon mesh representation, which is further fine-tuned to match the input sketch more precisely.

ScholarWorks@UMass Amherst

Improving Interaction in Visual Analytics using Machine Learning

Author: Fan Chaoran
Publication venue: The University of Bergen
Publication date: 01/01/2021

Interaction is one of the most fundamental components in visual analytical systems, which transforms people from mere viewers to active participants in the process of analyzing and understanding data. Therefore, fast and accurate interaction techniques are key to establishing a successful human-computer dialogue, enabling a smooth visual data exploration. Machine learning is a branch of artificial intelligence that provides systems the ability to automatically learn and improve from experience without being explicitly programmed. It has been utilized in a wide variety of fields, where it is not straightforward to develop a conventional algorithm for effectively performing a task. Inspired by this, we see the opportunity to improve the current interactions in visual analytics by using machine learning methods. In this thesis, we address the need for interaction techniques that are both fast, enabling a fluid interaction in visual data exploration and analysis, and also accurate, i.e., enabling the user to effectively select specific data subsets. First, we present a new, fast and accurate brushing technique for scatterplots, based on the Mahalanobis brush, which we have optimized using data from a user study. Further, we present a new solution for a near-perfect sketch-based brushing technique, where we exploit a convolutional neural network (CNN) for estimating the intended data selection from a fast and simple click-and-drag interaction and from the data distribution in the visualization. Next, we propose an innovative framework which offers the user opportunities to improve the brushing technique while using it. We tested this framework with CNN-based brushing and the result shows that the underlying model can be refined (better performance in terms of accuracy) and personalized with very little retraining time.
In addition, in order to investigate to what degree the human should be involved in the model design and how good the empirical model can be with a more careful design, we extended our Mahalanobis brush (the best current empirical model in terms of accuracy for brushing points in a scatterplot) by further incorporating the data distribution information, captured by kernel density estimation (KDE). Based on this work, we then provide a detailed comparison between empirical modeling and implicit modeling by machine learning (deep learning). Lastly, we introduce a new, machine learning based approach that enables the fast and accurate querying of time series data based on a swift sketching interaction. To achieve this, we build upon existing LSTM technology (long short-term memory) to encode both the sketch and the time series data in two networks with shared parameters. All the proposed interaction techniques in this thesis were demonstrated by application examples and evaluated via user studies. The integration of machine learning knowledge into visualization opens further possible research directions. Doctoral thesis.

University of Bergen

NORA - Norwegian Open Research Archives

Knowledge of knots: shapes in action

Author: Casati Roberto
Publication venue: Universal Logic
Publication date: 03/04/2013

Logic is to natural language what knot theory is to natural knots. Logic is concerned with some cognitive performances; in particular, some natural language inferences are captured by various types of calculi (propositional, predicate, modal, deontic, quantum, probabilistic, etc.), which in turn may generate inferences that are arguably beyond natural logic abilities, or not well synchronized therewith (e.g., ex falso quodlibet, material implication). Mathematical knot theory accounts for some abilities, such as recognizing sameness or differences of some knots, and in turn generates a formalism for distinctions that common sense is blind to. Logic has proven useful in linguistics and in accounting for some aspects of reasoning, but which knotting performances are there, over and beyond some intuitive discriminating abilities, that may require extensions or restrictions of the normative calculus of knots? Are they amenable to mathematical treatment? And what role is played in the game by mental representations? I shall draw from a corpus of techniques and practices to show to what extent compositionality, lexical and normative elements are present in natural knots, with the prospect of formally exploring an area of human competence that interfaces thought, perception and action in a complex fabric.

CiteSeerX

Archive Electronique - Institut Jean Nicod

Semantics-Driven Large-Scale 3D Scene Retrieval

Author: Yuan Juefei
Publication venue: The Aquila Digital Community
Publication date: 01/08/2021

Aquila Digital Community (University of Southern Mississippi, USM)