Algorithmic Perception of Vertices in Sketched Drawings of Polyhedral Shapes
In this article, visual perception principles were used to build an artificial perception model aimed at developing an algorithm for detecting junctions in line drawings of polyhedral objects that are vectorized from hand-drawn sketches. The detection is performed in two dimensions (2D), before any 3D model is available, and uses minimal information about the shape depicted by the sketch. The goal of this approach is to detect junctions not only in careful sketches created by skilled engineers and designers but also in casual drawings that skilled people make to quickly convey rough ideas. Current approaches for extracting junctions from digital images are mostly incomplete, as they simply merge endpoints that are near each other. They thus ignore two facts: different vertices may be represented by different (but close) junctions, and the endpoints of lines that depict edges sharing a common vertex may not be close to each other, particularly in quickly sketched drawings. We describe and validate a new algorithm that uses these perceptual findings to merge tips of line segments into 2D junctions that are assumed to depict 3D vertices.
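For orientation, the baseline that the abstract criticises (merging endpoints that are simply near each other) might look like the minimal Python sketch below. It is not the authors' algorithm. The function name `merge_endpoints` and the distance tolerance `tol` are hypothetical, chosen only for illustration.

```python
from math import hypot


def merge_endpoints(segments, tol):
    """Naive baseline: group segment endpoints lying within `tol` of each other.

    `segments` is a list of ((x1, y1), (x2, y2)) pairs. Returns one junction
    (the centroid of its endpoints) per group. Hypothetical helper, not the
    perceptual algorithm described in the abstract.
    """
    points = [p for seg in segments for p in seg]  # two endpoints per segment
    parent = list(range(len(points)))  # union-find forest over endpoints

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Union every pair of endpoints closer than the tolerance.
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            (xa, ya), (xb, yb) = points[i], points[j]
            if hypot(xa - xb, ya - yb) <= tol:
                parent[find(i)] = find(j)

    groups = {}
    for i, p in enumerate(points):
        groups.setdefault(find(i), []).append(p)

    return [
        (sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g))
        for g in groups.values()
    ]
```

A single fixed tolerance shows both failure modes named in the abstract: if it is too large, distinct but close vertices are fused; if it is too small, the loosely drawn tips of edges that share a vertex are left unmerged.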
A Similarity Measure for Material Appearance
We present a model to measure the similarity in appearance between different
materials, which correlates with human similarity judgments. We first create a
database of 9,000 rendered images depicting objects with varying materials,
shape and illumination. We then gather data on perceived similarity from
crowdsourced experiments; our analysis of over 114,840 answers suggests that
indeed a shared perception of appearance similarity exists. We feed this data
to a deep learning architecture with a novel loss function, which learns a
feature space for materials that correlates with such perceived appearance
similarity. Our evaluation shows that our model outperforms existing metrics.
Last, we demonstrate several applications enabled by our metric, including
appearance-based search for material suggestions, database visualization,
clustering and summarization, and gamut mapping.
Comment: 12 pages, 17 figures
The Sound Manifesto
Computing practice today depends on visual output to drive almost all user
interaction. Other senses, such as audition, may be totally neglected, or used
tangentially, or used in highly restricted specialized ways. We have excellent
audio rendering through D-A conversion, but we lack rich general facilities for
modeling and manipulating sound comparable in quality and flexibility to
graphics. We need co-ordinated research in several disciplines to improve the
use of sound as an interactive information channel.
Incremental and separate improvements in synthesis, analysis, speech
processing, audiology, acoustics, music, etc. will not alone produce the
radical progress that we seek in sonic practice. We also need to create a new
central topic of study in digital audio research. The new topic will assimilate
the contributions of different disciplines on a common foundation. The key
central concept that we lack is sound as a general-purpose information channel.
We must investigate the structure of this information channel, which is driven
by the co-operative development of auditory perception and physical sound
production. Particular audible encodings, such as speech and music, illuminate
sonic information by example, but they are no more sufficient for a
characterization than typography is sufficient for a characterization of visual
information.
Comment: To appear in the conference on Critical Technologies for the Future
of Computing, part of SPIE's International Symposium on Optical Science and
Technology, 30 July to 4 August 2000, San Diego, CA
SAVOIAS: A Diverse, Multi-Category Visual Complexity Dataset
Visual complexity identifies the level of intricacy and details in an image
or the level of difficulty to describe the image. It is an important concept in
a variety of areas such as cognitive psychology, computer vision and
visualization, and advertisement. Yet, efforts to create large, downloadable
image datasets with diverse content and unbiased groundtruthing are lacking. In
this work, we introduce Savoias, a visual complexity dataset that comprises
more than 1,400 images from seven image categories relevant to the above
research areas, namely Scenes, Advertisements, Visualization and infographics,
Objects, Interior design, Art, and Suprematism. The images in each category
portray diverse characteristics including various low-level and high-level
features, objects, backgrounds, textures and patterns, text, and graphics. The
ground truth for Savoias is obtained by crowdsourcing more than 37,000 pairwise
comparisons of images using the forced-choice methodology and with more than
1,600 contributors. The resulting relative scores are then converted to
absolute visual complexity scores using the Bradley-Terry method and matrix
completion. When applying five state-of-the-art algorithms to analyze the
visual complexity of the images in the Savoias dataset, we found that the
scores obtained from these baseline tools only correlate well with crowdsourced
labels for abstract patterns in the Suprematism category (Pearson correlation
r=0.84). For the other categories, in particular the Objects and Advertisements
categories, the correlation coefficients were low (r=0.3 and 0.56,
respectively). These findings suggest that (1) state-of-the-art approaches are
mostly insufficient and (2) Savoias enables category-specific method
development, which is likely to improve the impact of visual complexity
analysis on specific application areas, including computer vision.
Comment: 10 pages, 4 figures, 4 tables
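The step that turns forced-choice pairwise comparisons into absolute scores can be illustrated with a minimal Bradley-Terry fit. The sketch below uses the standard minorization-maximization update and covers only the Bradley-Terry part; it does not attempt the matrix completion step and is not the authors' code. The function name `bradley_terry` and its parameters are hypothetical.

```python
def bradley_terry(n_items, comparisons, iters=200):
    """Estimate Bradley-Terry strengths from (winner, loser) index pairs.

    Assumes every item wins at least one comparison; otherwise its strength
    collapses to zero and the estimate is degenerate. Hypothetical helper.
    """
    wins = [0] * n_items
    pair_counts = {}  # number of comparisons per unordered pair
    for winner, loser in comparisons:
        wins[winner] += 1
        key = (min(winner, loser), max(winner, loser))
        pair_counts[key] = pair_counts.get(key, 0) + 1

    p = [1.0] * n_items
    for _ in range(iters):
        denom = [0.0] * n_items
        for (i, j), n in pair_counts.items():
            d = n / (p[i] + p[j])
            denom[i] += d
            denom[j] += d
        # MM update: p_i = wins_i / sum_j n_ij / (p_i + p_j)
        p = [wins[i] / denom[i] if denom[i] > 0 else p[i] for i in range(n_items)]
        total = sum(p)
        p = [x / total for x in p]  # normalize so the strengths sum to 1
    return p
```

For example, `bradley_terry(3, [(0, 1)] * 3 + [(1, 0)] + [(1, 2)] * 3 + [(2, 1)])` ranks image 0 as most complex and image 2 as least, since 0 mostly beats 1 and 1 mostly beats 2.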
Efficient Analysis of Complex Diagrams using Constraint-Based Parsing
This paper describes substantial advances in the analysis (parsing) of
diagrams using constraint grammars. The addition of set types to the grammar
and spatial indexing of the data make it possible to efficiently parse real
diagrams of substantial complexity. The system is probably the first to
demonstrate efficient diagram parsing using grammars that can easily be retargeted
to other domains. The work assumes that the diagrams are available as a flat
collection of graphics primitives: lines, polygons, circles, Bezier curves and
text. This is appropriate for future electronic documents or for vectorized
diagrams converted from scanned images. The classes of diagrams that we have
analyzed include x,y data graphs and genetic diagrams drawn from the biological
literature, as well as finite state automata diagrams (states and arcs). As an
example, parsing a four-part data graph composed of 133 primitives required 35
sec using Macintosh Common Lisp on a Macintosh Quadra 700.
Comment: 9 pages, Postscript, no fonts, compressed, uuencoded. Composed in
MSWord 5.1a for the Mac. To appear in ICDAR '95. Other versions at
ftp://ftp.ccs.neu.edu/pub/people/futrell
Enactivism and ethnomethodological conversation analysis as tools for expanding Universal Design for Learning: the case of visually impaired mathematics students
Blind and visually impaired mathematics students must rely on accessible materials such as tactile diagrams to learn mathematics. However, these compensatory materials are frequently found to offer students inferior opportunities for engaging in mathematical practice and do not allow sensorily heterogeneous students to collaborate. Such prevailing problems of access and interaction are central concerns of Universal Design for Learning (UDL), an engineering paradigm for inclusive participation in cultural praxis like mathematics. Rather than directly adapt existing artifacts for broader usage, the UDL process begins by interrogating the praxis these artifacts serve and then radically re-imagining tools and ecologies to optimize usability for all learners. We argue for the utility of two additional frameworks to enhance UDL efforts: (a) enactivism, a cognitive-sciences view of learning, knowing, and reasoning as modal activity; and (b) ethnomethodological conversation analysis (EMCA), which investigates participants' multimodal methods for coordinating action and meaning. Combined, these approaches help frame the design and evaluation of opportunities for heterogeneous students to learn mathematics collaboratively in inclusive classrooms by coordinating perceptuo-motor solutions to joint manipulation problems. We contextualize the thesis with a proposal for a pluralist design for proportions, in which a pair of students jointly operate an interactive technological device.
Effectiveness of Visualisations for Detection of Errors in Segmentation of Blood Vessels
Vascular disease diagnosis often requires a precise segmentation of the vessel lumen. When 3D (Magnetic Resonance Angiography, MRA, or Computed Tomography Angiography, CTA) imaging is available, this can be done automatically, but occasional errors are inevitable, so the segmentation has to be checked by clinicians. This requires appropriate visualisation techniques. A number of visualisation techniques exist, but few user studies have compared the different alternatives. In this study we examine how users interact with several basic visualisations when performing a visual search task: checking the correctness of vascular segmentation in segmented MRA data. These visualisations are direct volume rendering (DVR), isosurface rendering, and curved planar reformatting (CPR). Additionally, we examine whether visual highlighting of potential errors can help the user find errors, so a fourth visualisation we examine is DVR with visual highlighting. Our main findings are that CPR performs fastest but has a higher error rate, and that there are no significant differences between the other three visualisations. We did find that visual highlighting actually has slower performance in early trials, suggesting that users learned to ignore the highlights.