Text to 3D Scene Generation with Rich Lexical Grounding
The ability to map descriptions of scenes to 3D geometric representations has
many applications in areas such as art, education, and robotics. However, prior
work on the text to 3D scene generation task has used manually specified object
categories and language that identifies them. We introduce a dataset of 3D
scenes annotated with natural language descriptions and learn from this data
how to ground textual descriptions to physical objects. Our method successfully
grounds a variety of lexical terms to concrete referents, and we show
quantitatively that our method improves 3D scene generation over previous work
using purely rule-based methods. We evaluate the fidelity and plausibility of
3D scenes generated with our grounding approach through human judgments. To
ease evaluation on this task, we also introduce an automated metric that
strongly correlates with human judgments. Comment: 10 pages, 7 figures, 3 tables. To appear in ACL-IJCNLP 201
Digital Image Access & Retrieval
The 33rd Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March of 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. They fall into three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation, with the bulk of the conference focusing on indexing and retrieval. Published or submitted for publication.
Visual Landmark Recognition from Internet Photo Collections: A Large-Scale Evaluation
The task of a visual landmark recognition system is to identify photographed
buildings or objects in query photos and to provide the user with relevant
information on them. With their increasing coverage of the world's landmark
buildings and objects, Internet photo collections are now being used as a
source for building such systems in a fully automatic fashion. This process
typically consists of three steps: clustering large amounts of images by the
objects they depict; determining object names from user-provided tags; and
building a robust, compact, and efficient recognition index. To this date,
however, there is little empirical information on how well current approaches
for those steps perform in a large-scale open-set mining and recognition task.
Furthermore, there is little empirical information on how recognition
performance varies for different types of landmark objects and where there is
still potential for improvement. With this paper, we intend to fill these gaps.
Using a dataset of 500k images from Paris, we analyze each component of the
landmark recognition pipeline in order to answer the following questions: How
many and what kinds of objects can be discovered automatically? How can we best
use the resulting image clusters to recognize the object in a query? How can
the object be efficiently represented in memory for recognition? How reliably
can semantic information be extracted? And finally: What are the limiting
factors in the resulting pipeline from query to semantics? We evaluate how
different choices of methods and parameters for the individual pipeline steps
affect overall system performance and examine their effects for different query
categories such as buildings, paintings, or sculptures.
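The second pipeline step named above, determining object names from user-provided tags, can be illustrated with a minimal sketch. It assumes a simple per-image majority vote, which is not necessarily the method evaluated in the paper; the function name `name_cluster`, the stopword list, and the sample tags are hypothetical.

```python
from collections import Counter


def name_cluster(image_tags, stopwords=frozenset()):
    """Pick the tag shared by the most images in one cluster.

    image_tags: list of tag lists, one list per image in the cluster.
    stopwords:  lowercase tags to ignore (assumed here: city-level tags).
    Returns (tag, fraction_of_images), or (None, 0.0) if no tags remain.
    """
    counts = Counter()
    for tags in image_tags:
        # Count each tag once per image so a single heavily tagged
        # photo cannot dominate the vote.
        counts.update({t.lower() for t in tags} - stopwords)
    if not counts:
        return None, 0.0
    # Highest count wins; ties are broken alphabetically for determinism.
    tag, n = min(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return tag, n / len(image_tags)


# Hypothetical cluster of four photos of the same object.
cluster = [
    ["Paris", "Eiffel Tower", "vacation"],
    ["paris", "eiffel tower"],
    ["Paris", "night", "eiffel tower"],
    ["paris", "friends"],
]
print(name_cluster(cluster, stopwords=frozenset({"paris"})))
```

The returned fraction gives a rough confidence for the name: a low value suggests the cluster's tags are too inconsistent to yield reliable semantic information.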
A domain independent adaptive imaging system for visual inspection
Computer vision is a rapidly growing area. The range of applications is increasing very quickly: robotics, inspection, medicine, physics, and document processing are all computer vision applications still in their infancy. All these applications are written with a specific task in mind and do not perform well unless they run in a controlled environment. They do not deploy any knowledge to produce a meaningful description of the scene, or indeed to aid in the analysis of the image.
The construction of a symbolic description of a scene from a digitised image is a difficult problem. A symbolic interpretation of an image can be viewed as a mapping from the image pixels to an identification of the semantically relevant objects. Before symbolic reasoning can take place, image processing and segmentation routines must produce the relevant information. This part of the imaging system inherently introduces many errors. The aim of this project is to reduce the error rate produced by such algorithms and make them adaptable to change in the manufacturing process. Thus a priori knowledge is needed about the images and the objects they contain, as well as knowledge about how the image was acquired from the scene (image geometry, quality, object decomposition, lighting conditions, etc.). Knowledge about the algorithms must also be acquired. Such knowledge is collected by studying the algorithms and deciding in which areas of image analysis they work well.
In most existing image analysis systems, knowledge of this kind is implicitly embedded into the algorithms employed in the system. Such an approach assumes that all these parameters are invariant. However, in complex applications this may not be the case, so adjustments must be made from time to time to ensure satisfactory performance of the system. A system that allows such adjustments to be made must include an explicit representation of the knowledge used in the image analysis procedure.
In addition to the use of a priori knowledge, rules are employed to improve the performance of the image processing and segmentation algorithms. These rules considerably enhance the correctness of the segmentation process.
The most frequently given goal, if not the only one, in industrial image analysis is to detect and locate objects of a given type in the image. That is, an image may contain objects of different types, and the goal is to identify parts of the image. The system developed here is driven by these goals; thus, by teaching the system a new object or a fault in an object, the system may adapt the algorithms to detect these new objects as well as compensate for changes in the environment, such as a change in lighting conditions. We have called this system the Visual Planner because it uses techniques based on planning to achieve a given goal.
As the Visual Planner learns the specific domain it is working in, appropriate algorithms are selected to segment the object. This makes the system domain independent, because different algorithms may be selected for different applications and objects under different environmental conditions.
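The idea of representing knowledge about algorithms explicitly, and selecting algorithms from it, can be sketched as follows. This is an illustration under stated assumptions, not the Visual Planner itself: the algorithm names, the two scene properties (contrast and uneven lighting), and all threshold values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlgorithmKnowledge:
    """Explicit record of the conditions under which an algorithm is assumed to work."""
    name: str
    min_contrast: float            # lowest image contrast (0..1) it tolerates
    handles_uneven_lighting: bool  # whether it copes with uneven illumination


# Hypothetical knowledge base; the values are illustrative only.
KNOWLEDGE_BASE = [
    AlgorithmKnowledge("global_threshold", 0.6, False),
    AlgorithmKnowledge("adaptive_threshold", 0.3, True),
    AlgorithmKnowledge("edge_based", 0.1, True),
]


def select_algorithms(contrast, uneven_lighting, knowledge=KNOWLEDGE_BASE):
    """Return the names of algorithms whose stated conditions match the scene."""
    return [
        k.name
        for k in knowledge
        if contrast >= k.min_contrast
        and (k.handles_uneven_lighting or not uneven_lighting)
    ]


# A change in lighting changes the selection without touching any algorithm code.
print(select_algorithms(contrast=0.7, uneven_lighting=False))
print(select_algorithms(contrast=0.4, uneven_lighting=True))
```

Because the conditions live in data rather than inside each algorithm, adjusting to a new environment means editing or relearning the knowledge base rather than rewriting the algorithms.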
Visualizing Shakespeare: Iconography and Interpretation in the Works of Salvador Dalí
Although William Shakespeare’s 16th century classical literature is rarely contextualized with the eccentricities of 20th century artist Salvador Dali, Shakespeare’s myriad of works have withstood the test of time and continue to be celebrated and reinterpreted by the likes of performers, scholars, and artists alike. Along with full-text illustrations of well-known plays, such as Macbeth (1946) and As You Like It (1953), Dali returned to the Shakespearean motif with his two series of dry-point engravings (Much Ado About Shakespeare and Shakespeare II) in 1968 and 1971. The series combine to formulate 31 depictions where Dali interprets Shakespeare’s text in a single image with classics like Romeo & Juliet as well as some of Shakespeare’s more obscure plays, such as Troilus and Cressida and Timon of Athens. Gettysburg College owns several of these prints, housed in the library’s Special Collections. Troilus and Cressida and Timon of Athens were on display in Schmucker Art Gallery as part of the Method and Meaning exhibit in the fall of 2014.
Shakespeare’s plays are an eclectic repertoire of iconic characters, such as Prince Hamlet and Othello, as well as timeless themes (both comic and tragic) that easily lend themselves to an extraordinarily diverse range of illustrations: from the 18th century historical narratives of Francis Hayman, the 19th century whimsical paintings of William Blake, and the Victorian renditions of John Everett Millais to the 20th century expressive freedom of Dali. Salvador Dali’s representations, like those of his predecessors, aim to capture the essence of each Shakespeare play using specific iconographic elements in order to create a visual narration, bringing together the interpretations of the author, artist, and the viewer.
In-loop Feature Tracking for Structure and Motion with Out-of-core Optimization
In this paper, a novel approach for obtaining 3D models from video sequences captured with hand-held cameras is presented. We define a pipeline that robustly deals with different types of sequences and acquiring devices. Our system follows a divide and conquer approach: after a frame decimation that pre-conditions the input sequence, the video is split into short-length clips. This allows the reconstruction step to be parallelized, which translates into a reduction in the amount of computational resources required. The short length of the clips allows an intensive search for the best solution at each step of reconstruction, which makes the system more robust. Unlike in other approaches, the process of feature tracking is embedded within the reconstruction loop for each clip. A final registration step merges all the processed clips into the same coordinate frame.
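The divide and conquer front end described above (frame decimation followed by splitting into short clips) can be sketched as below. The abstract does not specify how frames are decimated or whether clips overlap, so uniform decimation and a fixed overlap between consecutive clips (so they share frames for later registration) are assumptions, as are the function names.

```python
def decimate(frames, step):
    """Keep every `step`-th frame (uniform decimation, an assumed strategy)."""
    if step < 1:
        raise ValueError("step must be >= 1")
    return frames[::step]


def split_into_clips(frames, clip_len, overlap):
    """Split frames into clips of at most `clip_len` frames.

    Consecutive clips share `overlap` frames, which is assumed here so that
    a later registration step has common frames to align the clips with.
    """
    if not 0 <= overlap < clip_len:
        raise ValueError("overlap must satisfy 0 <= overlap < clip_len")
    if not frames:
        return []
    stride = clip_len - overlap
    clips = []
    start = 0
    while True:
        clips.append(frames[start:start + clip_len])
        if start + clip_len >= len(frames):
            break  # the last clip reached the end of the sequence
        start += stride
    return clips


# Frame indices stand in for actual video frames.
frames = list(range(20))
clips = split_into_clips(decimate(frames, step=2), clip_len=4, overlap=1)
for clip in clips:
    print(clip)
```

Each clip can then be reconstructed independently, for example in separate worker processes, before the final registration step brings them into one coordinate frame.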
Iconic Indexing for Video Search
Submitted for the degree of Doctor of Philosophy, Queen Mary, University of London
Image Processing and its Military Applications
One of the important breakthroughs in image processing is the stand-alone, non-human image understanding system (IUS). The task of understanding images becomes monumental as one tries to define what understanding really is. Both pattern recognition and artificial intelligence are used in addition to traditional signal processing. Scene analysis procedures using edge and texture segmentation can be considered the early stages of the image understanding process. Symbolic representation and relationship grammars come at subsequent stages. Thus it is not reasonable to put a man into a loop of signal processing at certain sensors, such as those on remotely piloted vehicles, satellites, and spacecraft. Consequently, smart sensors and semi-automatic processes are being developed. Land remote sensing has been another important application of image processing. With the introduction of programmes like Star Wars, this particular application has gained special importance from the Military's point of view. This paper provides an overview of digital image processing and explores the scope of the technology of remote sensing and IUSs from the Military's point of view. An example of the autonomous vehicle project now in progress in the US is described in detail to elucidate the impact of IUSs.