    Clip art retrieval combining raster and vector methods

    Clip art databases can be composed of raster images or vector drawings. Technologies exist for searching and retrieving clip art in both formats, but research has proceeded separately for each format, without exploiting the benefits of both fields together. This paper describes a study in which the benefits of combining information extracted from vector and raster images to retrieve clip art are evaluated and discussed. Color and texture features are extracted from raster images, and geometry and topology features are extracted from vector images. The paper presents several comparisons between different combinations of several descriptors. The results of the study show the effectiveness of the solutions that combine both types of features.

    Trends and concerns in digital cartography

    CISRG discussion paper

    Towards Practicality of Sketch-Based Visual Understanding

    Sketches have been used to conceptualise and depict visual objects since prehistoric times. Sketch research has flourished in the past decade, particularly with the proliferation of touchscreen devices. Much of the use of sketch is anchored in the fact that it can delineate visual concepts universally, irrespective of age, race, language, or demography. The fine-grained, interactive nature of sketches facilitates their application to various visual understanding tasks, such as image retrieval, image generation or editing, segmentation, and 3D shape modelling. However, sketches are highly abstract and subjective, depending on the perception of the individual. Although most agree that sketches give the user fine-grained control in depicting a visual object, many consider sketching a tedious process because of their limited sketching skills, compared with other query/support modalities such as text or tags. Furthermore, collecting fine-grained sketch-photo associations is a significant bottleneck to commercialising sketch applications. Therefore, this thesis aims to progress sketch-based visual understanding towards more practicality. Comment: PhD thesis successfully defended by Ayan Kumar Bhunia, Supervisor: Prof. Yi-Zhe Song, Thesis Examiners: Prof Stella Yu and Prof Adrian Hilto

    Fine Art Pattern Extraction and Recognition

    This is a reprint of articles from the Special Issue published online in the open access journal Journal of Imaging (ISSN 2313-433X) (available at: https://www.mdpi.com/journal/jimaging/special_issues/faper2020).

    Retrieval from an image knowledge base

    With advances in computer technology, images and image databases are becoming increasingly important. Image retrieval in current image database systems has been designed around keyword searches. These carefully designed, handcrafted systems are very efficient within the application domain they are built for. Unfortunately, they are not adaptable to other domains, not expandable to other uses of the existing information, and not very forgiving to their users. The appearance of full-text search provides a more general search over textual documents. However, pictorial images contain a vast amount of information that is difficult to catalog in a general way. Further, this classification needs to be dynamic, providing a flexible searching capability. Searching should allow for more than a pre-programmed set of search parameters, as exact searches make the image database quite useless for a search that was not designed into the original database. Further, the incorporation of knowledge along with the images is difficult. Development of an image knowledge base along with content-based retrieval techniques is the focus of this thesis. Using an artificial intelligence technique called case-based reasoning, images can be retrieved with a degree of flexibility. Each image would be classified by user-entered attributes about the image, called descriptors. Each descriptor would also have a degree-of-importance parameter, indicating the relative importance or certainty of that descriptor. These descriptors are collected as the case for the image and stored in frames. Images can vary in the amount of attribute information they contain. Retrieval of an image from the knowledge base begins with the entry of new descriptors for the desired image. Each descriptor is accompanied by a degree-of-importance parameter, which indicates how strongly the desired image is required to match that descriptor. Again, a variable number of descriptors can be entered. After all criteria are entered, the system will search for cases that have any level of matching. To order the images, the system will use both the degree-of-importance stored in the knowledge base for the candidate image(s) and the degree-of-importance given in the search criteria. The ordering process will use weighted summations to present a relatively small list of candidate images. To demonstrate and validate the concepts outlined, a prototype of the system has been developed. This prototype includes the primary architectural components of a potentially real product. The architectural areas addressed are: the storage of the knowledge, storage of and access to a large number of high-resolution images, means of searching or interrogating the knowledge base, and the actual display of images. The prototype is called the Smart Photo Album. It is an electronic filing system for 35mm pictures taken by anyone from the average photographer up to the photojournalist. It allows for multiple ways of indexing pictures of any subject matter. Retrieval from the knowledge base provides relative matches to the given search criteria. Although this application is relatively simple, the basis of the system can easily be extended to include a more sophisticated knowledge base and reasoning process as, for example, would be used for a medical diagnostic application in the field of dermatology.
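
    The ranking step described in this abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract says only that stored and query degrees of importance are combined through weighted summations, so the product-and-sum scoring rule, the 0-to-1 weight scale, and the names `Case`, `score`, `retrieve`, and `top_k` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Case:
    """One stored image with its descriptors (hypothetical structure)."""
    image_id: str
    # descriptor -> degree of importance, assumed here to lie in [0, 1]
    descriptors: dict[str, float]


def score(case: Case, query: dict[str, float]) -> float:
    """Assumed weighted summation: for every descriptor shared by the
    query and the case, multiply the two degrees of importance and sum."""
    return sum(
        weight * case.descriptors[name]
        for name, weight in query.items()
        if name in case.descriptors
    )


def retrieve(cases: list[Case], query: dict[str, float], top_k: int = 3) -> list[str]:
    """Return up to top_k image ids that match at any level, best first."""
    scored = [(score(c, query), c.image_id) for c in cases]
    matches = [(s, image_id) for s, image_id in scored if s > 0]
    # Sort by descending score; break ties by image id for determinism.
    matches.sort(key=lambda pair: (-pair[0], pair[1]))
    return [image_id for _, image_id in matches[:top_k]]


# Illustrative data only; these images and weights are invented.
album = [
    Case("beach.jpg", {"beach": 0.9, "sunset": 0.5}),
    Case("dog.jpg", {"dog": 1.0, "beach": 0.3}),
    Case("city.jpg", {"skyline": 0.8}),
]
query = {"beach": 1.0, "sunset": 0.4}
ranked = retrieve(album, query)
```

    In this sketch a case with no shared descriptor scores zero and is dropped, which mirrors the abstract's "any level of matching", and `top_k` keeps the candidate list relatively small.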

    Deep Learning for Free-Hand Sketch: A Survey

    Free-hand sketches are highly illustrative, and have been widely used by humans to depict objects or stories from ancient times to the present. The recent prevalence of touchscreen devices has made sketch creation much easier than ever and has consequently made sketch-oriented applications increasingly popular. The progress of deep learning has immensely benefited free-hand sketch research and applications. This paper presents a comprehensive survey of the deep learning techniques oriented at free-hand sketch data, and the applications that they enable. The main contents of this survey include: (i) A discussion of the intrinsic traits and unique challenges of free-hand sketch, to highlight the essential differences between sketch data and other data modalities, e.g., natural photos. (ii) A review of the developments of free-hand sketch research in the deep learning era, by surveying existing datasets, research topics, and the state-of-the-art methods through a detailed taxonomy and experimental evaluation. (iii) Promotion of future work via a discussion of bottlenecks, open problems, and potential research directions for the community. Comment: This paper is accepted by IEEE TPAM

    Automatic Building Roof Plane Extraction in Urban Environments for 3D City Modelling Using Remote Sensing Data

    Delineating and modelling building roof plane structures is an active research direction in urban-related studies, as understanding roof structure provides essential information for generating highly detailed 3D building models. Traditional deep-learning models have been the main focus of most recent research endeavors aiming to extract pixel-based building roof plane areas from remote-sensing imagery. However, significant challenges arise, such as delineating complex roof boundaries and invisible boundaries. Additionally, the post-processing phase, in which pixel-based building roof plane maps are vectorized, often results in polygons with irregular shapes. To address these issues, this study explores a state-of-the-art method for planar graph reconstruction applied to building roof plane extraction. We propose a framework for reconstructing regularized building roof plane structures using aerial imagery and cadastral information. Our framework employs a holistic edge classification architecture based on an attention-based neural network to detect corners, and the edges between them, from aerial imagery. Our experiments focused on three distinct study areas characterized by different roof structure topologies: the Stadsveld–‘t Zwering neighborhood and Oude Markt area, located in Enschede, The Netherlands, and the Lozenets district in Sofia, Bulgaria. The outcomes of our experiments revealed that a model trained with a combined dataset of two different study areas demonstrated superior performance and was capable of delineating edges obscured by shadows or canopy. Our experiment in the Oude Markt area resulted in building roof plane delineation with an F-score value of 0.43 when the model trained on the combined dataset was used. In comparison, the model trained only on the Stadsveld–‘t Zwering dataset achieved an F-score value of 0.37, and the model trained only on the Lozenets dataset achieved an F-score value of 0.32. The results from the developed approach are promising and can be used for 3D city modelling in different urban settings.

    Audio-Visual Learning for Scene Understanding

    Multimodal deep learning aims at combining the complementary information of different modalities. Among all modalities, audio and video are the predominant ones that humans use to explore the world. In this thesis, we focus on audio-visual deep learning, so that our networks mimic how humans perceive the world. Our research includes images, audio signals, and acoustic images. The latter provide spatial audio information and are obtained from a planar array of microphones by combining their raw audio signals with the beamforming algorithm. They better mimic the human auditory system, which cannot be replicated with a single microphone, since one microphone alone cannot give spatial sound cues. However, as microphone arrays are not widespread, we also study how to handle the missing spatialized audio modality at test time. As a solution, we propose to distill the content of acoustic images into audio features during training in order to handle their absence at test time. This is done for supervised audio classification using the generalized distillation framework, which we also extend to self-supervised learning. Next, we devise a method for reconstructing acoustic images given a single microphone and an RGB frame. Thus, when only a standard video is available, we are able to synthesize spatial audio, which is useful for many audio-visual tasks, including sound localization. Lastly, as another example of restoring one modality from the available ones, we inpaint degraded images using audio features, so that the reconstructed region is not only visually plausible but also semantically consistent with the related sound. This also includes cross-modal generation in the limit case of a completely missing or hidden visual modality: our method deals with it naturally, being able to generate images from sound. In summary, we show how audio can help visual learning and vice versa, by transferring knowledge between the two modalities at training time in order to distill, reconstruct, or restore the missing modality at test time.

    Multimedia Analysis and Access of Ancient Maya Epigraphy

    This article presents an integrated framework for multimedia access and analysis of ancient Maya epigraphic resources, developed as an interdisciplinary effort involving epigraphers (specialists who decipher ancient inscriptions) and computer scientists. Our work includes several contributions: a definition of consistent conventions to generate high-quality representations of Maya hieroglyphs from the three most valuable ancient codices, which currently reside in European museums and institutions; a digital repository system for glyph annotation and management; and automatic glyph retrieval and classification methods. We study the combination of statistical Maya language models and shape representation within a hieroglyph retrieval system, the impact of applying language models extracted from different hieroglyphic resources on various data types, and the effect of shape representation choices on glyph classification. A novel Maya hieroglyph data set is provided, which can be used for shape analysis benchmarks and also to study the ancient Maya writing system.