
    Survey of Techniques for Producing Blended Images: A Case Study Using Rollins College Archives

    Get PDF
    The Rollins College Archives are a treasure trove of historic resources relating to the college’s history, yet they are often underutilized or overlooked by both the student body and the surrounding community. Historic resources are also all too often excluded from work in computer science and related fields. This project aims to bridge that gap by bringing the two areas together: its goal is to merge past and present by blending historic photos with photographs of the same present-day scene, revealing changes and juxtapositions of the scene across eras. This research explores whether this can be accomplished principally through computational means. To do so, we draw on computer vision, using techniques in feature detection and matching to ultimately blend images in novel ways. Image blending is often used to create unique images or to emphasize a contrast between two scenes through their convergence. Whether the blend is produced through masks with alpha values, seam carving, or other techniques, most implementations require a great deal of manual input, be it point selection, mask generation, or the choice of an alpha value. In this project, we identify recognizable regions and features in a given image and use them to find similar regions and features in a second image. Any matches found are then filtered to remove bad or incorrect ones. The remaining matches are used to compute the difference in perspective between the two images, and the coordinates of the matching points are used to warp the images into a common perspective. We explore several approaches to the feature-matching problem, including built-in library functions as well as a region-based, template-matching algorithm.
We also investigate techniques in image blending, such as automatic mask generation, Laplacian pyramid blending, and various off-the-shelf tools contained within Unity, and we test the application of our findings to 360-degree images.
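The core of the pipeline described above — filter the matches, then estimate the perspective difference from the surviving point pairs — is robust homography estimation. The abstract does not include code, so the following is a minimal NumPy sketch of that step (an illustration, not the project's implementation; in practice OpenCV's `cv2.findHomography` with the RANSAC flag does the same job): fit a homography to 4-point samples via the direct linear transform (DLT), score each candidate by reprojection error, and refit on the largest consensus set.

```python
import numpy as np

def fit_homography(src, dst):
    """DLT: solve for the 3x3 H mapping src points to dst points (Nx2, N >= 4)."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A, float))
    H = Vt[-1].reshape(3, 3)          # null vector of A, reshaped
    return H / H[2, 2]

def project(H, pts):
    """Apply homography H to Nx2 points (homogeneous divide)."""
    p = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return p[:, :2] / p[:, 2:3]

def ransac_homography(src, dst, n_iters=500, thresh=3.0, seed=0):
    """Robustly estimate H from matches with outliers; returns (H, inlier mask)."""
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    rng = np.random.default_rng(seed)
    best_mask = None
    for _ in range(n_iters):
        idx = rng.choice(len(src), 4, replace=False)   # minimal sample
        H = fit_homography(src[idx], dst[idx])
        err = np.linalg.norm(project(H, src) - dst, axis=1)
        mask = err < thresh                             # consensus set
        if best_mask is None or mask.sum() > best_mask.sum():
            best_mask = mask
    return fit_homography(src[best_mask], dst[best_mask]), best_mask
```

With exact correspondences plus a few grossly wrong matches, the consensus step discards the bad matches and the refit recovers the true perspective transform.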

    CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap

    Get PDF
    After addressing the state of the art during the first year of CHORUS and establishing the existing landscape in multimedia search engines, we identified and analyzed gaps within the European research effort during our second year. In this period we focused on three directions, notably technological issues, user-centred issues and use-cases, and socio-economic and legal aspects. These were assessed by two central studies: firstly, a concerted vision of the functional breakdown of a generic multimedia search engine, and secondly, representative use-case descriptions with a related discussion of the requirements for technological challenges. Both studies were carried out in cooperation and consultation with the community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our Think-Tank, presentations at international conferences, and surveys addressed to EU project coordinators as well as national initiative coordinators. Based on the feedback obtained, we identified two types of gaps: core technological gaps that involve research challenges, and “enablers”, which are not necessarily technical research challenges but have an impact on innovation progress. New socio-economic trends are presented, as well as emerging legal challenges.

    OpenIns3D: Snap and Lookup for 3D Open-vocabulary Instance Segmentation

    Full text link
    Current 3D open-vocabulary scene understanding methods mostly utilize well-aligned 2D images as the bridge to learn 3D features with language. However, applying these approaches becomes challenging in scenarios where 2D images are absent. In this work, we introduce a completely new pipeline, namely, OpenIns3D, which requires no 2D image inputs, for 3D open-vocabulary scene understanding at the instance level. The OpenIns3D framework employs a "Mask-Snap-Lookup" scheme. The "Mask" module learns class-agnostic mask proposals in 3D point clouds. The "Snap" module generates synthetic scene-level images at multiple scales and leverages 2D vision-language models to extract interesting objects. The "Lookup" module searches through the outcomes of "Snap" with the help of Mask2Pixel maps, which contain the precise correspondence between 3D masks and synthetic images, to assign category names to the proposed masks. This 2D input-free, easy-to-train, and flexible approach achieved state-of-the-art results on a wide range of indoor and outdoor datasets by a large margin. Furthermore, OpenIns3D allows for effortless switching of 2D detectors without re-training. When integrated with state-of-the-art 2D open-world models such as ODISE and GroundingDINO, superb results are observed on open-vocabulary instance segmentation. When integrated with LLM-powered 2D models like LISA, it demonstrates a remarkable capacity to process highly complex text queries, including those that require intricate reasoning and world knowledge. Project page: https://zheninghuang.github.io/OpenIns3D/. Comment: 24 pages, 16 figures, 13 tables.
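The "Lookup" idea can be pictured with a toy sketch. This is a schematic illustration of the voting logic only, not the authors' code: the data structures (`mask2pixel`, `detections`) and the coverage threshold are hypothetical. Each 3D mask carries the set of pixels it occupies in a synthetic snapshot (its Mask2Pixel map), each 2D detection is a labeled box from an open-vocabulary detector run on that snapshot, and a mask takes the label of the detection covering the largest share of its pixels.

```python
from collections import defaultdict

def lookup_labels(mask2pixel, detections, min_cover=0.25):
    """Assign a category to each 3D mask by voting over 2D detections.

    mask2pixel: {mask_id: [(x, y), ...]} pixels a 3D mask projects to
                in a synthetic snapshot (a toy Mask2Pixel map).
    detections: [(label, (x0, y0, x1, y1)), ...] labeled 2D boxes.
    Returns {mask_id: label or None (unlabeled if coverage is too low)}.
    """
    labels = {}
    for mid, pixels in mask2pixel.items():
        votes = defaultdict(int)
        for x, y in pixels:
            for label, (x0, y0, x1, y1) in detections:
                if x0 <= x <= x1 and y0 <= y <= y1:
                    votes[label] += 1            # this pixel supports `label`
        if votes:
            label, count = max(votes.items(), key=lambda kv: kv[1])
            labels[mid] = label if count / len(pixels) >= min_cover else None
        else:
            labels[mid] = None
    return labels
```

A mask whose projected pixels fall inside a "chair" box is named "chair"; masks with no sufficiently overlapping detection stay unlabeled, which is what makes swapping in a different 2D detector cheap.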

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Get PDF
    Based on the information provided by European projects and national initiatives related to multimedia search, as well as domain experts who participated in the CHORUS Think-Tank and workshops, this document reports on the state of the art in multimedia content search from a technical and socio-economic perspective. The technical perspective includes an up-to-date view of content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines. From a socio-economic perspective, we inventory the impact and legal consequences of these technical advances and point out future directions of research.

    Towards Content-based Pixel Retrieval in Revisited Oxford and Paris

    Full text link
    This paper introduces the first two pixel retrieval benchmarks. Pixel retrieval is segmented instance retrieval. Just as semantic segmentation extends classification to the pixel level, pixel retrieval is an extension of image retrieval and offers information about which pixels are related to the query object. In addition to retrieving images for the given query, it helps users quickly identify the query object in true positive images and exclude false positive images by denoting the correlated pixels. Our user study results show pixel-level annotation can significantly improve the user experience. Compared with semantic and instance segmentation, pixel retrieval requires a fine-grained recognition capability for variable-granularity targets. To this end, we propose pixel retrieval benchmarks named PROxford and PRParis, which are based on the widely used image retrieval datasets, ROxford and RParis. Three professional annotators label 5,942 images with two rounds of double-checking and refinement. Furthermore, we conduct extensive experiments and analysis on the SOTA methods in image search, image matching, detection, segmentation, and dense matching using our pixel retrieval benchmarks. Results show that the pixel retrieval task is challenging for these approaches and distinctive from existing problems, suggesting that further research can advance content-based pixel retrieval and thus the user search experience. The datasets can be downloaded from https://github.com/anguoyuan/Pixel_retrieval-Segmented_instance_retrieval
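Since pixel retrieval scores methods on which pixels they recover rather than which images, a natural building block for evaluation is mask intersection-over-union between a predicted and an annotated segmentation. The sketch below is our own illustration of that measure, not necessarily the benchmark's exact protocol:

```python
import numpy as np

def mask_iou(pred, gt):
    """Intersection-over-union of two boolean HxW pixel masks."""
    pred, gt = np.asarray(pred, bool), np.asarray(gt, bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:           # both masks empty: treat as a perfect match
        return 1.0
    return np.logical_and(pred, gt).sum() / union
```

A prediction that covers half of the annotated object and nothing else scores 0.5; averaging such scores over queries gives a simple pixel-level retrieval quality number.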

    Design of Immersive Online Hotel Walkthrough System Using Image-Based (Concentric Mosaics) Rendering

    Get PDF
    Conventional hotel booking websites present their facilities only through 2D photos, which are static images that cannot be moved or rotated. An image-based virtual walkthrough is a promising technology for the hospitality industry to attract more customers. In this project, research is carried out to create an image-based rendering (IBR) virtual walkthrough and a panoramic-based walkthrough using only Macromedia Flash Professional 8, Photovista Panorama 3.0, and Reality Studio for the interaction of the images. The web pages are built with Macromedia Dreamweaver Professional 8, and the images are displayed in Adobe Flash Player 8 or higher. The image-based walkthrough uses a concentric mosaic technique, while the panoramic-based walkthrough applies image mosaicing. The two walkthroughs are compared, and the study also examines the relationship between the number of pictures used and the smoothness of the walkthrough. Each technique has its advantages: the image-based walkthrough offers real-time navigation, since the user can move right, left, forward, and backward, whereas the panoramic-based walkthrough does not, because the user can only view 360 degrees from a fixed spot.

    The Persistence of Austen in the 21st Century: A Reception History of The Lizzie Bennet Diaries

    Get PDF
    In 2012, the first episode of The Lizzie Bennet Diaries aired online via YouTube.com, offering a modernized serial form of Jane Austen’s 1813 novel Pride and Prejudice. With only word-of-mouth marketing, this series gained hundreds of thousands of views, a loyal following, and an Emmy award. In this paper, I explore the reception history of The Lizzie Bennet Diaries by referencing its source material, analyzing its target demographics, and explaining its success.

    Deep Learning Perspectives on Efficient Image Matching in Natural Image Databases

    Get PDF
    With the proliferation of digital content, efficient image matching in natural image databases has become paramount. Traditional image matching techniques, while effective to a certain extent, face challenges in dealing with the high variability inherent in natural images. This research delves into the application of deep learning models, particularly Convolutional Neural Networks (CNNs), Siamese Networks, and Triplet Networks, to address these challenges. We introduce various techniques to enhance efficiency, such as data augmentation, transfer learning, dimensionality reduction, efficient sampling, and the amalgamation of traditional computer vision strategies with deep learning. Our experimental results, garnered from a specific dataset, demonstrate significant improvements in image matching efficiency, as quantified by metrics like precision, recall, F1-Score, and matching time. The findings underscore the potential of deep learning as a transformative tool for natural image database matching, setting the stage for further research and optimization in this domain.
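The triplet objective behind the Triplet Networks mentioned above is easy to state concretely. A minimal framework-free sketch (toy embedding vectors of our own invention, not the paper's model) of the standard margin loss L = max(0, d(a, p) − d(a, n) + m), which trains an embedding so that a matching image (positive) ends up at least a margin closer to the anchor than a non-matching one (negative):

```python
import math

def euclidean(u, v):
    """L2 distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Triplet margin loss: zero once the negative is at least
    `margin` farther from the anchor than the positive is."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)
```

With d(a, p) = 0.5 and d(a, n) = 3.0 the constraint is already satisfied and the loss is zero; with d(a, n) = 1.0 the negative is too close and the network receives a gradient signal. At query time, matching then reduces to a nearest-neighbour search in the learned embedding space.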