1,176 research outputs found

    Semi-automatic video object segmentation for multimedia applications

    Get PDF
    A semi-automatic video object segmentation tool is presented for segmenting both still pictures and image sequences. The approach comprises both automatic segmentation algorithms and manual user interaction. The still image segmentation component is comprised of a conventional spatial segmentation algorithm (Recursive Shortest Spanning Tree (RSST)), a hierarchical segmentation representation method (Binary Partition Tree (BPT)), and user interaction. An initial segmentation partition of homogeneous regions is created using RSST. The BPT technique is then used to merge these regions and hierarchically represent the segmentation in a binary tree. The semantic objects are then manually built by selectively clicking on image regions. A video object-tracking component enables image sequence segmentation, and this subsystem is based on motion estimation, spatial segmentation, object projection, region classification, and user interaction. The motion between the previous frame and the current frame is estimated, and the previous object is then projected onto the current partition. A region classification technique is used to determine which regions in the current partition belong to the projected object. User interaction is allowed for object re-initialisation when the segmentation results become inaccurate. The combination of all these components enables offline video sequence segmentation. The results presented on standard test sequences illustrate the potential use of this system for object-based coding and representation of multimedia

    User-activity aware strategies for mobile information access

    Get PDF
    Information access suffers tremendously in wireless networks because of the low correlation between content transferred across low-bandwidth wireless links and actual data used to serve user requests. As a result, conventional content access mechanisms face such problems as unnecessary bandwidth consumption and large response times, and users experience significant performance degradation. In this dissertation, we analyze the cause of those problems and find that the major reason for inefficient information access in wireless networks is the absence of any user-activity awareness in current mechanisms. To solve these problems, we propose three user-activity aware strategies for mobile information access. Through simulations and implementations, we show that our strategies can outperform conventional information access schemes in terms of bandwidth consumption and user-perceived response times.Ph.D.Committee Chair: Raghupathy Sivakumar; Committee Member: Chuanyi Ji; Committee Member: George Riley; Committee Member: Magnus Egerstedt; Committee Member: Umakishore Ramachandra

    Combined Industry, Space and Earth Science Data Compression Workshop

    Get PDF
    The sixth annual Space and Earth Science Data Compression Workshop and the third annual Data Compression Industry Workshop were held as a single combined workshop. The workshop was held April 4, 1996 in Snowbird, Utah in conjunction with the 1996 IEEE Data Compression Conference, which was held at the same location March 31 - April 3, 1996. The Space and Earth Science Data Compression sessions seek to explore opportunities for data compression to enhance the collection, analysis, and retrieval of space and earth science data. Of particular interest is data compression research that is integrated into, or has the potential to be integrated into, a particular space or earth science data information system. Preference is given to data compression research that takes into account the scien- tist's data requirements, and the constraints imposed by the data collection, transmission, distribution and archival systems

    The JPEG2000 still image compression standard

    Get PDF
    The development of standards (emerging and established) by the International Organization for Standardization (ISO), the International Telecommunications Union (ITU), and the International Electrotechnical Commission (IEC) for audio, image, and video, for both transmission and storage, has led to worldwide activity in developing hardware and software systems and products applicable to a number of diverse disciplines [7], [22], [23], [55], [56], [73]. Although the standards implicitly address the basic encoding operations, there is freedom and flexibility in the actual design and development of devices. This is because only the syntax and semantics of the bit stream for decoding are specified by standards, their main objective being the compatibility and interoperability among the systems (hardware/software) manufactured by different companies. There is, thus, much room for innovation and ingenuity. Since the mid 1980s, members from both the ITU and the ISO have been working together to establish a joint international standard for the compression of grayscale and color still images. This effort has been known as JPEG, the Join

    Performance Evaluation of Object Proposal Generators for Salient Object Detection

    Get PDF
    abstract: The detection and segmentation of objects appearing in a natural scene, often referred to as Object Detection, has gained a lot of interest in the computer vision field. Although most existing object detectors aim to detect all the objects in a given scene, it is important to evaluate whether these methods are capable of detecting the salient objects in the scene when constraining the number of proposals that can be generated due to constraints on timing or computations during execution. Salient objects are objects that tend to be more fixated by human subjects. The detection of salient objects is important in applications such as image collection browsing, image display on small devices, and perceptual compression. This thesis proposes a novel evaluation framework that analyses the performance of popular existing object proposal generators in detecting the most salient objects. This work also shows that, by incorporating saliency constraints, the number of generated object proposals and thus the computational cost can be decreased significantly for a target true positive detection rate (TPR). As part of the proposed framework, salient ground-truth masks are generated from the given original ground-truth masks for a given dataset. Given an object detection dataset, this work constructs salient object location ground-truth data, referred to here as salient ground-truth data for short, that only denotes the locations of salient objects. This is obtained by first computing a saliency map for the input image and then using it to assign a saliency score to each object in the image. Objects whose saliency scores are sufficiently high are referred to as salient objects. The detection rates are analyzed for existing object proposal generators with respect to the original ground-truth masks and the generated salient ground-truth masks. As part of this work, a salient object detection database with salient ground-truth masks was constructed from the PASCAL VOC 2007 dataset. Not only does this dataset aid in analyzing the performance of existing object detectors for salient object detection, but it also helps in the development of new object detection methods and evaluating their performance in terms of successful detection of salient objects.Dissertation/ThesisMasters Thesis Electrical Engineering 201

    Mobile Map Browsers: Anticipated User Interaction for Data Pre-fetching

    Get PDF
    When browsing a graphical display of geospatial data on mobile devices, users typically change the displayed maps by panning, zooming in and out, or rotating the device. Limited storage space on mobile devices and slow wireless communications, however, impede the performance of these operations. To overcome the bottleneck that all map data to be displayed on the mobile device need to be downloaded on demand, this thesis investigates how anticipated user interactions affect intelligent pre-fetching so that an on-demand download session is extended incrementally. User interaction is defined as a set of map operations that each have corresponding effects on the spatial dataset required to generate the display. By anticipating user interaction based on past behavior and intuition on when waiting for data is acceptable, it is possible to device a set of strategies to better prepare the device with data for future use. Users that engage with interactive map displays for a variety of tasks, whether it be navigation, information browsing, or data collection, experience a dynamic display to accomplish their goal. With vehicular navigation, the display might update itself as a result of a GPS data stream reflecting movement through space. This movement is not random, especially as is the case of moving vehicles and, therefore, this thesis suggests that mobile map data could be pre-fetched in order to improve usability. Pre-fetching memory-demanding spatial data can benefit usability in several ways, but in particular it can (1) reduce latency when downloading data over wireless connections and (2) better prepare a device for situations where wireless internet connectivity is weak or intermittent. This thesis investigates mobile map caching and devises an algorithm for pre-fetching data on behalf of the application user. Two primary models are compared: isotropic (direction-independent) and anisotropic (direction-dependent) pre-fetching. A prefetching simulation is parameterized with many trajectories that vary in complexity (a metric of direction change within the trajectory) and it is shown that, although anisotropic pre-fetching typically results in a better pre-fetching accuracy, it is not ideal for all scenarios. This thesis suggests a combination of models to accommodate the significant variation in moving object trajectories. In addition, other methods for pre-fetching spatial data are proposed for future research

    Strategies for image visualisation and browsing

    Get PDF
    PhDThe exploration of large information spaces has remained a challenging task even though the proliferation of database management systems and the state-of-the art retrieval algorithms is becoming pervasive. Signi cant research attention in the multimedia domain is focused on nding automatic algorithms for organising digital image collections into meaningful structures and providing high-semantic image indices. On the other hand, utilisation of graphical and interactive methods from information visualisation domain, provide promising direction for creating e cient user-oriented systems for image management. Methods such as exploratory browsing and query, as well as intuitive visual overviews of image collection, can assist the users in nding patterns and developing the understanding of structures and content in complex image data-sets. The focus of the thesis is combining the features of automatic data processing algorithms with information visualisation. The rst part of this thesis focuses on the layout method for displaying the collection of images indexed by low-level visual descriptors. The proposed solution generates graphical overview of the data-set as a combination of similarity based visualisation and random layout approach. Second part of the thesis deals with problem of visualisation and exploration for hierarchical organisation of images. Due to the absence of the semantic information, images are considered the only source of high-level information. The content preview and display of hierarchical structure are combined in order to support image retrieval. In addition to this, novel exploration and navigation methods are proposed to enable the user to nd the way through database structure and retrieve the content. On the other hand, semantic information is available in cases where automatic or semi-automatic image classi ers are employed. The automatic annotation of image items provides what is referred to as higher-level information. This type of information is a cornerstone of multi-concept visualisation framework which is developed as a third part of this thesis. This solution enables dynamic generation of user-queries by combining semantic concepts, supported by content overview and information ltering. Comparative analysis and user tests, performed for the evaluation of the proposed solutions, focus on the ways information visualisation a ects the image content exploration and retrieval; how e cient and comfortable are the users when using di erent interaction methods and the ways users seek for information through di erent types of database organisation

    Enhancing person annotation for personal photo management using content and context based technologies

    Get PDF
    Rapid technological growth and the decreasing cost of photo capture means that we are all taking more digital photographs than ever before. However, lack of technology for automatically organising personal photo archives has resulted in many users left with poorly annotated photos, causing them great frustration when such photo collections are to be browsed or searched at a later time. As a result, there has recently been significant research interest in technologies for supporting effective annotation. This thesis addresses an important sub-problem of the broad annotation problem, namely "person annotation" associated with personal digital photo management. Solutions to this problem are provided using content analysis tools in combination with context data within the experimental photo management framework, called β€œMediAssist”. Readily available image metadata, such as location and date/time, are captured from digital cameras with in-built GPS functionality, and thus provide knowledge about when and where the photos were taken. Such information is then used to identify the "real-world" events corresponding to certain activities in the photo capture process. The problem of enabling effective person annotation is formulated in such a way that both "within-event" and "cross-event" relationships of persons' appearances are captured. The research reported in the thesis is built upon a firm foundation of content-based analysis technologies, namely face detection, face recognition, and body-patch matching together with data fusion. Two annotation models are investigated in this thesis, namely progressive and non-progressive. The effectiveness of each model is evaluated against varying proportions of initial annotation, and the type of initial annotation based on individual and combined face, body-patch and person-context information sources. The results reported in the thesis strongly validate the use of multiple information sources for person annotation whilst emphasising the advantage of event-based photo analysis in real-life photo management systems
    • …
    corecore